diff mbox series

[RFC,AArch64] Fix some erroneous behavior in gdb.base/step-over-syscall.exp

Message ID 20191230162440.21111-1-luis.machado@linaro.org
State New
Headers show
Series [RFC,AArch64] Fix some erroneous behavior in gdb.base/step-over-syscall.exp | expand

Commit Message

Luis Machado Dec. 30, 2019, 4:24 p.m. UTC
I've been chasing/investigating this particular bug for a while now, and this
is a tentative patch to fix the problem.

Although this may not be aarch64-specific, i can reproduce it consitently in
this particular architecture.

In summary, this bug is a mix of random signal handling (SIGCHLD) and
fork/vfork handling when single-stepping.

Forks and vforks are handled in 3 to 4 steps.

Fork

1 - We get a PTRACE_EVENT_FORK event.
2 - We single-step the inferior to get a SIGTRAP.
3 - Inferior is ready to be resumed as we wish.

Vfork

1 - We get a PTRACE_EVENT_VFORK event.
2 - We continue the inferior to get a PTRACE_EVENT_VFORK_DONE event.
3 - We single-step the inferior to get a SIGTRAP.
4 - Inferior is ready to be resumed as we wish.

The problem manifests itself when we are sitting at a syscall instruction and
we try to instruction-step past it. There may or may not be a breakpoint
inserted at the syscall instruction location.

The expected outcome is that we step a single instruction and the resulting PC
points at the next instruction ($pc + 4 in aarch64).

What i see is that we end up in $pc + 8, that is, one instruction further than
we should've been.

This happens because, when forking/vforking, the child exits too quickly (in the
particular case of gdb.base/step-over-syscall.exp) and a SIGCHLD signal is
issued to the parent. Sometimes this signal arrives before GDB is done handling
the fork/vfork situation. Usually in step 2 for fork and step 3 for vfork.

In turn, this causes GDB to handle that SIGCHLD as a random signal that must be
passed to the inferior, which is correct. But this particular code in infrun.c
doesn't cope with this particular situation and GDB inserts a HP step-resume
breakpoint and registers its wish for the inferior to be stepped yet again.

Thus we end up in $pc + 8, even though we shouldn't have stepped again in this
particular situation.

The delivery of SIGCHLD in between fork/vfork handling steps is sensitive to
timing. If we tweak things a little, enable debugging output or do any other
thing that affects the timing, we may not see this behavior.

This bug doesn't show up as failures in gdb.base/step-over-syscall.exp due to
the way the test is written.  As long as the PC is the same from the previous
test to the next, GDB thinks it is good. The proposed patch doesn't cause a
change in the number of PASSes/FAIL's.

The proposed patch adds a variable waiting_for_fork_sigtrap that gets set just
before the last step of fork/vfork handling and gets reset when we end up
seeing that SIGTRAP we wanted.

This variable is used in the random_signal handling of handle_signal_stop,
where there are a couple conditional blocks that handle signals reaching the
inferior at an unexpected time. If waiting_for_fork_sigtrap is set, we don't
set GDB to do the additional single-step it wants to do.

I gave this a thought and tried to find a better place to put the check in,
but this seemed the most appropriate place to me.

I also tried to solve this without having to introduce a new variable.
But it seems we can't distinguish between a random signal reaching GDB in
the middle of single-stepping and a random signal reaching GDB in the middle
of single-stepping AND handling fork/vfork.

Thoughts?

gdb/ChangeLog:

2019-12-30  Luis Machado  <luis.machado@linaro.org>

	* inferior.h (class inferior) <waiting_for_fork_sigtrap>: New member.
	Set to false by default.
	* infrun.c (handle_inferior_event): Set waiting_for_fork_sigtrap when
	a fork or vfork_done is detected.
	(handle_signal_stop): Don't step again if waiting_for_fork_sigtrap is
	true.
	Reset waiting_for_fork_sigtrap to false when nexti/stepi is detected.

Change-Id: I2849e0dc49ad9c0a026daa8ced4610aa0ddbe637
---
 gdb/inferior.h | 12 ++++++++++++
 gdb/infrun.c   | 31 +++++++++++++++++++++++++++++--
 2 files changed, 41 insertions(+), 2 deletions(-)

-- 
2.17.1

Comments

Luis Machado via Gdb-patches July 16, 2020, 7:44 p.m. UTC | #1
Hi,

For the record, this problem seems to go away with the fix to a 
single-stepping issue I reported. The kernel series is here: 
https://patchwork.kernel.org/project/linux-arm-kernel/list/?series=316353.

Trying the testcase with the patched kernel doesn't show the failures 
anymore.

I'm dropping this patch then. The bug doesn't look harmful enough for us 
to have to work around it in GDB.

On 1/10/20 4:54 PM, Luis Machado wrote:
> On 1/9/20 12:21 PM, Alan Hayward wrote:

>>

>>

>>> On 30 Dec 2019, at 16:24, Luis Machado <luis.machado@linaro.org> wrote:

>>>

>>> I've been chasing/investigating this particular bug for a while now, 

>>> and this

>>> is a tentative patch to fix the problem.

>>>

>>> Although this may not be aarch64-specific, i can reproduce it 

>>> consitently in

>>> this particular architecture.

>>>

>>> In summary, this bug is a mix of random signal handling (SIGCHLD) and

>>> fork/vfork handling when single-stepping.

>>>

>>> Forks and vforks are handled in 3 to 4 steps.

>>>

>>> Fork

>>>

>>> 1 - We get a PTRACE_EVENT_FORK event.

>>> 2 - We single-step the inferior to get a SIGTRAP.

>>> 3 - Inferior is ready to be resumed as we wish.

>>>

>>> Vfork

>>>

>>> 1 - We get a PTRACE_EVENT_VFORK event.

>>> 2 - We continue the inferior to get a PTRACE_EVENT_VFORK_DONE event.

>>> 3 - We single-step the inferior to get a SIGTRAP.

>>> 4 - Inferior is ready to be resumed as we wish.

>>>

>>> The problem manifests itself when we are sitting at a syscall 

>>> instruction and

>>> we try to instruction-step past it. There may or may not be a breakpoint

>>> inserted at the syscall instruction location.

>>>

>>> The expected outcome is that we step a single instruction and the 

>>> resulting PC

>>> points at the next instruction ($pc + 4 in aarch64).

>>>

>>> What i see is that we end up in $pc + 8, that is, one instruction 

>>> further than

>>> we should've been.

>>>

>>> This happens because, when forking/vforking, the child exits too 

>>> quickly (in the

>>> particular case of gdb.base/step-over-syscall.exp) and a SIGCHLD 

>>> signal is

>>> issued to the parent. Sometimes this signal arrives before GDB is 

>>> done handling

>>> the fork/vfork situation. Usually in step 2 for fork and step 3 for 

>>> vfork.

>>>

>>> In turn, this causes GDB to handle that SIGCHLD as a random signal 

>>> that must be

>>> passed to the inferior, which is correct. But this particular code in 

>>> infrun.c

>>> doesn't cope with this particular situation and GDB inserts a HP 

>>> step-resume

>>> breakpoint and registers its wish for the inferior to be stepped yet 

>>> again.

>>>

>>> Thus we end up in $pc + 8, even though we shouldn't have stepped 

>>> again in this

>>> particular situation.

>>

>> I’ve tried debugging this issue in the past, and didn’t figure out the 

>> issue.

>> This _sounds_ like a reasonable explanation of why.

>>

>>>

>>> The delivery of SIGCHLD in between fork/vfork handling steps is 

>>> sensitive to

>>> timing. If we tweak things a little, enable debugging output or do 

>>> any other

>>> thing that affects the timing, we may not see this behavior.

>>>

>>> This bug doesn't show up as failures in 

>>> gdb.base/step-over-syscall.exp due to

>>> the way the test is written.  As long as the PC is the same from the 

>>> previous

>>> test to the next, GDB thinks it is good. The proposed patch doesn't 

>>> cause a

>>> change in the number of PASSes/FAIL's.

>>>

>>

>> Is it possible to add an explicit test to test for the issue? 

>> (possibly as a new

>> .exp file).

> 

> I'll have to think about that. The timing-sensitive nature of this 

> problem is within the kernel. The kernel lets the child process run and 

> then notifies the parent. I can delay the child, but i can't delay the 

> kernel notifying the parent.

> 

> If i could defer the kernel notification to the parent enough so the 

> child exits meanwhile, that would work.

> 

>>

>>

>>> The proposed patch adds a variable waiting_for_fork_sigtrap that gets 

>>> set just

>>> before the last step of fork/vfork handling and gets reset when we 

>>> end up

>>> seeing that SIGTRAP we wanted.

>>>

>>> This variable is used in the random_signal handling of 

>>> handle_signal_stop,

>>> where there are a couple conditional blocks that handle signals 

>>> reaching the

>>> inferior at an unexpected time. If waiting_for_fork_sigtrap is set, 

>>> we don't

>>> set GDB to do the additional single-step it wants to do.

>>>

>>> I gave this a thought and tried to find a better place to put the 

>>> check in,

>>> but this seemed the most appropriate place to me.

>>>

>>> I also tried to solve this without having to introduce a new variable.

>>> But it seems we can't distinguish between a random signal reaching 

>>> GDB in

>>> the middle of single-stepping and a random signal reaching GDB in the 

>>> middle

>>> of single-stepping AND handling fork/vfork.

>>>

>>> Thoughts?

>>>

>>> gdb/ChangeLog:

>>>

>>> 2019-12-30  Luis Machado  <luis.machado@linaro.org>

>>>

>>>     * inferior.h (class inferior) <waiting_for_fork_sigtrap>: New 

>>> member.

>>>     Set to false by default.

>>>     * infrun.c (handle_inferior_event): Set waiting_for_fork_sigtrap 

>>> when

>>>     a fork or vfork_done is detected.

>>>     (handle_signal_stop): Don't step again if 

>>> waiting_for_fork_sigtrap is

>>>     true.

>>>     Reset waiting_for_fork_sigtrap to false when nexti/stepi is 

>>> detected.

>>>

>>> Change-Id: I2849e0dc49ad9c0a026daa8ced4610aa0ddbe637

>>> ---

>>> gdb/inferior.h | 12 ++++++++++++

>>> gdb/infrun.c   | 31 +++++++++++++++++++++++++++++--

>>> 2 files changed, 41 insertions(+), 2 deletions(-)

>>>

>>> diff --git a/gdb/inferior.h b/gdb/inferior.h

>>> index 3bd9e8c3d7..2c23d7302b 100644

>>> --- a/gdb/inferior.h

>>> +++ b/gdb/inferior.h

>>> @@ -447,6 +447,18 @@ public:

>>>       exits or execs.  */

>>>    bool pending_detach = false;

>>>

>>> +  /* True if we are still waiting for the final SIGTRAP when handling

>>> +     fork or vfork events.  This is used in case a signal gets 

>>> caught amidst

>>> +     handling fork and vfork events.

>>> +

>>> +     For fork events we first get a PTRACE_EVENT_FORK, then 

>>> SINGLESTEP the

>>> +     process and then get a SIGTRAP.

>>> +

>>> +     For vfork events we first get a PTRACE_EVENT_VFORK, then we 

>>> CONTINUE

>>> +     the process, get a PTRACE_EVENT_VFORK_DONE event, SINGLESTEP 

>>> the process

>>> +     again and then get a SIGTRAP.  */

>>> +  bool waiting_for_fork_sigtrap = false;

>>> +

>>

>>

>> What happens with a multithreaded process that forks/vforks? Should 

>> this bool be

>> held per thread instead? (A test would be nice to have here).

>>

>>

> 

> I'll have to try that. Unlike waiting for vfork-done, where the parent 

> can't move until the child is done with the shared memory region, 

> waiting for this final SIGTRAP doesn't have this particular restriction.

> 

> Maybe Pedro has a better insight into this particular case.

> 

>>>    /* True if this inferior is a vfork parent waiting for a vfork child

>>>       not under our control to be done with the shared memory region,

>>>       either by exiting or execing.  */

>>> diff --git a/gdb/infrun.c b/gdb/infrun.c

>>> index 7ddd21dd09..c3316a891a 100644

>>> --- a/gdb/infrun.c

>>> +++ b/gdb/infrun.c

>>> @@ -4937,6 +4937,10 @@ Cannot fill $_exitsignal with the correct 

>>> signal number.\n"));

>>>     struct regcache *regcache = get_thread_regcache (ecs->event_thread);

>>>     struct gdbarch *gdbarch = regcache->arch ();

>>>

>>> +    if (ecs->ws.kind == TARGET_WAITKIND_FORKED)

>>> +      /* We should see a SIGTRAP next time we stop.  */

>>> +      current_inferior ()->waiting_for_fork_sigtrap = true;

>>> +

>>>     /* If checking displaced stepping is supported, and thread

>>>        ecs->ptid is displaced stepping.  */

>>>     if (displaced_step_in_progress_thread (ecs->event_thread))

>>> @@ -5094,6 +5098,8 @@ Cannot fill $_exitsignal with the correct 

>>> signal number.\n"));

>>>

>>>        context_switch (ecs);

>>>

>>> +      /* We should see a SIGTRAP next time we stop.  */

>>> +      current_inferior ()->waiting_for_fork_sigtrap = true;

>>>        current_inferior ()->waiting_for_vfork_done = 0;

>>>        current_inferior ()->pspace->breakpoints_not_allowed = 0;

>>>

>>> @@ -5912,7 +5918,15 @@ handle_signal_stop (struct 

>>> execution_control_state *ecs)

>>>                                  "breakpoint\n");

>>>

>>>       insert_hp_step_resume_breakpoint_at_frame (frame);

>>> -      ecs->event_thread->step_after_step_resume_breakpoint = 1;

>>> +

>>> +      /* If we're still waiting for the final fork/vfork SIGTRAP and we

>>> +         got interrupted by a random signal, don't step further after

>>> +         returning from the signal handler.  */

>>> +      if (current_inferior ()->waiting_for_fork_sigtrap)

>>> +        ecs->event_thread->step_after_step_resume_breakpoint = 0;

>>> +      else

>>> +        ecs->event_thread->step_after_step_resume_breakpoint = 1;

>>> +

>>

>> Simpler to write as this?

>> ecs->event_thread->step_after_step_resume_breakpoint = 

>> !current_inferior ()->waiting_for_fork_sigtrap;

>>

>>

> 

> I'll change it.

> 

>>>       /* Reset trap_expected to ensure breakpoints are re-inserted.  */

>>>       ecs->event_thread->control.trap_expected = 0;

>>>

>>> @@ -5947,7 +5961,15 @@ handle_signal_stop (struct 

>>> execution_control_state *ecs)

>>>

>>>       clear_step_over_info ();

>>>       insert_hp_step_resume_breakpoint_at_frame (frame);

>>> -      ecs->event_thread->step_after_step_resume_breakpoint = 1;

>>> +

>>> +      /* If we're still waiting for the final fork/vfork SIGTRAP and we

>>> +         got interrupted by a random signal, don't step further after

>>> +         returning from the signal handler.  */

>>> +      if (current_inferior ()->waiting_for_fork_sigtrap)

>>> +        ecs->event_thread->step_after_step_resume_breakpoint = 0;

>>> +      else

>>> +        ecs->event_thread->step_after_step_resume_breakpoint = 1;

>>> +

>>

>> Ditto.

>>

> 

> I'll change it as well.
diff mbox series

Patch

diff --git a/gdb/inferior.h b/gdb/inferior.h
index 3bd9e8c3d7..2c23d7302b 100644
--- a/gdb/inferior.h
+++ b/gdb/inferior.h
@@ -447,6 +447,18 @@  public:
      exits or execs.  */
   bool pending_detach = false;
 
+  /* True if we are still waiting for the final SIGTRAP when handling
+     fork or vfork events.  This is used in case a signal gets caught amidst
+     handling fork and vfork events.
+
+     For fork events we first get a PTRACE_EVENT_FORK, then SINGLESTEP the
+     process and then get a SIGTRAP.
+
+     For vfork events we first get a PTRACE_EVENT_VFORK, then we CONTINUE
+     the process, get a PTRACE_EVENT_VFORK_DONE event, SINGLESTEP the process
+     again and then get a SIGTRAP.  */
+  bool waiting_for_fork_sigtrap = false;
+
   /* True if this inferior is a vfork parent waiting for a vfork child
      not under our control to be done with the shared memory region,
      either by exiting or execing.  */
diff --git a/gdb/infrun.c b/gdb/infrun.c
index 7ddd21dd09..c3316a891a 100644
--- a/gdb/infrun.c
+++ b/gdb/infrun.c
@@ -4937,6 +4937,10 @@  Cannot fill $_exitsignal with the correct signal number.\n"));
 	struct regcache *regcache = get_thread_regcache (ecs->event_thread);
 	struct gdbarch *gdbarch = regcache->arch ();
 
+	if (ecs->ws.kind == TARGET_WAITKIND_FORKED)
+	  /* We should see a SIGTRAP next time we stop.  */
+	  current_inferior ()->waiting_for_fork_sigtrap = true;
+
 	/* If checking displaced stepping is supported, and thread
 	   ecs->ptid is displaced stepping.  */
 	if (displaced_step_in_progress_thread (ecs->event_thread))
@@ -5094,6 +5098,8 @@  Cannot fill $_exitsignal with the correct signal number.\n"));
 
       context_switch (ecs);
 
+      /* We should see a SIGTRAP next time we stop.  */
+      current_inferior ()->waiting_for_fork_sigtrap = true;
       current_inferior ()->waiting_for_vfork_done = 0;
       current_inferior ()->pspace->breakpoints_not_allowed = 0;
 
@@ -5912,7 +5918,15 @@  handle_signal_stop (struct execution_control_state *ecs)
                                 "breakpoint\n");
 
 	  insert_hp_step_resume_breakpoint_at_frame (frame);
-	  ecs->event_thread->step_after_step_resume_breakpoint = 1;
+
+	  /* If we're still waiting for the final fork/vfork SIGTRAP and we
+	     got interrupted by a random signal, don't step further after
+	     returning from the signal handler.  */
+	  if (current_inferior ()->waiting_for_fork_sigtrap)
+	    ecs->event_thread->step_after_step_resume_breakpoint = 0;
+	  else
+	    ecs->event_thread->step_after_step_resume_breakpoint = 1;
+
 	  /* Reset trap_expected to ensure breakpoints are re-inserted.  */
 	  ecs->event_thread->control.trap_expected = 0;
 
@@ -5947,7 +5961,15 @@  handle_signal_stop (struct execution_control_state *ecs)
 
 	  clear_step_over_info ();
 	  insert_hp_step_resume_breakpoint_at_frame (frame);
-	  ecs->event_thread->step_after_step_resume_breakpoint = 1;
+
+	  /* If we're still waiting for the final fork/vfork SIGTRAP and we
+	     got interrupted by a random signal, don't step further after
+	     returning from the signal handler.  */
+	  if (current_inferior ()->waiting_for_fork_sigtrap)
+	    ecs->event_thread->step_after_step_resume_breakpoint = 0;
+	  else
+	    ecs->event_thread->step_after_step_resume_breakpoint = 1;
+
 	  /* Reset trap_expected to ensure breakpoints are re-inserted.  */
 	  ecs->event_thread->control.trap_expected = 0;
 	  keep_going (ecs);
@@ -6691,6 +6713,11 @@  process_event_stop_test (struct execution_control_state *ecs)
          one instruction.  */
       if (debug_infrun)
 	 fprintf_unfiltered (gdb_stdlog, "infrun: stepi/nexti\n");
+
+      /* If we were waiting for a final fork/vfork SIGTRAP after a
+	 single-step, this should be it.  Clear the flag.  */
+      if (current_inferior ()->waiting_for_fork_sigtrap)
+	current_inferior ()->waiting_for_fork_sigtrap = false;
       end_stepping_range (ecs);
       return;
     }