Message ID | 20170307155054.5833-1-alex.bennee@linaro.org |
---|---|
Headers | show |
Series | MTTCG fix-ups for 2.9 | expand |
Hi Alex, On Tue, Mar 7, 2017 at 10:50 AM, Alex Bennée <alex.bennee@linaro.org> wrote: > Hi, > > This is the latest iteration of fixes for problems introduced by > MTTCG. Aside from the usual slew of addressing review comments and > applying tags I've also pulled in Yongbok Kim's MIPS patch (to replace > mine) and Paolo's VMEXIT fix for the problem reported by Alexander > Boettcher. > > Unless anyone shouts I'll wrap these up into a pull request for Peter > tomorrow. Can you also pick up this patch? I am not sure who else will. http://patchwork.ozlabs.org/patch/733293/ Thanks, -- Pranith
On Wed, Mar 8, 2017 at 2:39 AM, Alex Bennée <alex.bennee@linaro.org> wrote: > > Pranith Kumar <bobby.prani@gmail.com> writes: > >> Hi Alex, >> >> On Tue, Mar 7, 2017 at 10:50 AM, Alex Bennée <alex.bennee@linaro.org> wrote: >>> Hi, >>> >>> This is the latest iteration of fixes for problems introduced by >>> MTTCG. Aside from the usual slew of addressing review comments and >>> applying tags I've also pulled in Yongbok Kim's MIPS patch (to replace >>> mine) and Paolo's VMEXIT fix for the problem reported by Alexander >>> Boettcher. >>> >>> Unless anyone shouts I'll wrap these up into a pull request for Peter >>> tomorrow. >> >> >> Can you also pick up this patch? I am not sure who else will. >> >> http://patchwork.ozlabs.org/patch/733293/ > > Does the problem only exhibit itself with --accel tcg,thread=multi? I > was figuring it would go in with a larger x86 MTTCG enabling patch set > post 2.9. If it can occur with the defaults now I'll happily add it > to the pullreq I'm preparing. > It happens only with thread=multi, so I guess later on is OK. Thanks, -- Pranith
Hi, > From: mttcg-request@listserver.greensocs.com [mailto:mttcg-request@listserver.greensocs.com] > > The next thing on my list it to look at the icount problems and review > Paolo's fixes for it. However those fixes should go in a separate > series and I assume via Paolo's tree. Do you mean those problems that completely broke icount? Are you going to fix it? > Alex Bennée (9): > vl/cpus: be smarter with icount and MTTCG > target/i386/cpu.h: declare TCG_GUEST_DEFAULT_MO > cpus.c: add additional error_report when !TARGET_SUPPORT_MTTCG > sparc/sparc64: grab BQL before calling cpu_check_irqs > s390x/misc_helper.c: wrap IO instructions in BQL > target/xtensa: hold BQL for interrupt processing > translate-all: exit cpu_restore_state early if translating > target/arm/helper: make it clear the EC field is also in hex > hw/intc/arm_gic: modernise the DPRINTF > > Paolo Bonzini (1): > target-i386: defer VMEXIT to do_interrupt > > Yongbok Kim (1): > target/mips: hold BQL for timer interrupts Pavel Dovgalyuk
Pavel Dovgalyuk <dovgaluk@ispras.ru> writes: > Hi, > >> From: mttcg-request@listserver.greensocs.com [mailto:mttcg-request@listserver.greensocs.com] >> >> The next thing on my list it to look at the icount problems and review >> Paolo's fixes for it. However those fixes should go in a separate >> series and I assume via Paolo's tree. > > Do you mean those problems that completely broke icount? Completely broke is a bit strong. Certainly I tested icount on my patches but I missed the pathological failure mode that led to the interaction between the BQL lock breaking patch and running a real guest. It didn't break icount so much as slow down guests so much they never got round to finishing their IRQ handling. That certainly seems to be fixed by Paolo's patches. > Are you going to fix it? Yes. It is certainly not intentional to regress icount and it needs to be fixed before we release 2.9. If you can point me towards the record/replay test cases I'll make sure I run them on the updated series. > > >> Alex Bennée (9): >> vl/cpus: be smarter with icount and MTTCG >> target/i386/cpu.h: declare TCG_GUEST_DEFAULT_MO >> cpus.c: add additional error_report when !TARGET_SUPPORT_MTTCG >> sparc/sparc64: grab BQL before calling cpu_check_irqs >> s390x/misc_helper.c: wrap IO instructions in BQL >> target/xtensa: hold BQL for interrupt processing >> translate-all: exit cpu_restore_state early if translating >> target/arm/helper: make it clear the EC field is also in hex >> hw/intc/arm_gic: modernise the DPRINTF >> >> Paolo Bonzini (1): >> target-i386: defer VMEXIT to do_interrupt >> >> Yongbok Kim (1): >> target/mips: hold BQL for timer interrupts > > > Pavel Dovgalyuk -- Alex Bennée
> From: Alex Bennée [mailto:alex.bennee@linaro.org] > Pavel Dovgalyuk <dovgaluk@ispras.ru> writes: > >> From: mttcg-request@listserver.greensocs.com [mailto:mttcg- > request@listserver.greensocs.com] > >> > >> The next thing on my list it to look at the icount problems and review > >> Paolo's fixes for it. However those fixes should go in a separate > >> series and I assume via Paolo's tree. > > > > Do you mean those problems that completely broke icount? > > Completely broke is a bit strong. Certainly I tested icount on my > patches but I missed the pathological failure mode that led to the > interaction between the BQL lock breaking patch and running a real > guest. It didn't break icount so much as slow down guests so much they > never got round to finishing their IRQ handling. That might look as slowing down in regular icount mode. But it becomes non-deterministic in record/replay mode. Therefore none of the recorded scenarios may be replayed. I tried the simplest command lines: qemu-system-i386.exe -icount shift=7,rr=record,rrfile=replay.bin -net none qemu-system-i386.exe -icount shift=7,rr=replay,rrfile=replay.bin -net none First one was used to record execution until BIOS will print its startup info. The second one is for replay, but it can replay about 200000 instructions until the problems with icount manifest itself - the execution does not proceed. > That certainly seems to be fixed by Paolo's patches. > > > Are you going to fix it? > > Yes. It is certainly not intentional to regress icount and it needs to > be fixed before we release 2.9. > > If you can point me towards the record/replay test cases I'll make sure > I run them on the updated series. Pavel Dovgalyuk
Pavel Dovgalyuk <dovgaluk@ispras.ru> writes: >> From: Alex Bennée [mailto:alex.bennee@linaro.org] >> Pavel Dovgalyuk <dovgaluk@ispras.ru> writes: >> >> From: mttcg-request@listserver.greensocs.com [mailto:mttcg- >> request@listserver.greensocs.com] >> >> >> >> The next thing on my list it to look at the icount problems and review >> >> Paolo's fixes for it. However those fixes should go in a separate >> >> series and I assume via Paolo's tree. >> > >> > Do you mean those problems that completely broke icount? >> >> Completely broke is a bit strong. Certainly I tested icount on my >> patches but I missed the pathological failure mode that led to the >> interaction between the BQL lock breaking patch and running a real >> guest. It didn't break icount so much as slow down guests so much they >> never got round to finishing their IRQ handling. > > That might look as slowing down in regular icount mode. > But it becomes non-deterministic in record/replay mode. > Therefore none of the recorded scenarios may be replayed. > > I tried the simplest command lines: > > qemu-system-i386.exe -icount shift=7,rr=record,rrfile=replay.bin -net > none Running this against 2421f381dc (pre-merge of MTTCG) from the source tree generates no output. My command is on Linux: /x86_64-softmmu/qemu-system-x86_64 -icount shift=7,rr=record,rrfile=replay.bin -net none Do I need to specify the BIOS manually? > > qemu-system-i386.exe -icount shift=7,rr=replay,rrfile=replay.bin -net none > > First one was used to record execution until BIOS will print its startup info. > The second one is for replay, but it can replay about 200000 instructions > until the problems with icount manifest itself - the execution does > not proceed. What error message does it give? How do you know how many instructions have been executed? I ran the following test on both pre-mttcg-merge and my current HEAD which includes Paolo's fix series: ./arm-softmmu/qemu-system-arm -machine type=virt \ -display none -smp 1 -m 4096 -cpu cortex-a15 \ -kernel ../images/aarch32-current-linux-initrd-guest.img -append "console=ttyAMA0" -serial mon:stdio \ -net none \ -icount shift=7,rr=record,rrfile=replay.bin And then: ./arm-softmmu/qemu-system-arm -machine type=virt \ -display none -smp 1 -m 4096 -cpu cortex-a15 \ -kernel ../images/aarch32-current-linux-initrd-guest.img -append "console=ttyAMA0" -serial mon:stdio \ -net none \ -icount shift=7,rr=replay,rrfile=replay.bin And they both ran the same. However I kept running into: [ 3.542408] RPC: Registered tcp NFSv4.1 backchannel transport module. qemu-system-arm: Missing character write event in the replay log This seems to be a pre-existing > >> That certainly seems to be fixed by Paolo's patches. >> >> > Are you going to fix it? >> >> Yes. It is certainly not intentional to regress icount and it needs to >> be fixed before we release 2.9. >> >> If you can point me towards the record/replay test cases I'll make sure >> I run them on the updated series. > > > Pavel Dovgalyuk -- Alex Bennée
> From: Alex Bennée [mailto:alex.bennee@linaro.org] > Pavel Dovgalyuk <dovgaluk@ispras.ru> writes: > >> From: Alex Bennée [mailto:alex.bennee@linaro.org] > >> Pavel Dovgalyuk <dovgaluk@ispras.ru> writes: > >> >> From: mttcg-request@listserver.greensocs.com [mailto:mttcg- > >> request@listserver.greensocs.com] > >> >> > >> >> The next thing on my list it to look at the icount problems and review > >> >> Paolo's fixes for it. However those fixes should go in a separate > >> >> series and I assume via Paolo's tree. > >> > > >> > Do you mean those problems that completely broke icount? > >> > >> Completely broke is a bit strong. Certainly I tested icount on my > >> patches but I missed the pathological failure mode that led to the > >> interaction between the BQL lock breaking patch and running a real > >> guest. It didn't break icount so much as slow down guests so much they > >> never got round to finishing their IRQ handling. > > > > That might look as slowing down in regular icount mode. > > But it becomes non-deterministic in record/replay mode. > > Therefore none of the recorded scenarios may be replayed. > > > > I tried the simplest command lines: > > > > qemu-system-i386.exe -icount shift=7,rr=record,rrfile=replay.bin -net > > none > > Running this against 2421f381dc (pre-merge of MTTCG) from the source > tree generates no output. My command is on Linux: > > /x86_64-softmmu/qemu-system-x86_64 -icount shift=7,rr=record,rrfile=replay.bin -net none > > Do I need to specify the BIOS manually? No, this command line works well for me. BIOS executes and shows some messages. Do you have any graphical output for QEMU? > > qemu-system-i386.exe -icount shift=7,rr=replay,rrfile=replay.bin -net none > > > > First one was used to record execution until BIOS will print its startup info. > > The second one is for replay, but it can replay about 200000 instructions > > until the problems with icount manifest itself - the execution does > > not proceed. > > What error message does it give? How do you know how many instructions > have been executed? It hangs after executing about 200000 instructions. I checked -d exec logs. > I ran the following test on both pre-mttcg-merge and my current HEAD > which includes Paolo's fix series: > > ./arm-softmmu/qemu-system-arm -machine type=virt \ > -display none -smp 1 -m 4096 -cpu cortex-a15 \ > -kernel ../images/aarch32-current-linux-initrd-guest.img > -append "console=ttyAMA0" -serial mon:stdio \ > -net none \ > -icount shift=7,rr=record,rrfile=replay.bin > > And then: > > ./arm-softmmu/qemu-system-arm -machine type=virt \ > -display none -smp 1 -m 4096 -cpu cortex-a15 \ > -kernel ../images/aarch32-current-linux-initrd-guest.img > -append "console=ttyAMA0" -serial mon:stdio \ > -net none \ > -icount shift=7,rr=replay,rrfile=replay.bin > > And they both ran the same. However I kept running into: > > [ 3.542408] RPC: Registered tcp NFSv4.1 backchannel transport module. > qemu-system-arm: Missing character write event in the replay log > > This seems to be a pre-existing Does it mean that qemu-arm platform includes some serial devices that were not detected by the replay? Pavel Dovgalyuk
Pavel Dovgalyuk <dovgaluk@ispras.ru> writes: >> From: Alex Bennée [mailto:alex.bennee@linaro.org] >> Pavel Dovgalyuk <dovgaluk@ispras.ru> writes: >> >> From: Alex Bennée [mailto:alex.bennee@linaro.org] >> >> Pavel Dovgalyuk <dovgaluk@ispras.ru> writes: >> >> >> From: mttcg-request@listserver.greensocs.com [mailto:mttcg- >> >> request@listserver.greensocs.com] >> >> >> >> >> >> The next thing on my list it to look at the icount problems and review >> >> >> Paolo's fixes for it. However those fixes should go in a separate >> >> >> series and I assume via Paolo's tree. >> >> > >> >> > Do you mean those problems that completely broke icount? >> >> >> >> Completely broke is a bit strong. Certainly I tested icount on my >> >> patches but I missed the pathological failure mode that led to the >> >> interaction between the BQL lock breaking patch and running a real >> >> guest. It didn't break icount so much as slow down guests so much they >> >> never got round to finishing their IRQ handling. >> > >> > That might look as slowing down in regular icount mode. >> > But it becomes non-deterministic in record/replay mode. >> > Therefore none of the recorded scenarios may be replayed. >> > >> > I tried the simplest command lines: >> > >> > qemu-system-i386.exe -icount shift=7,rr=record,rrfile=replay.bin -net >> > none >> >> Running this against 2421f381dc (pre-merge of MTTCG) from the source >> tree generates no output. My command is on Linux: >> >> /x86_64-softmmu/qemu-system-x86_64 -icount shift=7,rr=record,rrfile=replay.bin -net none >> >> Do I need to specify the BIOS manually? > > No, this command line works well for me. BIOS executes and shows some messages. > Do you have any graphical output for QEMU? Ahh I was running remotely. OK I can now get output on the graphics display and the described hang. I'll have a look at it. > >> > qemu-system-i386.exe -icount shift=7,rr=replay,rrfile=replay.bin -net none >> > >> > First one was used to record execution until BIOS will print its startup info. >> > The second one is for replay, but it can replay about 200000 instructions >> > until the problems with icount manifest itself - the execution does >> > not proceed. >> >> What error message does it give? How do you know how many instructions >> have been executed? > > It hangs after executing about 200000 instructions. > I checked -d exec logs. > >> I ran the following test on both pre-mttcg-merge and my current HEAD >> which includes Paolo's fix series: >> >> ./arm-softmmu/qemu-system-arm -machine type=virt \ >> -display none -smp 1 -m 4096 -cpu cortex-a15 \ >> -kernel ../images/aarch32-current-linux-initrd-guest.img >> -append "console=ttyAMA0" -serial mon:stdio \ >> -net none \ >> -icount shift=7,rr=record,rrfile=replay.bin >> >> And then: >> >> ./arm-softmmu/qemu-system-arm -machine type=virt \ >> -display none -smp 1 -m 4096 -cpu cortex-a15 \ >> -kernel ../images/aarch32-current-linux-initrd-guest.img >> -append "console=ttyAMA0" -serial mon:stdio \ >> -net none \ >> -icount shift=7,rr=replay,rrfile=replay.bin >> >> And they both ran the same. However I kept running into: >> >> [ 3.542408] RPC: Registered tcp NFSv4.1 backchannel transport module. >> qemu-system-arm: Missing character write event in the replay log >> >> This seems to be a pre-existing > > Does it mean that qemu-arm platform includes some serial devices that were > not detected by the replay? It's the standard ARM platform serial port. When I read replay.txt is said: * Supports i386, x86_64, and ARM hardware platforms. Should we add some qualifications about which machine types are supported? What is you ARM test case for record/replay? -- Alex Bennée
> From: Alex Bennée [mailto:alex.bennee@linaro.org] > Pavel Dovgalyuk <dovgaluk@ispras.ru> writes: > >> From: Alex Bennée [mailto:alex.bennee@linaro.org] > >> Pavel Dovgalyuk <dovgaluk@ispras.ru> writes: > > > >> I ran the following test on both pre-mttcg-merge and my current HEAD > >> which includes Paolo's fix series: > >> > >> ./arm-softmmu/qemu-system-arm -machine type=virt \ > >> -display none -smp 1 -m 4096 -cpu cortex-a15 \ > >> -kernel ../images/aarch32-current-linux-initrd-guest.img > >> -append "console=ttyAMA0" -serial mon:stdio \ > >> -net none \ > >> -icount shift=7,rr=record,rrfile=replay.bin > >> > >> And then: > >> > >> ./arm-softmmu/qemu-system-arm -machine type=virt \ > >> -display none -smp 1 -m 4096 -cpu cortex-a15 \ > >> -kernel ../images/aarch32-current-linux-initrd-guest.img > >> -append "console=ttyAMA0" -serial mon:stdio \ > >> -net none \ > >> -icount shift=7,rr=replay,rrfile=replay.bin > >> > >> And they both ran the same. However I kept running into: > >> > >> [ 3.542408] RPC: Registered tcp NFSv4.1 backchannel transport module. > >> qemu-system-arm: Missing character write event in the replay log > >> > >> This seems to be a pre-existing > > > > Does it mean that qemu-arm platform includes some serial devices that were > > not detected by the replay? > > It's the standard ARM platform serial port. When I read replay.txt is > said: > > * Supports i386, x86_64, and ARM hardware platforms. > > Should we add some qualifications about which machine types are > supported? What is you ARM test case for record/replay? I tested on vexpress-a9 platform with Debian wheezy. Pavel Dovgalyuk
Pavel Dovgalyuk <dovgaluk@ispras.ru> writes: >> From: Alex Bennée [mailto:alex.bennee@linaro.org] >> Pavel Dovgalyuk <dovgaluk@ispras.ru> writes: >> >> From: Alex Bennée [mailto:alex.bennee@linaro.org] >> >> Pavel Dovgalyuk <dovgaluk@ispras.ru> writes: >> > >> >> I ran the following test on both pre-mttcg-merge and my current HEAD >> >> which includes Paolo's fix series: >> >> >> >> ./arm-softmmu/qemu-system-arm -machine type=virt \ >> >> -display none -smp 1 -m 4096 -cpu cortex-a15 \ >> >> -kernel ../images/aarch32-current-linux-initrd-guest.img >> >> -append "console=ttyAMA0" -serial mon:stdio \ >> >> -net none \ >> >> -icount shift=7,rr=record,rrfile=replay.bin >> >> >> >> And then: >> >> >> >> ./arm-softmmu/qemu-system-arm -machine type=virt \ >> >> -display none -smp 1 -m 4096 -cpu cortex-a15 \ >> >> -kernel ../images/aarch32-current-linux-initrd-guest.img >> >> -append "console=ttyAMA0" -serial mon:stdio \ >> >> -net none \ >> >> -icount shift=7,rr=replay,rrfile=replay.bin >> >> >> >> And they both ran the same. However I kept running into: >> >> >> >> [ 3.542408] RPC: Registered tcp NFSv4.1 backchannel transport module. >> >> qemu-system-arm: Missing character write event in the replay log >> >> >> >> This seems to be a pre-existing >> > >> > Does it mean that qemu-arm platform includes some serial devices that were >> > not detected by the replay? >> >> It's the standard ARM platform serial port. When I read replay.txt is >> said: >> >> * Supports i386, x86_64, and ARM hardware platforms. >> >> Should we add some qualifications about which machine types are >> supported? What is you ARM test case for record/replay? > > I tested on vexpress-a9 platform with Debian wheezy. Thanks for that. I now have a test case that I can reproduce failures on without needing graphics. I've been investigating if there are any problems with the timer processing now they have been moved into the TCG thread. The record stage seems to work fine but I'm having difficulty figuring out why playback freezes. It seems we get to a point where we are stuck waiting for a suspiciously exact timer deadline: replay_account_executed_instructions: instructions_count reached zero handle_icount_deadline: deadline=471077745280 replay_account_executed_instructions: instructions_count reached zero handle_icount_deadline: deadline=10000000 [Switching to Thread 0x7fff7a7be700 (LWP 733)] Thread 3 "qemu-system-arm" hit Breakpoint 1, handle_icount_deadline () at /home/alex/lsrc/qemu/qemu.git/cpus.c:1184 1184 fprintf(stderr,"%s: no change to deadline=%ld for %ld\n", (gdb) c Continuing. handle_icount_deadline: no change to deadline=10000000 for 11 Thread 3 "qemu-system-arm" hit Breakpoint 1, handle_icount_deadline () at /home/alex/lsrc/qemu/qemu.git/cpus.c:1184 1184 fprintf(stderr,"%s: no change to deadline=%ld for %ld\n", (gdb) c 10 Will ignore next 9 crossings of breakpoint 1. Continuing. handle_icount_deadline: no change to deadline=10000000 for 12 handle_icount_deadline: no change to deadline=10000000 for 13 handle_icount_deadline: no change to deadline=10000000 for 14 handle_icount_deadline: no change to deadline=10000000 for 15 handle_icount_deadline: no change to deadline=10000000 for 16 handle_icount_deadline: no change to deadline=10000000 for 17 handle_icount_deadline: no change to deadline=10000000 for 18 handle_icount_deadline: no change to deadline=10000000 for 19 handle_icount_deadline: no change to deadline=10000000 for 20 handle_icount_deadline: no change to deadline=10000000 for 21 (gdb) p replay_state $1 = {cached_clock = {1490191319270134000, 0}, current_step = 267869922, instructions_count = 0, data_kind = 12, has_unread_data = 1, file_offset = 0, block_request_id = 0} But the timers are all enabled: (gdb) qemu timers Processing Realtime timers clock QEMU_CLOCK_REALTIME is enabled:true, last:-9223372036854775808 Processing Virtual timers clock QEMU_CLOCK_VIRTUAL is enabled:true, last:-9223372036854775808 timer 34297350016/1 (cb:0x555555a2e952 <ptimer_tick>) timer 503290000000/1000000 (cb:0x555555bd4d1d <ra_timer_handler>) Processing Host timers clock QEMU_CLOCK_HOST is enabled:true, last:1490191319270134000 Processing Virtual RT timers clock QEMU_CLOCK_VIRTUAL_RT is enabled:true, last:-9223372036854775808 One area I wanted to look back at was comparing the record trace from pre-mttcg-merge to now to see if any information was missing. However the bin file has quite a lot of noise in it from changing fields so I was wondering do you have any sort of dumper tool for comparing the traces? If not is the format of the trace file specified anywhere? I found: #define REPLAY_VERSION 0xe02005 but couldn't find any more than that. Is the format basically just described by the calls to replay_put_byte? -- Alex Bennée
> From: mttcg-request@listserver.greensocs.com [mailto:mttcg-request@listserver.greensocs.com] > Pavel Dovgalyuk <dovgaluk@ispras.ru> writes: > > >> From: Alex Bennée [mailto:alex.bennee@linaro.org] > >> Pavel Dovgalyuk <dovgaluk@ispras.ru> writes: > >> >> From: Alex Bennée [mailto:alex.bennee@linaro.org] > >> >> Pavel Dovgalyuk <dovgaluk@ispras.ru> writes: > >> > > >> >> I ran the following test on both pre-mttcg-merge and my current HEAD > >> >> which includes Paolo's fix series: > >> >> > >> >> ./arm-softmmu/qemu-system-arm -machine type=virt \ > >> >> -display none -smp 1 -m 4096 -cpu cortex-a15 \ > >> >> -kernel ../images/aarch32-current-linux-initrd-guest.img > >> >> -append "console=ttyAMA0" -serial mon:stdio \ > >> >> -net none \ > >> >> -icount shift=7,rr=record,rrfile=replay.bin > >> >> > >> >> And then: > >> >> > >> >> ./arm-softmmu/qemu-system-arm -machine type=virt \ > >> >> -display none -smp 1 -m 4096 -cpu cortex-a15 \ > >> >> -kernel ../images/aarch32-current-linux-initrd-guest.img > >> >> -append "console=ttyAMA0" -serial mon:stdio \ > >> >> -net none \ > >> >> -icount shift=7,rr=replay,rrfile=replay.bin > >> >> > >> >> And they both ran the same. However I kept running into: > >> >> > >> >> [ 3.542408] RPC: Registered tcp NFSv4.1 backchannel transport module. > >> >> qemu-system-arm: Missing character write event in the replay log > >> >> > >> >> This seems to be a pre-existing > >> > > >> > Does it mean that qemu-arm platform includes some serial devices that were > >> > not detected by the replay? > >> > >> It's the standard ARM platform serial port. When I read replay.txt is > >> said: > >> > >> * Supports i386, x86_64, and ARM hardware platforms. > >> > >> Should we add some qualifications about which machine types are > >> supported? What is you ARM test case for record/replay? > > > > I tested on vexpress-a9 platform with Debian wheezy. > > Thanks for that. I now have a test case that I can reproduce failures on > without needing graphics. > > I've been investigating if there are any problems with the timer > processing now they have been moved into the TCG thread. The record > stage seems to work fine but I'm having difficulty figuring out why > playback freezes. It seems we get to a point where we are stuck waiting > for a suspiciously exact timer deadline: I've encountered the following behavior at replay stage: - replay takes some instructions to execute (qemu_icount += counter) - virtual timer is fired - replay puts back unexecuted instructions (qemu_icount -= counter) But virtual timer cannot take in account non-executed instructions (counter) and therefore it reads only qemu_icount, which provides incorrect time. > But the timers are all enabled: > > (gdb) qemu timers > Processing Realtime timers > clock QEMU_CLOCK_REALTIME is enabled:true, last:-9223372036854775808 > Processing Virtual timers > clock QEMU_CLOCK_VIRTUAL is enabled:true, last:-9223372036854775808 > timer 34297350016/1 (cb:0x555555a2e952 <ptimer_tick>) > timer 503290000000/1000000 (cb:0x555555bd4d1d <ra_timer_handler>) > Processing Host timers > clock QEMU_CLOCK_HOST is enabled:true, last:1490191319270134000 > Processing Virtual RT timers > clock QEMU_CLOCK_VIRTUAL_RT is enabled:true, last:-9223372036854775808 Timers are processed only at checkpoints recorded in the log. When replay is stuck, probably there is a pending checkpoint in the log and pending instructions in CPU (because iothread breaks its synchronization). > One area I wanted to look back at was comparing the record trace from > pre-mttcg-merge to now to see if any information was missing. However I usually use in_asm and exec logs and also add some logging at replay checkpoints. > the bin file has quite a lot of noise in it from changing fields so I > was wondering do you have any sort of dumper tool for comparing the > traces? If not is the format of the trace file specified anywhere? Usually you cannot compare two different record/replay logs, because there is no synchronization of the timers in different recording runs and therefore you'll get totally different logs. Pavel Dovgalyuk
Pavel Dovgalyuk <dovgaluk@ispras.ru> writes: >> From: mttcg-request@listserver.greensocs.com [mailto:mttcg-request@listserver.greensocs.com] >> Pavel Dovgalyuk <dovgaluk@ispras.ru> writes: >> >> >> From: Alex Bennée [mailto:alex.bennee@linaro.org] >> >> Pavel Dovgalyuk <dovgaluk@ispras.ru> writes: >> >> >> From: Alex Bennée [mailto:alex.bennee@linaro.org] >> >> >> Pavel Dovgalyuk <dovgaluk@ispras.ru> writes: >> >> > >> >> >> I ran the following test on both pre-mttcg-merge and my current HEAD >> >> >> which includes Paolo's fix series: >> >> >> >> >> >> ./arm-softmmu/qemu-system-arm -machine type=virt \ >> >> >> -display none -smp 1 -m 4096 -cpu cortex-a15 \ >> >> >> -kernel ../images/aarch32-current-linux-initrd-guest.img >> >> >> -append "console=ttyAMA0" -serial mon:stdio \ >> >> >> -net none \ >> >> >> -icount shift=7,rr=record,rrfile=replay.bin >> >> >> >> >> >> And then: >> >> >> >> >> >> ./arm-softmmu/qemu-system-arm -machine type=virt \ >> >> >> -display none -smp 1 -m 4096 -cpu cortex-a15 \ >> >> >> -kernel ../images/aarch32-current-linux-initrd-guest.img >> >> >> -append "console=ttyAMA0" -serial mon:stdio \ >> >> >> -net none \ >> >> >> -icount shift=7,rr=replay,rrfile=replay.bin >> >> >> >> >> >> And they both ran the same. However I kept running into: >> >> >> >> >> >> [ 3.542408] RPC: Registered tcp NFSv4.1 backchannel transport module. >> >> >> qemu-system-arm: Missing character write event in the replay log >> >> >> >> >> >> This seems to be a pre-existing >> >> > >> >> > Does it mean that qemu-arm platform includes some serial devices that were >> >> > not detected by the replay? >> >> >> >> It's the standard ARM platform serial port. When I read replay.txt is >> >> said: >> >> >> >> * Supports i386, x86_64, and ARM hardware platforms. >> >> >> >> Should we add some qualifications about which machine types are >> >> supported? What is you ARM test case for record/replay? >> > >> > I tested on vexpress-a9 platform with Debian wheezy. >> >> Thanks for that. I now have a test case that I can reproduce failures on >> without needing graphics. >> >> I've been investigating if there are any problems with the timer >> processing now they have been moved into the TCG thread. The record >> stage seems to work fine but I'm having difficulty figuring out why >> playback freezes. It seems we get to a point where we are stuck waiting >> for a suspiciously exact timer deadline: > > I've encountered the following behavior at replay stage: > - replay takes some instructions to execute (qemu_icount += counter) > - virtual timer is fired This is the virtual timer based on icount not the virtual rt timer? So under the new scheme of being processed in the vCPU loop we should only fire this one once all execution is done (although you may exit the loop early before icount is exhausted). > - replay puts back unexecuted instructions (qemu_icount -= counter) > > But virtual timer cannot take in account non-executed instructions (counter) and > therefore it reads only qemu_icount, which provides incorrect time. > >> But the timers are all enabled: >> >> (gdb) qemu timers >> Processing Realtime timers >> clock QEMU_CLOCK_REALTIME is enabled:true, last:-9223372036854775808 >> Processing Virtual timers >> clock QEMU_CLOCK_VIRTUAL is enabled:true, last:-9223372036854775808 >> timer 34297350016/1 (cb:0x555555a2e952 <ptimer_tick>) >> timer 503290000000/1000000 (cb:0x555555bd4d1d <ra_timer_handler>) >> Processing Host timers >> clock QEMU_CLOCK_HOST is enabled:true, last:1490191319270134000 >> Processing Virtual RT timers >> clock QEMU_CLOCK_VIRTUAL_RT is enabled:true, last:-9223372036854775808 > > Timers are processed only at checkpoints recorded in the log. > When replay is stuck, probably there is a pending checkpoint in the log > and pending instructions in CPU (because iothread breaks its > synchronization). How does this work for instructions that sleep (like WFI)? The WFI will cause us to exit the loop but I assume we need a "real" timer to trigger a firing of whatever the WFI is waiting for as time according to icount isn't advancing. > >> One area I wanted to look back at was comparing the record trace from >> pre-mttcg-merge to now to see if any information was missing. However > > I usually use in_asm and exec logs and also add some logging at replay checkpoints. > >> the bin file has quite a lot of noise in it from changing fields so I >> was wondering do you have any sort of dumper tool for comparing the >> traces? If not is the format of the trace file specified anywhere? > > Usually you cannot compare two different record/replay logs, because there > is no synchronization of the timers in different recording runs and > therefore you'll get totally different logs. I wasn't looking to have identical traces but I was interested in the patterns in the logs and how they have changed. For example: HEADER: version 0xe02005 | HEADER: version 0xe02006 1:EVENT_CP_INIT 1:EVENT_CP_INIT 2:EVENT_CLOCK_HOST 0x14afc6687f1fc738 | 2:EVENT_CLOCK_HOST 0x14b000644239c8d8 3:EVENT_CLOCK_HOST 0x14afc66c6ac91858 | 3:EVENT_CLOCK_HOST 0x14b00064423a1310 4:EVENT_CP_RESET no additional data 4:EVENT_CP_RESET no additional data 5:EVENT_CP_CLOCK_WARP_START no additional data 5:EVENT_CP_CLOCK_WARP_START no additional data 6:EVENT_CP_CLOCK_VIRTUAL no additional data | 6:EVENT_CP_CLOCK_WARP_START no additional data 7:EVENT_CP_CLOCK_WARP_START no additional data | 7:EVENT_CP_CLOCK_VIRTUAL no additional data 8:EVENT_CP_CLOCK_VIRTUAL no additional data | 8:EVENT_CP_CLOCK_WARP_ACCOUNT no additional data 9:EVENT_CP_CLOCK_WARP_START no additional data | 9:EVENT_CP_CLOCK_WARP_ACCOUNT no additional data 10:EVENT_CP_CLOCK_VIRTUAL no additional data | 10:EVENT_INSTRUCTION 0x1000000 11:EVENT_CP_CLOCK_WARP_ACCOUNT no additional data 11:EVENT_CP_CLOCK_WARP_ACCOUNT no additional data 12:EVENT_CP_CLOCK_WARP_ACCOUNT no additional data | 12:EVENT_INSTRUCTION 0x1000000 13:EVENT_INSTRUCTION 0x1000000 | 13:EVENT_CP_CLOCK_WARP_ACCOUNT no additional data There is a difference to the pattern at the start as the initial checkpoints are layed down. I guess this is due to the interaction of the main thread with the vCPU as they bounce the BQL with each other. I'm wondering why we don't just do all the warp timer management in the vCPU thread as well? -- Alex Bennée
> From: mttcg-request@listserver.greensocs.com [mailto:mttcg-request@listserver.greensocs.com] > Pavel Dovgalyuk <dovgaluk@ispras.ru> writes: > >> From: mttcg-request@listserver.greensocs.com [mailto:mttcg- > request@listserver.greensocs.com] > >> Pavel Dovgalyuk <dovgaluk@ispras.ru> writes: > >> > >> >> From: Alex Bennée [mailto:alex.bennee@linaro.org] > >> >> Pavel Dovgalyuk <dovgaluk@ispras.ru> writes: > >> >> >> From: Alex Bennée [mailto:alex.bennee@linaro.org] > >> >> >> Pavel Dovgalyuk <dovgaluk@ispras.ru> writes: > >> >> > > >> >> >> I ran the following test on both pre-mttcg-merge and my current HEAD > >> >> >> which includes Paolo's fix series: > >> >> >> > >> >> >> ./arm-softmmu/qemu-system-arm -machine type=virt \ > >> >> >> -display none -smp 1 -m 4096 -cpu cortex-a15 \ > >> >> >> -kernel ../images/aarch32-current-linux-initrd-guest.img > >> >> >> -append "console=ttyAMA0" -serial mon:stdio \ > >> >> >> -net none \ > >> >> >> -icount shift=7,rr=record,rrfile=replay.bin > >> >> >> > >> >> >> And then: > >> >> >> > >> >> >> ./arm-softmmu/qemu-system-arm -machine type=virt \ > >> >> >> -display none -smp 1 -m 4096 -cpu cortex-a15 \ > >> >> >> -kernel ../images/aarch32-current-linux-initrd-guest.img > >> >> >> -append "console=ttyAMA0" -serial mon:stdio \ > >> >> >> -net none \ > >> >> >> -icount shift=7,rr=replay,rrfile=replay.bin > >> >> >> > >> >> >> And they both ran the same. However I kept running into: > >> >> >> > >> >> >> [ 3.542408] RPC: Registered tcp NFSv4.1 backchannel transport module. > >> >> >> qemu-system-arm: Missing character write event in the replay log > >> >> >> > >> >> >> This seems to be a pre-existing > >> >> > > >> >> > Does it mean that qemu-arm platform includes some serial devices that were > >> >> > not detected by the replay? > >> >> > >> >> It's the standard ARM platform serial port. When I read replay.txt is > >> >> said: > >> >> > >> >> * Supports i386, x86_64, and ARM hardware platforms. > >> >> > >> >> Should we add some qualifications about which machine types are > >> >> supported? What is you ARM test case for record/replay? > >> > > >> > I tested on vexpress-a9 platform with Debian wheezy. > >> > >> Thanks for that. I now have a test case that I can reproduce failures on > >> without needing graphics. > >> > >> I've been investigating if there are any problems with the timer > >> processing now they have been moved into the TCG thread. The record > >> stage seems to work fine but I'm having difficulty figuring out why > >> playback freezes. It seems we get to a point where we are stuck waiting > >> for a suspiciously exact timer deadline: > > > > I've encountered the following behavior at replay stage: > > - replay takes some instructions to execute (qemu_icount += counter) > > - virtual timer is fired > > This is the virtual timer based on icount not the virtual rt timer? So Regular virtual timer. It's value is based on icount. virtual_rt is used for internal icount purposes. > under the new scheme of being processed in the vCPU loop we should only > fire this one once all execution is done (although you may exit the loop > early before icount is exhausted). We should stop the vCPU before processing virtual timers because: - virtual clock depends on instruction counter - virtual timers may change virtual hardware state > > - replay puts back unexecuted instructions (qemu_icount -= counter) > > > > But virtual timer cannot take in account non-executed instructions (counter) and > > therefore it reads only qemu_icount, which provides incorrect time. > > > >> But the timers are all enabled: > >> > >> (gdb) qemu timers > >> Processing Realtime timers > >> clock QEMU_CLOCK_REALTIME is enabled:true, last:-9223372036854775808 > >> Processing Virtual timers > >> clock QEMU_CLOCK_VIRTUAL is enabled:true, last:-9223372036854775808 > >> timer 34297350016/1 (cb:0x555555a2e952 <ptimer_tick>) > >> timer 503290000000/1000000 (cb:0x555555bd4d1d <ra_timer_handler>) > >> Processing Host timers > >> clock QEMU_CLOCK_HOST is enabled:true, last:1490191319270134000 > >> Processing Virtual RT timers > >> clock QEMU_CLOCK_VIRTUAL_RT is enabled:true, last:-9223372036854775808 > > > > Timers are processed only at checkpoints recorded in the log. > > When replay is stuck, probably there is a pending checkpoint in the log > > and pending instructions in CPU (because iothread breaks its > > synchronization). > > How does this work for instructions that sleep (like WFI)? The WFI will > cause us to exit the loop but I assume we need a "real" timer to trigger > a firing of whatever the WFI is waiting for as time according to icount > isn't advancing. This is virtual_rt for. It advances even when vCPU is stopped. > >> One area I wanted to look back at was comparing the record trace from > >> pre-mttcg-merge to now to see if any information was missing. However > > > > I usually use in_asm and exec logs and also add some logging at replay checkpoints. > > > >> the bin file has quite a lot of noise in it from changing fields so I > >> was wondering do you have any sort of dumper tool for comparing the > >> traces? If not is the format of the trace file specified anywhere? > > > > Usually you cannot compare two different record/replay logs, because there > > is no synchronization of the timers in different recording runs and > > therefore you'll get totally different logs. > > I wasn't looking to have identical traces but I was interested in the > patterns in the logs and how they have changed. For example: > > HEADER: version 0xe02005 | HEADER: version 0xe02006 > 1:EVENT_CP_INIT 1:EVENT_CP_INIT > 2:EVENT_CLOCK_HOST 0x14afc6687f1fc738 | 2:EVENT_CLOCK_HOST > 0x14b000644239c8d8 > 3:EVENT_CLOCK_HOST 0x14afc66c6ac91858 | 3:EVENT_CLOCK_HOST > 0x14b00064423a1310 > 4:EVENT_CP_RESET no additional data 4:EVENT_CP_RESET no additional > data > 5:EVENT_CP_CLOCK_WARP_START no additional data 5:EVENT_CP_CLOCK_WARP_START no > additional data > 6:EVENT_CP_CLOCK_VIRTUAL no additional data | 6:EVENT_CP_CLOCK_WARP_START no > additional data > 7:EVENT_CP_CLOCK_WARP_START no additional data | 7:EVENT_CP_CLOCK_VIRTUAL no > additional data > 8:EVENT_CP_CLOCK_VIRTUAL no additional data | 8:EVENT_CP_CLOCK_WARP_ACCOUNT > no additional data > 9:EVENT_CP_CLOCK_WARP_START no additional data | 9:EVENT_CP_CLOCK_WARP_ACCOUNT > no additional data > 10:EVENT_CP_CLOCK_VIRTUAL no additional data | 10:EVENT_INSTRUCTION 0x1000000 > 11:EVENT_CP_CLOCK_WARP_ACCOUNT no additional data 11:EVENT_CP_CLOCK_WARP_ACCOUNT > no additional data > 12:EVENT_CP_CLOCK_WARP_ACCOUNT no additional data | 12:EVENT_INSTRUCTION 0x1000000 > 13:EVENT_INSTRUCTION 0x1000000 | 13:EVENT_CP_CLOCK_WARP_ACCOUNT > no additional data > > There is a difference to the pattern at the start as the initial > checkpoints are layed down. I guess this is due to the interaction of > the main thread with the vCPU as they bounce the BQL with each other. Yes, this may be caused by changed interaction between the threads. > I'm wondering why we don't just do all the warp timer management in the > vCPU thread as well? Because vCPU may sleep and we'll have to setup some timer (virtual_rt as now) anyway. By the way, I made small description of the replay log format to be later included into docs/replay.txt: Record/replay log consits of the header and the sequence of execution events. The header includes 4-byte replay version id and 8-byte reserved field. Version is updated every time replay log format changes to prevent using replay log created by another build of qemu. The sequence of the events describes virtual machine state changes. It includes all non-deterministic inputs of VM, synchronization marks and instruction counts used to correctly inject inputs at replay. Synchronization marks (checkpoints) are used for synchronizing qemu threads that perform operations with virtual hardware. These operations may change system's state (e.g., change some register or generate interrupt) and therefore should execute synchronously with CPU thread. Every event in the log includes 1-byte event id and optional arguments. When argument is an array, it is stored as 4-byte array length and corresponding number of bytes with data. Here is the list of events that are written into the log: - EVENT_INSTRUCTION. Instructions executed since last event. Argument: 4-byte number of executed instructions. - EVENT_INTERRUPT. Used to synchronize interrupt processing. - EVENT_EXCEPTION. Used to synchronize exception handling. - EVENT_ASYNC. This is a group of events. They are always processed together with checkpoints. When such an event is generated, it is stored in the queue and processed only when checkpoint occurs. Every such event is followed by 1-byte checkpoint id and 1-byte async event id from the following list: - REPLAY_ASYNC_EVENT_BH. Bottom-half callback. This event synchronizes callbacks that affect virtual machine state, but normally called asyncronously. Argument: 8-byte operation id. - REPLAY_ASYNC_EVENT_INPUT. Input device event. Contains parameters of keyboard and mouse input operations (key press/release, mouse pointer movement). Arguments: 9-16 bytes depending of input event. - REPLAY_ASYNC_EVENT_INPUT_SYNC. Internal input synchronization event. - REPLAY_ASYNC_EVENT_CHAR_READ. Character (e.g., serial port) device input initiated by the sender. Arguments: 1-byte character device id. Array with bytes were read. - REPLAY_ASYNC_EVENT_BLOCK. Block device operation. Used to synchronize operations with disk and flash drives with CPU. Argument: 8-byte operation id. - REPLAY_ASYNC_EVENT_NET. Incoming network packet. Arguments: 1-byte network adapter id. 4-byte packet flags. Array with packet bytes. - EVENT_SHUTDOWN. Occurs when user sends shutdown event to qemu, e.g., by closing the window. - EVENT_CHAR_WRITE. Used to synchronize character output operations. Arguments: 4-byte output function return value. 4-byte offset in the output array. - EVENT_CHAR_READ_ALL. Used to synchronize character input operations, initiated by qemu. Argument: Array with bytes that were read. - EVENT_CHAR_READ_ALL_ERROR. Unsuccessful character input operation, initiated by qemu. Argument: 4-byte error code. - EVENT_CLOCK + clock_id. Group of events for host clock read operations. Argument: 8-byte clock value. - EVENT_CHECKPOINT + checkpoint_id. Checkpoint for synchronization of CPU, internal threads, and asynchronous input events. May be followed by one or more EVENT_ASYNC events. - EVENT_END. Last event in the log. Pavel Dovgalyuk
Pavel Dovgalyuk <dovgaluk@ispras.ru> writes: >> From: mttcg-request@listserver.greensocs.com [mailto:mttcg-request@listserver.greensocs.com] >> Pavel Dovgalyuk <dovgaluk@ispras.ru> writes: >> >> From: mttcg-request@listserver.greensocs.com [mailto:mttcg- >> request@listserver.greensocs.com] >> >> Pavel Dovgalyuk <dovgaluk@ispras.ru> writes: <snip> >> >> > I tested on vexpress-a9 platform with Debian wheezy. >> >> >> >> Thanks for that. I now have a test case that I can reproduce failures on >> >> without needing graphics. >> >> >> >> I've been investigating if there are any problems with the timer >> >> processing now they have been moved into the TCG thread. The record >> >> stage seems to work fine but I'm having difficulty figuring out why >> >> playback freezes. It seems we get to a point where we are stuck waiting >> >> for a suspiciously exact timer deadline: >> > >> > I've encountered the following behavior at replay stage: >> > - replay takes some instructions to execute (qemu_icount += counter) >> > - virtual timer is fired >> >> This is the virtual timer based on icount not the virtual rt timer? So > > Regular virtual timer. It's value is based on icount. > virtual_rt is used for internal icount purposes. And this is where the clock warps come in? The bias brings the virtual_rt time forward because execution is waiting for some external event to fire (e.g. a timer IRQ)? >> under the new scheme of being processed in the vCPU loop we should only >> fire this one once all execution is done (although you may exit the loop >> early before icount is exhausted). > > We should stop the vCPU before processing virtual timers because: > - virtual clock depends on instruction counter > - virtual timers may change virtual hardware state If we do the processing in the top of main vCPU loop this should be equivalent. The instruction counter cannot run because we haven't entered tcg_exec_cpu. We also process virtual timers in this thread outside the loop so nothing else can be poking the hardware state. >> > - replay puts back unexecuted instructions (qemu_icount -= counter) >> > >> > But virtual timer cannot take in account non-executed instructions (counter) and >> > therefore it reads only qemu_icount, which provides incorrect time. >> > >> >> But the timers are all enabled: >> >> >> >> (gdb) qemu timers >> >> Processing Realtime timers >> >> clock QEMU_CLOCK_REALTIME is enabled:true, last:-9223372036854775808 >> >> Processing Virtual timers >> >> clock QEMU_CLOCK_VIRTUAL is enabled:true, last:-9223372036854775808 >> >> timer 34297350016/1 (cb:0x555555a2e952 <ptimer_tick>) >> >> timer 503290000000/1000000 (cb:0x555555bd4d1d <ra_timer_handler>) >> >> Processing Host timers >> >> clock QEMU_CLOCK_HOST is enabled:true, last:1490191319270134000 >> >> Processing Virtual RT timers >> >> clock QEMU_CLOCK_VIRTUAL_RT is enabled:true, last:-9223372036854775808 >> > >> > Timers are processed only at checkpoints recorded in the log. >> > When replay is stuck, probably there is a pending checkpoint in the log >> > and pending instructions in CPU (because iothread breaks its >> > synchronization). >> >> How does this work for instructions that sleep (like WFI)? The WFI will >> cause us to exit the loop but I assume we need a "real" timer to trigger >> a firing of whatever the WFI is waiting for as time according to icount >> isn't advancing. > > This is virtual_rt for. It advances even when vCPU is stopped. So I'm not seeing this happen: process_icount_data: icount not increased 117302332749=>117302332749 RT:-1 I wonder if the advance of virtual_rt has been deadlocked? The -1 suggests its not even setup. >> >> One area I wanted to look back at was comparing the record trace from >> >> pre-mttcg-merge to now to see if any information was missing. However >> > >> > I usually use in_asm and exec logs and also add some logging at replay checkpoints. >> > >> >> the bin file has quite a lot of noise in it from changing fields so I >> >> was wondering do you have any sort of dumper tool for comparing the >> >> traces? If not is the format of the trace file specified anywhere? >> > >> > Usually you cannot compare two different record/replay logs, because there >> > is no synchronization of the timers in different recording runs and >> > therefore you'll get totally different logs. >> >> I wasn't looking to have identical traces but I was interested in the >> patterns in the logs and how they have changed. For example: >> >> HEADER: version 0xe02005 | HEADER: version 0xe02006 >> 1:EVENT_CP_INIT 1:EVENT_CP_INIT >> 2:EVENT_CLOCK_HOST 0x14afc6687f1fc738 | 2:EVENT_CLOCK_HOST >> 0x14b000644239c8d8 >> 3:EVENT_CLOCK_HOST 0x14afc66c6ac91858 | 3:EVENT_CLOCK_HOST >> 0x14b00064423a1310 >> 4:EVENT_CP_RESET no additional data 4:EVENT_CP_RESET no additional >> data >> 5:EVENT_CP_CLOCK_WARP_START no additional data 5:EVENT_CP_CLOCK_WARP_START no >> additional data >> 6:EVENT_CP_CLOCK_VIRTUAL no additional data | 6:EVENT_CP_CLOCK_WARP_START no >> additional data >> 7:EVENT_CP_CLOCK_WARP_START no additional data | 7:EVENT_CP_CLOCK_VIRTUAL no >> additional data >> 8:EVENT_CP_CLOCK_VIRTUAL no additional data | 8:EVENT_CP_CLOCK_WARP_ACCOUNT >> no additional data >> 9:EVENT_CP_CLOCK_WARP_START no additional data | 9:EVENT_CP_CLOCK_WARP_ACCOUNT >> no additional data >> 10:EVENT_CP_CLOCK_VIRTUAL no additional data | 10:EVENT_INSTRUCTION 0x1000000 >> 11:EVENT_CP_CLOCK_WARP_ACCOUNT no additional data 11:EVENT_CP_CLOCK_WARP_ACCOUNT >> no additional data >> 12:EVENT_CP_CLOCK_WARP_ACCOUNT no additional data | 12:EVENT_INSTRUCTION 0x1000000 >> 13:EVENT_INSTRUCTION 0x1000000 | 13:EVENT_CP_CLOCK_WARP_ACCOUNT >> no additional data >> >> There is a difference to the pattern at the start as the initial >> checkpoints are layed down. I guess this is due to the interaction of >> the main thread with the vCPU as they bounce the BQL with each other. > > Yes, this may be caused by changed interaction between the threads. > >> I'm wondering why we don't just do all the warp timer management in the >> vCPU thread as well? > > Because vCPU may sleep and we'll have to setup some timer (virtual_rt > as now) anyway. Right - because if we end up qemu_tcg_wait_io_event(cpu ? cpu : QTAILQ_FIRST(&cpus)); we need another thread to kick us out of the qemu_cond_wait(cpu->halt_cond, &qemu_global_mutex); > By the way, I made small description of the replay log format > to be later included into docs/replay.txt: > > Record/replay log consits of the header and the sequence of execution > events. The header includes 4-byte replay version id and 8-byte reserved > field. Version is updated every time replay log format changes to prevent > using replay log created by another build of qemu. > > The sequence of the events describes virtual machine state changes. > It includes all non-deterministic inputs of VM, synchronization marks and > instruction counts used to correctly inject inputs at replay. > > Synchronization marks (checkpoints) are used for synchronizing qemu threads > that perform operations with virtual hardware. These operations may change > system's state (e.g., change some register or generate interrupt) and > therefore should execute synchronously with CPU thread. > > Every event in the log includes 1-byte event id and optional arguments. > When argument is an array, it is stored as 4-byte array length > and corresponding number of bytes with data. > Here is the list of events that are written into the log: > > - EVENT_INSTRUCTION. Instructions executed since last event. > Argument: 4-byte number of executed instructions. > - EVENT_INTERRUPT. Used to synchronize interrupt processing. > - EVENT_EXCEPTION. Used to synchronize exception handling. > - EVENT_ASYNC. This is a group of events. They are always processed > together with checkpoints. When such an event is generated, it is > stored in the queue and processed only when checkpoint occurs. > Every such event is followed by 1-byte checkpoint id and 1-byte > async event id from the following list: > - REPLAY_ASYNC_EVENT_BH. Bottom-half callback. This event synchronizes > callbacks that affect virtual machine state, but normally called > asyncronously. > Argument: 8-byte operation id. > - REPLAY_ASYNC_EVENT_INPUT. Input device event. Contains > parameters of keyboard and mouse input operations > (key press/release, mouse pointer movement). > Arguments: 9-16 bytes depending of input event. > - REPLAY_ASYNC_EVENT_INPUT_SYNC. Internal input synchronization event. > - REPLAY_ASYNC_EVENT_CHAR_READ. Character (e.g., serial port) device input > initiated by the sender. > Arguments: 1-byte character device id. > Array with bytes were read. > - REPLAY_ASYNC_EVENT_BLOCK. Block device operation. Used to synchronize > operations with disk and flash drives with CPU. > Argument: 8-byte operation id. > - REPLAY_ASYNC_EVENT_NET. Incoming network packet. > Arguments: 1-byte network adapter id. > 4-byte packet flags. > Array with packet bytes. > - EVENT_SHUTDOWN. Occurs when user sends shutdown event to qemu, > e.g., by closing the window. > - EVENT_CHAR_WRITE. Used to synchronize character output operations. > Arguments: 4-byte output function return value. > 4-byte offset in the output array. > - EVENT_CHAR_READ_ALL. Used to synchronize character input operations, > initiated by qemu. > Argument: Array with bytes that were read. > - EVENT_CHAR_READ_ALL_ERROR. Unsuccessful character input operation, > initiated by qemu. > Argument: 4-byte error code. > - EVENT_CLOCK + clock_id. Group of events for host clock read operations. > Argument: 8-byte clock value. > - EVENT_CHECKPOINT + checkpoint_id. Checkpoint for synchronization of > CPU, internal threads, and asynchronous input events. May be followed > by one or more EVENT_ASYNC events. > - EVENT_END. Last event in the log. Thanks. That's really useful. Do you think it would be worth including the dumper script I wrote in ./scripts/? I mainly wrote it to aid my understanding of the code flow but it might be useful in future investigations? -- Alex Bennée
> From: mttcg-request@listserver.greensocs.com [mailto:mttcg-request@listserver.greensocs.com] > Pavel Dovgalyuk <dovgaluk@ispras.ru> writes: > >> From: mttcg-request@listserver.greensocs.com [mailto:mttcg- > request@listserver.greensocs.com] > >> Pavel Dovgalyuk <dovgaluk@ispras.ru> writes: > >> >> From: mttcg-request@listserver.greensocs.com [mailto:mttcg- > >> request@listserver.greensocs.com] > >> >> Pavel Dovgalyuk <dovgaluk@ispras.ru> writes: > <snip> > >> >> > I tested on vexpress-a9 platform with Debian wheezy. > >> >> > >> >> Thanks for that. I now have a test case that I can reproduce failures on > >> >> without needing graphics. > >> >> > >> >> I've been investigating if there are any problems with the timer > >> >> processing now they have been moved into the TCG thread. The record > >> >> stage seems to work fine but I'm having difficulty figuring out why > >> >> playback freezes. It seems we get to a point where we are stuck waiting > >> >> for a suspiciously exact timer deadline: > >> > > >> > I've encountered the following behavior at replay stage: > >> > - replay takes some instructions to execute (qemu_icount += counter) > >> > - virtual timer is fired > >> > >> This is the virtual timer based on icount not the virtual rt timer? So > > > > Regular virtual timer. It's value is based on icount. > > virtual_rt is used for internal icount purposes. > > And this is where the clock warps come in? The bias brings the > virtual_rt time forward because execution is waiting for some external > event to fire (e.g. a timer IRQ)? I guess so. But bias is not updated when the vCPU works. vCPU thread updates only qemu_icount which is used for virtual clock calculation. > >> under the new scheme of being processed in the vCPU loop we should only > >> fire this one once all execution is done (although you may exit the loop > >> early before icount is exhausted). > > > > We should stop the vCPU before processing virtual timers because: > > - virtual clock depends on instruction counter > > - virtual timers may change virtual hardware state > > If we do the processing in the top of main vCPU loop this should be > equivalent. The instruction counter cannot run because we haven't > entered tcg_exec_cpu. We also process virtual timers in this thread > outside the loop so nothing else can be poking the hardware state. This is how qemu worked in older version - it switched between processing tasks (vCPU and timers) in one thread. But how we'll join this behavior with the current design and MTTCG? > > >> > - replay puts back unexecuted instructions (qemu_icount -= counter) > >> > > >> > But virtual timer cannot take in account non-executed instructions (counter) and > >> > therefore it reads only qemu_icount, which provides incorrect time. > >> > > >> >> But the timers are all enabled: > >> >> > >> >> (gdb) qemu timers > >> >> Processing Realtime timers > >> >> clock QEMU_CLOCK_REALTIME is enabled:true, last:-9223372036854775808 > >> >> Processing Virtual timers > >> >> clock QEMU_CLOCK_VIRTUAL is enabled:true, last:-9223372036854775808 > >> >> timer 34297350016/1 (cb:0x555555a2e952 <ptimer_tick>) > >> >> timer 503290000000/1000000 (cb:0x555555bd4d1d <ra_timer_handler>) > >> >> Processing Host timers > >> >> clock QEMU_CLOCK_HOST is enabled:true, last:1490191319270134000 > >> >> Processing Virtual RT timers > >> >> clock QEMU_CLOCK_VIRTUAL_RT is enabled:true, last:-9223372036854775808 > >> > > >> > Timers are processed only at checkpoints recorded in the log. > >> > When replay is stuck, probably there is a pending checkpoint in the log > >> > and pending instructions in CPU (because iothread breaks its > >> > synchronization). > >> > >> How does this work for instructions that sleep (like WFI)? The WFI will > >> cause us to exit the loop but I assume we need a "real" timer to trigger > >> a firing of whatever the WFI is waiting for as time according to icount > >> isn't advancing. > > > > This is virtual_rt for. It advances even when vCPU is stopped. > > So I'm not seeing this happen: > > process_icount_data: icount not increased 117302332749=>117302332749 RT:-1 > > I wonder if the advance of virtual_rt has been deadlocked? The -1 > suggests its not even setup. qemu_start_warp_timer says that it is setup only when there are active virtual timers. > > By the way, I made small description of the replay log format > > to be later included into docs/replay.txt: > > > Thanks. That's really useful. > > Do you think it would be worth including the dumper script I wrote in > ./scripts/? I mainly wrote it to aid my understanding of the code flow > but it might be useful in future investigations? Usually I get human-readable logs in record and replay modes to find divergence between them, but I think your script might be useful in some cases. Pavel Dovgalyuk
On 31/03/2017 11:16, Pavel Dovgalyuk wrote: > > Do you think it would be worth including the dumper script I wrote in > ./scripts/? I mainly wrote it to aid my understanding of the code flow > but it might be useful in future investigations? Yes, I think it is. Alternatively, adding "-d rr" that dumps what is recorded/replayed from the rr file would be useful. Paolo
Pavel Dovgalyuk <dovgaluk@ispras.ru> writes: >> From: mttcg-request@listserver.greensocs.com [mailto:mttcg-request@listserver.greensocs.com] >> Pavel Dovgalyuk <dovgaluk@ispras.ru> writes: >> >> From: mttcg-request@listserver.greensocs.com [mailto:mttcg- >> request@listserver.greensocs.com] >> >> Pavel Dovgalyuk <dovgaluk@ispras.ru> writes: >> >> >> From: mttcg-request@listserver.greensocs.com [mailto:mttcg- >> >> request@listserver.greensocs.com] >> >> >> Pavel Dovgalyuk <dovgaluk@ispras.ru> writes: >> <snip> >> >> >> > I tested on vexpress-a9 platform with Debian wheezy. >> >> >> >> >> >> Thanks for that. I now have a test case that I can reproduce failures on >> >> >> without needing graphics. >> >> >> >> >> >> I've been investigating if there are any problems with the timer >> >> >> processing now they have been moved into the TCG thread. The record >> >> >> stage seems to work fine but I'm having difficulty figuring out why >> >> >> playback freezes. It seems we get to a point where we are stuck waiting >> >> >> for a suspiciously exact timer deadline: >> >> > >> >> > I've encountered the following behavior at replay stage: >> >> > - replay takes some instructions to execute (qemu_icount += counter) >> >> > - virtual timer is fired >> >> >> >> This is the virtual timer based on icount not the virtual rt timer? So >> > >> > Regular virtual timer. It's value is based on icount. >> > virtual_rt is used for internal icount purposes. >> >> And this is where the clock warps come in? The bias brings the >> virtual_rt time forward because execution is waiting for some external >> event to fire (e.g. a timer IRQ)? > > I guess so. But bias is not updated when the vCPU works. > vCPU thread updates only qemu_icount which is used for virtual clock calculation. > >> >> under the new scheme of being processed in the vCPU loop we should only >> >> fire this one once all execution is done (although you may exit the loop >> >> early before icount is exhausted). >> > >> > We should stop the vCPU before processing virtual timers because: >> > - virtual clock depends on instruction counter >> > - virtual timers may change virtual hardware state >> >> If we do the processing in the top of main vCPU loop this should be >> equivalent. The instruction counter cannot run because we haven't >> entered tcg_exec_cpu. We also process virtual timers in this thread >> outside the loop so nothing else can be poking the hardware state. > > This is how qemu worked in older version - it switched between > processing tasks (vCPU and timers) in one thread. The only real difference is the sequencing in the old case was protected by the BQL - now its my program order. > But how we'll join this behavior with the current design and MTTCG? Ignore MTTCG for now. We don't even try and run multiple-threads when icount is enabled. However the change in the BQL locking is what has triggered the failures. Anyway I think I'm getting closer to narrowing it down. On record I see replay_current_step jump backwards with this: /* A common event print, called when reading or saving an event */ static void print_event(uint8_t event) { static int event_count; static uint64_t last_step; uint64_t this_step = replay_get_current_step(); fprintf(stderr, "replay: event %d is %d @ step=%#" PRIx64 "\n", event_count, event, this_step); if (this_step < last_step) { fprintf(stderr,"%s: !!! step %d went backwards %#"PRIx64"=>%#"PRIx64"!!!\n", __func__, event_count, last_step, this_step); abort(); } last_step = this_step; event_count++; } void replay_put_event(uint8_t event) { assert(event < EVENT_COUNT); replay_put_byte(event); print_event(event); } The jump back is fairly consistent in my case (event 66 in the vexpress run) but I'm fairly sure that is the bug. I can't see any reason for the step count to go backwards - indeed that breaks the premise of . Unfortunately when I start putting break and watchpoints in to see how this jump back occurs the problem goes away until I disable them. So this very much looks like a race condition corrupting the icount data. >> >> > - replay puts back unexecuted instructions (qemu_icount -= counter) >> >> > >> >> > But virtual timer cannot take in account non-executed instructions (counter) and >> >> > therefore it reads only qemu_icount, which provides incorrect time. >> >> > >> >> >> But the timers are all enabled: >> >> >> >> >> >> (gdb) qemu timers >> >> >> Processing Realtime timers >> >> >> clock QEMU_CLOCK_REALTIME is enabled:true, last:-9223372036854775808 >> >> >> Processing Virtual timers >> >> >> clock QEMU_CLOCK_VIRTUAL is enabled:true, last:-9223372036854775808 >> >> >> timer 34297350016/1 (cb:0x555555a2e952 <ptimer_tick>) >> >> >> timer 503290000000/1000000 (cb:0x555555bd4d1d <ra_timer_handler>) >> >> >> Processing Host timers >> >> >> clock QEMU_CLOCK_HOST is enabled:true, last:1490191319270134000 >> >> >> Processing Virtual RT timers >> >> >> clock QEMU_CLOCK_VIRTUAL_RT is enabled:true, last:-9223372036854775808 >> >> > >> >> > Timers are processed only at checkpoints recorded in the log. >> >> > When replay is stuck, probably there is a pending checkpoint in the log >> >> > and pending instructions in CPU (because iothread breaks its >> >> > synchronization). >> >> >> >> How does this work for instructions that sleep (like WFI)? The WFI will >> >> cause us to exit the loop but I assume we need a "real" timer to trigger >> >> a firing of whatever the WFI is waiting for as time according to icount >> >> isn't advancing. >> > >> > This is virtual_rt for. It advances even when vCPU is stopped. >> >> So I'm not seeing this happen: >> >> process_icount_data: icount not increased 117302332749=>117302332749 RT:-1 >> >> I wonder if the advance of virtual_rt has been deadlocked? The -1 >> suggests its not even setup. > > qemu_start_warp_timer says that it is setup only when there are active virtual timers. > > >> > By the way, I made small description of the replay log format >> > to be later included into docs/replay.txt: >> > >> Thanks. That's really useful. >> >> Do you think it would be worth including the dumper script I wrote in >> ./scripts/? I mainly wrote it to aid my understanding of the code flow >> but it might be useful in future investigations? > > Usually I get human-readable logs in record and replay modes to find > divergence between them, but I think your script might be useful in some cases. Cool. I'll include it in the icount series once I have something to post (fingers crossed I'm getting close). -- Alex Bennée
On 31/03/2017 13:21, Alex Bennée wrote: > Anyway I think I'm getting closer to narrowing it down. On record I see > replay_current_step jump backwards with this: > > /* A common event print, called when reading or saving an event */ > static void print_event(uint8_t event) > { > static int event_count; > static uint64_t last_step; > uint64_t this_step = replay_get_current_step(); > > fprintf(stderr, "replay: event %d is %d @ step=%#" PRIx64 "\n", > event_count, event, this_step); > > if (this_step < last_step) { > fprintf(stderr,"%s: !!! step %d went backwards %#"PRIx64"=>%#"PRIx64"!!!\n", > __func__, event_count, last_step, this_step); > abort(); > } > last_step = this_step; > event_count++; > } > > void replay_put_event(uint8_t event) > { > assert(event < EVENT_COUNT); > replay_put_byte(event); > print_event(event); > } > > The jump back is fairly consistent in my case (event 66 in the vexpress > run) but I'm fairly sure that is the bug. I can't see any reason for the > step count to go backwards - indeed that breaks the premise of . Good catch! I suspect it's the "else" case in cpu_get_icount_raw: icount = timers_state.qemu_icount; if (cpu) { if (!cpu->can_do_io) { fprintf(stderr, "Bad icount read\n"); exit(1); } icount -= (cpu->icount_decr.u16.low + cpu->icount_extra); } Between timers_state.qemu_icount += count; and timers_state.qemu_icount -= (cpu->icount_decr.u16.low + cpu->icount_extra); the I/O thread can read a value that is later rolled back. I think you need to do this the other way round: add something to icount in cpu_get_icount_raw rather than taking it off: icount = timers_state.qemu_icount; if (cpu) { if (!cpu->can_do_io) { fprintf(stderr, "Bad icount read\n"); exit(1); } icount += cpu->pending_icount - (cpu->icount_decr.u16.low + cpu->icount_extra); } where cpu->pending_icount is set before cpu_exec, and folded into timers_state.qemu_icount afterwards. Also, here: if (use_icount) { int64_t count; int decr; timers_state.qemu_icount -= (cpu->icount_decr.u16.low + cpu->icount_extra); cpu->icount_decr.u16.low = 0; cpu->icount_extra = 0; this should be dead code because tcg_cpu_exec also clears the two decrementer fields. So if you can replace the three assignments with assertions on cpu->icount_decr.u16.low and cpu->icount_extra, that would also simplify the code and remove race opportunities. Paolo >>>>>> - replay puts back unexecuted instructions (qemu_icount -= counter) >>>>>> >>>>>> But virtual timer cannot take in account non-executed instructions (counter) and >>>>>> therefore it reads only qemu_icount, which provides incorrect time. >>>>>> >>>>>>> But the timers are all enabled: >>>>>>> >>>>>>> (gdb) qemu timers >>>>>>> Processing Realtime timers >>>>>>> clock QEMU_CLOCK_REALTIME is enabled:true, last:-9223372036854775808 >>>>>>> Processing Virtual timers >>>>>>> clock QEMU_CLOCK_VIRTUAL is enabled:true, last:-9223372036854775808 >>>>>>> timer 34297350016/1 (cb:0x555555a2e952 <ptimer_tick>) >>>>>>> timer 503290000000/1000000 (cb:0x555555bd4d1d <ra_timer_handler>) >>>>>>> Processing Host timers >>>>>>> clock QEMU_CLOCK_HOST is enabled:true, last:1490191319270134000 >>>>>>> Processing Virtual RT timers >>>>>>> clock QEMU_CLOCK_VIRTUAL_RT is enabled:true, last:-9223372036854775808 >>>>>> >>>>>> Timers are processed only at checkpoints recorded in the log. >>>>>> When replay is stuck, probably there is a pending checkpoint in the log >>>>>> and pending instructions in CPU (because iothread breaks its >>>>>> synchronization). >>>>> >>>>> How does this work for instructions that sleep (like WFI)? The WFI will >>>>> cause us to exit the loop but I assume we need a "real" timer to trigger >>>>> a firing of whatever the WFI is waiting for as time according to icount >>>>> isn't advancing. >>>> >>>> This is virtual_rt for. It advances even when vCPU is stopped. >>> >>> So I'm not seeing this happen: >>> >>> process_icount_data: icount not increased 117302332749=>117302332749 RT:-1 >>> >>> I wonder if the advance of virtual_rt has been deadlocked? The -1 >>> suggests its not even setup. >> >> qemu_start_warp_timer says that it is setup only when there are active virtual timers. >> >> >>>> By the way, I made small description of the replay log format >>>> to be later included into docs/replay.txt: >>>> >>> Thanks. That's really useful. >>> >>> Do you think it would be worth including the dumper script I wrote in >>> ./scripts/? I mainly wrote it to aid my understanding of the code flow >>> but it might be useful in future investigations? >> >> Usually I get human-readable logs in record and replay modes to find >> divergence between them, but I think your script might be useful in some cases. > > Cool. I'll include it in the icount series once I have something to post > (fingers crossed I'm getting close). > > -- > Alex Bennée >
Alex Bennée <alex.bennee@linaro.org> writes: > Pavel Dovgalyuk <dovgaluk@ispras.ru> writes: > >>> From: mttcg-request@listserver.greensocs.com [mailto:mttcg-request@listserver.greensocs.com] >>> Pavel Dovgalyuk <dovgaluk@ispras.ru> writes: >>> >> From: mttcg-request@listserver.greensocs.com [mailto:mttcg- >>> request@listserver.greensocs.com] >>> >> Pavel Dovgalyuk <dovgaluk@ispras.ru> writes: >>> >> >> From: mttcg-request@listserver.greensocs.com [mailto:mttcg- >>> >> request@listserver.greensocs.com] >>> >> >> Pavel Dovgalyuk <dovgaluk@ispras.ru> writes: >>> <snip> >>> >> >> > I tested on vexpress-a9 platform with Debian wheezy. >>> >> >> >>> >> >> Thanks for that. I now have a test case that I can reproduce failures on >>> >> >> without needing graphics. >>> >> >> >>> >> >> I've been investigating if there are any problems with the timer >>> >> >> processing now they have been moved into the TCG thread. The record >>> >> >> stage seems to work fine but I'm having difficulty figuring out why >>> >> >> playback freezes. It seems we get to a point where we are stuck waiting >>> >> >> for a suspiciously exact timer deadline: >>> >> > >>> >> > I've encountered the following behavior at replay stage: >>> >> > - replay takes some instructions to execute (qemu_icount += counter) >>> >> > - virtual timer is fired >>> >> >>> >> This is the virtual timer based on icount not the virtual rt timer? So >>> > >>> > Regular virtual timer. It's value is based on icount. >>> > virtual_rt is used for internal icount purposes. >>> >>> And this is where the clock warps come in? The bias brings the >>> virtual_rt time forward because execution is waiting for some external >>> event to fire (e.g. a timer IRQ)? >> >> I guess so. But bias is not updated when the vCPU works. >> vCPU thread updates only qemu_icount which is used for virtual clock calculation. >> >>> >> under the new scheme of being processed in the vCPU loop we should only >>> >> fire this one once all execution is done (although you may exit the loop >>> >> early before icount is exhausted). >>> > >>> > We should stop the vCPU before processing virtual timers because: >>> > - virtual clock depends on instruction counter >>> > - virtual timers may change virtual hardware state >>> >>> If we do the processing in the top of main vCPU loop this should be >>> equivalent. The instruction counter cannot run because we haven't >>> entered tcg_exec_cpu. We also process virtual timers in this thread >>> outside the loop so nothing else can be poking the hardware state. >> >> This is how qemu worked in older version - it switched between >> processing tasks (vCPU and timers) in one thread. > > The only real difference is the sequencing in the old case was protected > by the BQL - now its my program order. > >> But how we'll join this behavior with the current design and MTTCG? > > Ignore MTTCG for now. We don't even try and run multiple-threads when > icount is enabled. However the change in the BQL locking is what has > triggered the failures. > > Anyway I think I'm getting closer to narrowing it down. On record I see > replay_current_step jump backwards with this: > > /* A common event print, called when reading or saving an event */ > static void print_event(uint8_t event) > { > static int event_count; > static uint64_t last_step; > uint64_t this_step = replay_get_current_step(); > > fprintf(stderr, "replay: event %d is %d @ step=%#" PRIx64 "\n", > event_count, event, this_step); > > if (this_step < last_step) { > fprintf(stderr,"%s: !!! step %d went backwards %#"PRIx64"=>%#"PRIx64"!!!\n", > __func__, event_count, last_step, this_step); > abort(); > } > last_step = this_step; > event_count++; > } > > void replay_put_event(uint8_t event) > { > assert(event < EVENT_COUNT); > replay_put_byte(event); > print_event(event); > } > > The jump back is fairly consistent in my case (event 66 in the vexpress > run) but I'm fairly sure that is the bug. I can't see any reason for the > step count to go backwards - indeed that breaks the premise of . > > Unfortunately when I start putting break and watchpoints in to see how > this jump back occurs the problem goes away until I disable them. So > this very much looks like a race condition corrupting the icount data. So this is a symptom of cpu_get_icount_raw(void) only accounting for in progress instructions when in the vCPU context and the fact timers_state.qemu_icount is "in credit" while the vCPU is running. -- Alex Bennée
Paolo Bonzini <pbonzini@redhat.com> writes: > On 31/03/2017 13:21, Alex Bennée wrote: >> Anyway I think I'm getting closer to narrowing it down. On record I see >> replay_current_step jump backwards with this: >> >> /* A common event print, called when reading or saving an event */ >> static void print_event(uint8_t event) >> { >> static int event_count; >> static uint64_t last_step; >> uint64_t this_step = replay_get_current_step(); >> >> fprintf(stderr, "replay: event %d is %d @ step=%#" PRIx64 "\n", >> event_count, event, this_step); >> >> if (this_step < last_step) { >> fprintf(stderr,"%s: !!! step %d went backwards %#"PRIx64"=>%#"PRIx64"!!!\n", >> __func__, event_count, last_step, this_step); >> abort(); >> } >> last_step = this_step; >> event_count++; >> } >> >> void replay_put_event(uint8_t event) >> { >> assert(event < EVENT_COUNT); >> replay_put_byte(event); >> print_event(event); >> } >> >> The jump back is fairly consistent in my case (event 66 in the vexpress >> run) but I'm fairly sure that is the bug. I can't see any reason for the >> step count to go backwards - indeed that breaks the premise of . > > Good catch! I suspect it's the "else" case in cpu_get_icount_raw: > > icount = timers_state.qemu_icount; > if (cpu) { > if (!cpu->can_do_io) { > fprintf(stderr, "Bad icount read\n"); > exit(1); > } > icount -= (cpu->icount_decr.u16.low + cpu->icount_extra); > } > > Between > > timers_state.qemu_icount += count; > > and > > timers_state.qemu_icount -= (cpu->icount_decr.u16.low > + cpu->icount_extra); > > the I/O thread can read a value that is later rolled back. I think you > need to do this the other way round: add something to icount in > cpu_get_icount_raw rather than taking it off: > > icount = timers_state.qemu_icount; > if (cpu) { > if (!cpu->can_do_io) { > fprintf(stderr, "Bad icount read\n"); > exit(1); > } > icount += cpu->pending_icount > - (cpu->icount_decr.u16.low + cpu->icount_extra); > } > > where cpu->pending_icount is set before cpu_exec, and folded into > timers_state.qemu_icount afterwards. > > Also, here: > > if (use_icount) { > int64_t count; > int decr; > timers_state.qemu_icount -= (cpu->icount_decr.u16.low > + cpu->icount_extra); > cpu->icount_decr.u16.low = 0; > cpu->icount_extra = 0; > > this should be dead code because tcg_cpu_exec also clears the two > decrementer fields. So if you can replace the three assignments with > assertions on cpu->icount_decr.u16.low and cpu->icount_extra, that would > also simplify the code and remove race opportunities. I'll have a look at that on Monday. I wrote this before I saw your email: https://github.com/stsquad/qemu/tree/mttcg/debug-record-replay-v1 It fixes the race so time only ever goes forward. Combined with the following patches it also makes the record/replay streams tolerant of "missing the boat". With this I can do a initrd run of the vexpress kernel in both record and playback. I'm not sure it counts as deterministic though - the vCPU and main-loop thread seem to figure it out as the go along but I suspect if we really want to be sure we push the replay_lock() further up. This would ensure all related events from whichever thread are pushed together. This is very much a come back on Monday and see if it still seems like a good idea in the cold light of morning ;-) -- Alex Bennée