Message ID | 1459160189-240780-1-git-send-email-wangnan0@huawei.com |
---|---|
State | New |
Thanks for this patch, Wangnan.

Vince, do you have any comments?

Cheers,

Michael

On 03/28/2016 12:16 PM, Wang Nan wrote:
> Signed-off-by: Wang Nan <wangnan0@huawei.com>
> ---
>  man2/perf_event_open.2 | 57 ++++++++++++++++++++++++++++++++++++++++++++++++--
>  1 file changed, 55 insertions(+), 2 deletions(-)
>
> diff --git a/man2/perf_event_open.2 b/man2/perf_event_open.2
> index b232cba..942a410 100644
> --- a/man2/perf_event_open.2
> +++ b/man2/perf_event_open.2
> @@ -234,8 +234,10 @@ struct perf_event_attr {
>            mmap2          : 1,  /* include mmap with inode data */
>            comm_exec      : 1,  /* flag comm events that are due to exec */
>            use_clockid    : 1,  /* use clockid for time fields */
> +          context_switch : 1,  /* context switch data */
> +          write_backward : 1,  /* Write ring buffer from end to beginning */
>
> -          __reserved_1   : 38;
> +          __reserved_1   : 36;
>
>      union {
>          __u32 wakeup_events;    /* wakeup every n events */
> @@ -1105,6 +1107,30 @@ field.
>  This can make it easier to correlate perf sample times with
>  timestamps generated by other tools.
>  .TP
> +.IR "write_backward" " (since Linux 4.6)"
> +.\" commit ? (http://lkml.kernel.org/g/1459147292-239310-5-git-send-email-wangnan0@huawei.com)
> +This makes the resuling event use a backward ring-buffer, which
> +writes samples from the end of the ring-buffer.
> +
> +It is not allowed to connect events with backward and forward
> +ring-buffer settings together using
> +.B PERF_EVENT_IOC_SET_OUTPUT.
> +
> +Backward ring-buffer is useful when the ring-buffer is overwritable
> +(created by readonly
> +.BR mmap (2)
> +). In this case,
> +.IR data_tail
> +is useless,
> +.IR data_head
> +points to the head of the most recent sample in a backward
> +ring-buffer. It is easy to iterate over the whole ring-buffer by reading
> +samples one by one because size of a sample can be found from decoding
> +its header. In contract, in a forward overwritable ring-buffer, the only
> +information is the end of the most recent sample which is pointed by
> +.IR data_head,
> +but the size of a sample can't be determined from the end of it.
> +.TP
>  .IR "wakeup_events" ", " "wakeup_watermark"
>  This union sets how many samples
>  .RI ( wakeup_events )
> @@ -1634,7 +1660,9 @@ And vice versa:
>  .TP
>  .I data_head
>  This points to the head of the data section.
> -The value continuously increases, it does not wrap.
> +The value continuously increases (or decrease if
> +.IR write_backward
> +is set), it does not wrap.
>  The value needs to be manually wrapped by the size of the mmap buffer
>  before accessing the samples.
>
> @@ -2581,6 +2609,24 @@ Starting with Linux 3.18,
>  .B POLL_HUP
>  is indicated if the event being monitored is attached to a different
>  process and that process exits.
> +.SS Reading from overwritable ring-buffer
> +Reader is unable to update
> +.IR data_tail
> +if the mapping is not
> +.BR PROT_WRITE .
> +In this case, kernel will overwrite data without considering whether
> +they are read or not, so ring-buffer is overwritable and
> +behaves like a flight recorder. To read from an overwritable
> +ring-buffer, setting
> +.IR write_backward
> +is suggested, or it would be hard to find a proper position to start
> +decoding. In addition, ring-buffer should be paused before reading
> +through
> +.BR ioctl (2)
> +with
> +.B PERF_EVENT_IOC_PAUSE_OUTPUT
> +to avoid racing between kernel and reader. Ring-buffer should be resumed
> +after finish reading.
>  .SS rdpmc instruction
>  Starting with Linux 3.4 on x86, you can use the
>  .\" commit c7206205d00ab375839bd6c7ddb247d600693c09
> @@ -2693,6 +2739,13 @@ The file descriptors must all be on the same CPU.
>
>  The argument specifies the desired file descriptor, or \-1 if
>  output should be ignored.
> +
> +Two events with different
> +.IR write_backward
> +settings are not allowed to be connected together using
> +.B PERF_EVENT_IOC_SET_OUTPUT.
> +.B EINVAL
> +is returned in this case.
>  .TP
>  .BR PERF_EVENT_IOC_SET_FILTER " (since Linux 2.6.33)"
>  .\" commit 6fb2915df7f0747d9044da9dbff5b46dc2e20830
>

--
Michael Kerrisk
Linux man-pages maintainer; http://www.kernel.org/doc/man-pages/
Linux/UNIX System Programming Training: http://man7.org/training/
diff --git a/man2/perf_event_open.2 b/man2/perf_event_open.2
index b232cba..942a410 100644
--- a/man2/perf_event_open.2
+++ b/man2/perf_event_open.2
@@ -234,8 +234,10 @@ struct perf_event_attr {
           mmap2          : 1,  /* include mmap with inode data */
           comm_exec      : 1,  /* flag comm events that are due to exec */
           use_clockid    : 1,  /* use clockid for time fields */
+          context_switch : 1,  /* context switch data */
+          write_backward : 1,  /* Write ring buffer from end to beginning */
 
-          __reserved_1   : 38;
+          __reserved_1   : 36;
 
     union {
         __u32 wakeup_events;    /* wakeup every n events */
@@ -1105,6 +1107,30 @@ field.
 This can make it easier to correlate perf sample times with
 timestamps generated by other tools.
 .TP
+.IR "write_backward" " (since Linux 4.6)"
+.\" commit ? (http://lkml.kernel.org/g/1459147292-239310-5-git-send-email-wangnan0@huawei.com)
+This makes the resuling event use a backward ring-buffer, which
+writes samples from the end of the ring-buffer.
+
+It is not allowed to connect events with backward and forward
+ring-buffer settings together using
+.B PERF_EVENT_IOC_SET_OUTPUT.
+
+Backward ring-buffer is useful when the ring-buffer is overwritable
+(created by readonly
+.BR mmap (2)
+). In this case,
+.IR data_tail
+is useless,
+.IR data_head
+points to the head of the most recent sample in a backward
+ring-buffer. It is easy to iterate over the whole ring-buffer by reading
+samples one by one because size of a sample can be found from decoding
+its header. In contract, in a forward overwritable ring-buffer, the only
+information is the end of the most recent sample which is pointed by
+.IR data_head,
+but the size of a sample can't be determined from the end of it.
+.TP
 .IR "wakeup_events" ", " "wakeup_watermark"
 This union sets how many samples
 .RI ( wakeup_events )
@@ -1634,7 +1660,9 @@ And vice versa:
 .TP
 .I data_head
 This points to the head of the data section.
-The value continuously increases, it does not wrap.
+The value continuously increases (or decrease if
+.IR write_backward
+is set), it does not wrap.
 The value needs to be manually wrapped by the size of the mmap buffer
 before accessing the samples.
 
@@ -2581,6 +2609,24 @@ Starting with Linux 3.18,
 .B POLL_HUP
 is indicated if the event being monitored is attached to a different
 process and that process exits.
+.SS Reading from overwritable ring-buffer
+Reader is unable to update
+.IR data_tail
+if the mapping is not
+.BR PROT_WRITE .
+In this case, kernel will overwrite data without considering whether
+they are read or not, so ring-buffer is overwritable and
+behaves like a flight recorder. To read from an overwritable
+ring-buffer, setting
+.IR write_backward
+is suggested, or it would be hard to find a proper position to start
+decoding. In addition, ring-buffer should be paused before reading
+through
+.BR ioctl (2)
+with
+.B PERF_EVENT_IOC_PAUSE_OUTPUT
+to avoid racing between kernel and reader. Ring-buffer should be resumed
+after finish reading.
 .SS rdpmc instruction
 Starting with Linux 3.4 on x86, you can use the
 .\" commit c7206205d00ab375839bd6c7ddb247d600693c09
@@ -2693,6 +2739,13 @@ The file descriptors must all be on the same CPU.
 
 The argument specifies the desired file descriptor, or \-1 if
 output should be ignored.
+
+Two events with different
+.IR write_backward
+settings are not allowed to be connected together using
+.B PERF_EVENT_IOC_SET_OUTPUT.
+.B EINVAL
+is returned in this case.
 .TP
 .BR PERF_EVENT_IOC_SET_FILTER " (since Linux 2.6.33)"
 .\" commit 6fb2915df7f0747d9044da9dbff5b46dc2e20830
Signed-off-by: Wang Nan <wangnan0@huawei.com>
---
 man2/perf_event_open.2 | 57 ++++++++++++++++++++++++++++++++++++++++++++++++--
 1 file changed, 55 insertions(+), 2 deletions(-)

--
1.8.3.4
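
For readers of this patch, the iteration that the proposed write_backward text describes (start at data_head, decode each record's header, step forward by its size) might look roughly like the sketch below. It is only an illustration on a plain byte array, not code from the patch or the kernel. The names rec_header, ring_copy and walk_backward_ring are hypothetical; rec_header is assumed to mirror the type/misc/size layout of struct perf_event_header, and stopping at a header whose size is smaller than the header itself assumes that never-written regions of the buffer read as zero. In a real reader, buf would be the data area of a read-only mmap(2), head would come from data_head, and the buffer would be paused with PERF_EVENT_IOC_PAUSE_OUTPUT while walking, as the patch text suggests.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical local mirror of the record header layout (type, misc, size). */
struct rec_header {
    uint32_t type;
    uint16_t misc;
    uint16_t size;   /* total record size in bytes, header included */
};

/* Copy len bytes starting at logical offset off, wrapping at buf_size.
 * buf_size must be a power of two so that masking performs the wrap. */
static void ring_copy(void *dst, const uint8_t *buf, uint64_t buf_size,
                      uint64_t off, size_t len)
{
    uint8_t *out = dst;

    for (size_t i = 0; i < len; i++)
        out[i] = buf[(off + i) & (buf_size - 1)];
}

/* Walk a backward ring buffer from the newest record toward the oldest.
 * head is the unwrapped, decreasing position of the newest record.
 * cb (may be NULL) is called once per complete record.
 * Returns the number of complete records visited. */
static size_t walk_backward_ring(const uint8_t *buf, uint64_t buf_size,
                                 uint64_t head,
                                 void (*cb)(const struct rec_header *hdr,
                                            uint64_t off, void *ctx),
                                 void *ctx)
{
    uint64_t off = head;
    size_t count = 0;

    assert(buf_size != 0 && (buf_size & (buf_size - 1)) == 0);

    for (;;) {
        uint64_t travelled = off - head;   /* bytes already walked */
        struct rec_header hdr;

        if (buf_size - travelled < sizeof(hdr))
            break;                         /* no room left for a header */
        ring_copy(&hdr, buf, buf_size, off, sizeof(hdr));
        if (hdr.size < sizeof(hdr))
            break;                         /* unwritten or invalid region */
        if (travelled + hdr.size > buf_size)
            break;                         /* oldest record partly overwritten */
        if (cb)
            cb(&hdr, off, ctx);
        count++;
        off += hdr.size;                   /* step to the next older record */
    }
    return count;
}
```

The walk needs only the start of the newest record, which is the point the patch makes about backward buffers: with a forward buffer the reader would know only where the newest record ends and could not find its header from there.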