Message ID | 1457949585-191064-4-git-send-email-wangnan0@huawei.com |
---|---|
State | Superseded |
Headers | show |
On 2016/3/23 17:50, Peter Zijlstra wrote: > On Mon, Mar 14, 2016 at 09:59:43AM +0000, Wang Nan wrote: >> Convert perf_output_begin to __perf_output_begin and make the later >> function able to write records from the end of the ring buffer. >> Following commits will utilize the 'backward' flag. >> >> This patch doesn't introduce any extra performance overhead since we >> use always_inline. > So while I agree that with __always_inline and constant propagation we > _should_ end up with the same code, we have: > > $ size defconfig-build/kernel/events/ring_buffer.o.{pre,post} > text data bss dec hex filename > 3785 2 0 3787 ecb defconfig-build/kernel/events/ring_buffer.o.pre > 3673 2 0 3675 e5b defconfig-build/kernel/events/ring_buffer.o.post > > The patch actually makes the file shrink. > > So I think we still want to have some actual performance numbers. There are some numbers. You can find them from: http://lkml.iu.edu/hypermail/linux/kernel/1601.2/03966.html Thank you.
On 2016/3/24 3:25, Alexei Starovoitov wrote: > On Wed, Mar 23, 2016 at 06:08:41PM +0800, Wangnan (F) wrote: >> >> On 2016/3/23 17:50, Peter Zijlstra wrote: >>> On Mon, Mar 14, 2016 at 09:59:43AM +0000, Wang Nan wrote: >>>> Convert perf_output_begin to __perf_output_begin and make the later >>>> function able to write records from the end of the ring buffer. >>>> Following commits will utilize the 'backward' flag. >>>> >>>> This patch doesn't introduce any extra performance overhead since we >>>> use always_inline. >>> So while I agree that with __always_inline and constant propagation we >>> _should_ end up with the same code, we have: >>> >>> $ size defconfig-build/kernel/events/ring_buffer.o.{pre,post} >>> text data bss dec hex filename >>> 3785 2 0 3787 ecb defconfig-build/kernel/events/ring_buffer.o.pre >>> 3673 2 0 3675 e5b defconfig-build/kernel/events/ring_buffer.o.post >>> >>> The patch actually makes the file shrink. >>> >>> So I think we still want to have some actual performance numbers. >> There are some numbers. You can find them from: >> >> http://lkml.iu.edu/hypermail/linux/kernel/1601.2/03966.html > Wang, when you respin, please add all perf analysis that you've > done and the reasons to do it this way to commit log > to make sure it stays in git history. > > I've reviewed it in the past and looks like the patches didn't > change over the last months. So still ack, I believe > overwrite buffers is a great feature. Can I add your Acked-by in all these 5 patches? Thank you.
diff --git a/kernel/events/ring_buffer.c b/kernel/events/ring_buffer.c index 22e1a47..37c11c6 100644 --- a/kernel/events/ring_buffer.c +++ b/kernel/events/ring_buffer.c @@ -102,8 +102,21 @@ out: preempt_enable(); } -int perf_output_begin(struct perf_output_handle *handle, - struct perf_event *event, unsigned int size) +static bool __always_inline +ring_buffer_has_space(unsigned long head, unsigned long tail, + unsigned long data_size, unsigned int size, + bool backward) +{ + if (!backward) + return CIRC_SPACE(head, tail, data_size) >= size; + else + return CIRC_SPACE(tail, head, data_size) >= size; +} + +static int __always_inline +__perf_output_begin(struct perf_output_handle *handle, + struct perf_event *event, unsigned int size, + bool backward) { struct ring_buffer *rb; unsigned long tail, offset, head; @@ -146,9 +159,12 @@ int perf_output_begin(struct perf_output_handle *handle, do { tail = READ_ONCE(rb->user_page->data_tail); offset = head = local_read(&rb->head); - if (!rb->overwrite && - unlikely(CIRC_SPACE(head, tail, perf_data_size(rb)) < size)) - goto fail; + if (!rb->overwrite) { + if (unlikely(!ring_buffer_has_space(head, tail, + perf_data_size(rb), + size, backward))) + goto fail; + } /* * The above forms a control dependency barrier separating the @@ -162,9 +178,17 @@ int perf_output_begin(struct perf_output_handle *handle, * See perf_output_put_handle(). */ - head += size; + if (!backward) + head += size; + else + head -= size; } while (local_cmpxchg(&rb->head, offset, head) != offset); + if (backward) { + offset = head; + head = (u64)(-head); + } + /* * We rely on the implied barrier() by local_cmpxchg() to ensure * none of the data stores below can be lifted up by the compiler. @@ -206,6 +230,12 @@ out: return -ENOSPC; } +int perf_output_begin(struct perf_output_handle *handle, + struct perf_event *event, unsigned int size) +{ + return __perf_output_begin(handle, event, size, false); +} + unsigned int perf_output_copy(struct perf_output_handle *handle, const void *buf, unsigned int len) {
Convert perf_output_begin to __perf_output_begin and make the later function able to write records from the end of the ring buffer. Following commits will utilize the 'backward' flag. This patch doesn't introduce any extra performance overhead since we use always_inline. Signed-off-by: Wang Nan <wangnan0@huawei.com> Cc: He Kuang <hekuang@huawei.com> Cc: Alexei Starovoitov <ast@kernel.org> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Brendan Gregg <brendan.d.gregg@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Zefan Li <lizefan@huawei.com> Cc: pi3orama@163.com --- kernel/events/ring_buffer.c | 42 ++++++++++++++++++++++++++++++++++++------ 1 file changed, 36 insertions(+), 6 deletions(-) -- 1.8.3.4