Message ID | 1601976833-24377-1-git-send-email-gkohli@codeaurora.org |
---|---|
State | Superseded |
Series | [v1] trace: Fix race in trace_open and buffer resize call |
Hi,

This patch (CVE-2020-27825) was tagged with
Fixes: b23d7a5f4a07a ("ring-buffer: speed up buffer resets by avoiding synchronize_rcu for each CPU")

I'm not an expert here but it seems like b23d7a5f4a07a only refactored
ring_buffer_reset_cpu() by introducing reset_disabled_cpu_buffer() without
significant changes. Hence, mutex_lock(&buffer->mutex)/mutex_unlock(&buffer->mutex)
can be backported further than b23d7a5f4a07a~ and to all LTS kernels. Is
b23d7a5f4a07a the actual cause of the bug?

Thanks,
Denis

On 10/6/20 12:33 PM, Gaurav Kohli wrote:
> Below race can come, if trace_open and resize of
> cpu buffer is running parallely on different cpus
> CPUX                                CPUY
>                                     ring_buffer_resize
>                                     atomic_read(&buffer->resize_disabled)
> tracing_open
> tracing_reset_online_cpus
> ring_buffer_reset_cpu
> rb_reset_cpu
>                                     rb_update_pages
>                                     remove/insert pages
> resetting pointer
>
> This race can cause data abort or some times infinte loop in
> rb_remove_pages and rb_insert_pages while checking pages
> for sanity.
>
> Take buffer lock to fix this.
>
> Signed-off-by: Gaurav Kohli <gkohli@codeaurora.org>
> Cc: stable@vger.kernel.org
> ---
> Changes since v0:
>   -Addressed Steven's review comments.
>
> diff --git a/kernel/trace/ring_buffer.c b/kernel/trace/ring_buffer.c
> index 93ef0ab..15bf28b 100644
> --- a/kernel/trace/ring_buffer.c
> +++ b/kernel/trace/ring_buffer.c
> @@ -4866,6 +4866,9 @@ void ring_buffer_reset_cpu(struct trace_buffer *buffer, int cpu)
>  	if (!cpumask_test_cpu(cpu, buffer->cpumask))
>  		return;
>
> +	/* prevent another thread from changing buffer sizes */
> +	mutex_lock(&buffer->mutex);
> +
>  	atomic_inc(&cpu_buffer->resize_disabled);
>  	atomic_inc(&cpu_buffer->record_disabled);
>
> @@ -4876,6 +4879,8 @@ void ring_buffer_reset_cpu(struct trace_buffer *buffer, int cpu)
>
>  	atomic_dec(&cpu_buffer->record_disabled);
>  	atomic_dec(&cpu_buffer->resize_disabled);
> +
> +	mutex_unlock(&buffer->mutex);
>  }
>  EXPORT_SYMBOL_GPL(ring_buffer_reset_cpu);
>
> @@ -4889,6 +4894,9 @@ void ring_buffer_reset_online_cpus(struct trace_buffer *buffer)
>  	struct ring_buffer_per_cpu *cpu_buffer;
>  	int cpu;
>
> +	/* prevent another thread from changing buffer sizes */
> +	mutex_lock(&buffer->mutex);
> +
>  	for_each_online_buffer_cpu(buffer, cpu) {
>  		cpu_buffer = buffer->buffers[cpu];
>
> @@ -4907,6 +4915,8 @@ void ring_buffer_reset_online_cpus(struct trace_buffer *buffer)
>  		atomic_dec(&cpu_buffer->record_disabled);
>  		atomic_dec(&cpu_buffer->resize_disabled);
>  	}
> +
> +	mutex_unlock(&buffer->mutex);
>  }
>
>  /**
>
On Thu, 21 Jan 2021 17:30:40 +0300
Denis Efremov <efremov@linux.com> wrote:

> Hi,
>
> This patch (CVE-2020-27825) was tagged with
> Fixes: b23d7a5f4a07a ("ring-buffer: speed up buffer resets by avoiding synchronize_rcu for each CPU")
>
> I'm not an expert here but it seems like b23d7a5f4a07a only refactored
> ring_buffer_reset_cpu() by introducing reset_disabled_cpu_buffer() without
> significant changes. Hence, mutex_lock(&buffer->mutex)/mutex_unlock(&buffer->mutex)
> can be backported further than b23d7a5f4a07a~ and to all LTS kernels. Is
> b23d7a5f4a07a the actual cause of the bug?
>

Ug, that looks to be a mistake. Looking back at the thread about this:

https://lore.kernel.org/linux-arm-msm/20200915141304.41fa7c30@gandalf.local.home/

That should have been:

Depends-on: b23d7a5f4a07 ("ring-buffer: speed up buffer resets by avoiding synchronize_rcu for each CPU")

-- Steve
On 1/21/21 10:09 PM, Steven Rostedt wrote:
> On Thu, 21 Jan 2021 17:30:40 +0300
> Denis Efremov <efremov@linux.com> wrote:
>
>> Hi,
>>
>> This patch (CVE-2020-27825) was tagged with
>> Fixes: b23d7a5f4a07a ("ring-buffer: speed up buffer resets by avoiding synchronize_rcu for each CPU")
>>
>> I'm not an expert here but it seems like b23d7a5f4a07a only refactored
>> ring_buffer_reset_cpu() by introducing reset_disabled_cpu_buffer() without
>> significant changes. Hence, mutex_lock(&buffer->mutex)/mutex_unlock(&buffer->mutex)
>> can be backported further than b23d7a5f4a07a~ and to all LTS kernels. Is
>> b23d7a5f4a07a the actual cause of the bug?
>>
>
> Ug, that looks to be a mistake. Looking back at the thread about this:
>
> https://lore.kernel.org/linux-arm-msm/20200915141304.41fa7c30@gandalf.local.home/

I see from the link that it was planned to backport the patch to LTS kernels:

> Actually we are seeing issue in older kernel like 4.19/4.14/5.4 and there below patch was not
> present in stable branches:
> Commit b23d7a5f4a07 ("ring-buffer: speed up buffer resets by avoiding synchronize_rcu for each CPU")

The point is that it's not backported yet. Maybe because of Fixes tag. I've discovered
this while trying to formalize CVE-2020-27825 bug in cvehound
https://github.com/evdenis/cvehound/blob/master/cvehound/cve/CVE-2020-27825.cocci

I think that the backport to the 4.4+ should be something like:

diff --git a/kernel/trace/ring_buffer.c b/kernel/trace/ring_buffer.c
index 547a3a5ac57b..2171b377bbc1 100644
--- a/kernel/trace/ring_buffer.c
+++ b/kernel/trace/ring_buffer.c
@@ -4295,6 +4295,8 @@ void ring_buffer_reset_cpu(struct ring_buffer *buffer, int cpu)
 	if (!cpumask_test_cpu(cpu, buffer->cpumask))
 		return;

+	mutex_lock(&buffer->mutex);
+
 	atomic_inc(&buffer->resize_disabled);
 	atomic_inc(&cpu_buffer->record_disabled);

@@ -4317,6 +4319,8 @@ void ring_buffer_reset_cpu(struct ring_buffer *buffer, int cpu)

 	atomic_dec(&cpu_buffer->record_disabled);
 	atomic_dec(&buffer->resize_disabled);
+
+	mutex_unlock(&buffer->mutex);
 }
 EXPORT_SYMBOL_GPL(ring_buffer_reset_cpu);

Thanks,
Denis
On Thu, 21 Jan 2021 23:15:22 +0300
Denis Efremov <efremov@linux.com> wrote:

> On 1/21/21 10:09 PM, Steven Rostedt wrote:
> > On Thu, 21 Jan 2021 17:30:40 +0300
> > Denis Efremov <efremov@linux.com> wrote:
> >
> >> Hi,
> >>
> >> This patch (CVE-2020-27825) was tagged with
> >> Fixes: b23d7a5f4a07a ("ring-buffer: speed up buffer resets by avoiding synchronize_rcu for each CPU")
> >>
> >> I'm not an expert here but it seems like b23d7a5f4a07a only refactored
> >> ring_buffer_reset_cpu() by introducing reset_disabled_cpu_buffer() without
> >> significant changes. Hence, mutex_lock(&buffer->mutex)/mutex_unlock(&buffer->mutex)
> >> can be backported further than b23d7a5f4a07a~ and to all LTS kernels. Is
> >> b23d7a5f4a07a the actual cause of the bug?
> >>
> >
> > Ug, that looks to be a mistake. Looking back at the thread about this:
> >
> > https://lore.kernel.org/linux-arm-msm/20200915141304.41fa7c30@gandalf.local.home/
>
> I see from the link that it was planned to backport the patch to LTS kernels:
>
> > Actually we are seeing issue in older kernel like 4.19/4.14/5.4 and there below patch was not
> > present in stable branches:
> > Commit b23d7a5f4a07 ("ring-buffer: speed up buffer resets by avoiding synchronize_rcu for each CPU")
>
> The point is that it's not backported yet. Maybe because of Fixes tag. I've discovered
> this while trying to formalize CVE-2020-27825 bug in cvehound
> https://github.com/evdenis/cvehound/blob/master/cvehound/cve/CVE-2020-27825.cocci
>
> I think that the backport to the 4.4+ should be something like:
>
> diff --git a/kernel/trace/ring_buffer.c b/kernel/trace/ring_buffer.c
> index 547a3a5ac57b..2171b377bbc1 100644
> --- a/kernel/trace/ring_buffer.c
> +++ b/kernel/trace/ring_buffer.c
> @@ -4295,6 +4295,8 @@ void ring_buffer_reset_cpu(struct ring_buffer *buffer, int cpu)
>  	if (!cpumask_test_cpu(cpu, buffer->cpumask))
>  		return;
>
> +	mutex_lock(&buffer->mutex);
> +
>  	atomic_inc(&buffer->resize_disabled);
>  	atomic_inc(&cpu_buffer->record_disabled);
>
> @@ -4317,6 +4319,8 @@ void ring_buffer_reset_cpu(struct ring_buffer *buffer, int cpu)
>
>  	atomic_dec(&cpu_buffer->record_disabled);
>  	atomic_dec(&buffer->resize_disabled);
> +
> +	mutex_unlock(&buffer->mutex);
>  }
>  EXPORT_SYMBOL_GPL(ring_buffer_reset_cpu);
>

That could possibly work.

-- Steve
On Thu, Jan 21, 2021 at 03:37:32PM -0500, Steven Rostedt wrote:
> On Thu, 21 Jan 2021 23:15:22 +0300
> Denis Efremov <efremov@linux.com> wrote:
>
> > On 1/21/21 10:09 PM, Steven Rostedt wrote:
> > > On Thu, 21 Jan 2021 17:30:40 +0300
> > > Denis Efremov <efremov@linux.com> wrote:
> > >
> > >> Hi,
> > >>
> > >> This patch (CVE-2020-27825) was tagged with
> > >> Fixes: b23d7a5f4a07a ("ring-buffer: speed up buffer resets by avoiding synchronize_rcu for each CPU")
> > >>
> > >> I'm not an expert here but it seems like b23d7a5f4a07a only refactored
> > >> ring_buffer_reset_cpu() by introducing reset_disabled_cpu_buffer() without
> > >> significant changes. Hence, mutex_lock(&buffer->mutex)/mutex_unlock(&buffer->mutex)
> > >> can be backported further than b23d7a5f4a07a~ and to all LTS kernels. Is
> > >> b23d7a5f4a07a the actual cause of the bug?
> > >>
> > >
> > > Ug, that looks to be a mistake. Looking back at the thread about this:
> > >
> > > https://lore.kernel.org/linux-arm-msm/20200915141304.41fa7c30@gandalf.local.home/
> >
> > I see from the link that it was planned to backport the patch to LTS kernels:
> >
> > > Actually we are seeing issue in older kernel like 4.19/4.14/5.4 and there below patch was not
> > > present in stable branches:
> > > Commit b23d7a5f4a07 ("ring-buffer: speed up buffer resets by avoiding synchronize_rcu for each CPU")
> >
> > The point is that it's not backported yet. Maybe because of Fixes tag. I've discovered
> > this while trying to formalize CVE-2020-27825 bug in cvehound
> > https://github.com/evdenis/cvehound/blob/master/cvehound/cve/CVE-2020-27825.cocci
> >
> > I think that the backport to the 4.4+ should be something like:
> >
> > diff --git a/kernel/trace/ring_buffer.c b/kernel/trace/ring_buffer.c
> > index 547a3a5ac57b..2171b377bbc1 100644
> > --- a/kernel/trace/ring_buffer.c
> > +++ b/kernel/trace/ring_buffer.c
> > @@ -4295,6 +4295,8 @@ void ring_buffer_reset_cpu(struct ring_buffer *buffer, int cpu)
> >  	if (!cpumask_test_cpu(cpu, buffer->cpumask))
> >  		return;
> >
> > +	mutex_lock(&buffer->mutex);
> > +
> >  	atomic_inc(&buffer->resize_disabled);
> >  	atomic_inc(&cpu_buffer->record_disabled);
> >
> > @@ -4317,6 +4319,8 @@ void ring_buffer_reset_cpu(struct ring_buffer *buffer, int cpu)
> >
> >  	atomic_dec(&cpu_buffer->record_disabled);
> >  	atomic_dec(&buffer->resize_disabled);
> > +
> > +	mutex_unlock(&buffer->mutex);
> >  }
> >  EXPORT_SYMBOL_GPL(ring_buffer_reset_cpu);
> >
>
> That could possibly work.

Ok, so what can I do here? Can someone resend this as a backport to the
other stable kernels in this way so that I can queue it up?

thanks,

greg k-h
On 1/22/2021 4:29 PM, Greg KH wrote:
> On Thu, Jan 21, 2021 at 03:37:32PM -0500, Steven Rostedt wrote:
>> On Thu, 21 Jan 2021 23:15:22 +0300
>> Denis Efremov <efremov@linux.com> wrote:
>>
>>> On 1/21/21 10:09 PM, Steven Rostedt wrote:
>>>> On Thu, 21 Jan 2021 17:30:40 +0300
>>>> Denis Efremov <efremov@linux.com> wrote:
>>>>
>>>>> Hi,
>>>>>
>>>>> This patch (CVE-2020-27825) was tagged with
>>>>> Fixes: b23d7a5f4a07a ("ring-buffer: speed up buffer resets by avoiding synchronize_rcu for each CPU")
>>>>>
>>>>> I'm not an expert here but it seems like b23d7a5f4a07a only refactored
>>>>> ring_buffer_reset_cpu() by introducing reset_disabled_cpu_buffer() without
>>>>> significant changes. Hence, mutex_lock(&buffer->mutex)/mutex_unlock(&buffer->mutex)
>>>>> can be backported further than b23d7a5f4a07a~ and to all LTS kernels. Is
>>>>> b23d7a5f4a07a the actual cause of the bug?
>>>>>
>>>>
>>>> Ug, that looks to be a mistake. Looking back at the thread about this:
>>>>
>>>> https://lore.kernel.org/linux-arm-msm/20200915141304.41fa7c30@gandalf.local.home/
>>>
>>> I see from the link that it was planned to backport the patch to LTS kernels:
>>>
>>>> Actually we are seeing issue in older kernel like 4.19/4.14/5.4 and there below patch was not
>>>> present in stable branches:
>>>> Commit b23d7a5f4a07 ("ring-buffer: speed up buffer resets by avoiding synchronize_rcu for each CPU")
>>>
>>> The point is that it's not backported yet. Maybe because of Fixes tag. I've discovered
>>> this while trying to formalize CVE-2020-27825 bug in cvehound
>>> https://github.com/evdenis/cvehound/blob/master/cvehound/cve/CVE-2020-27825.cocci
>>>
>>> I think that the backport to the 4.4+ should be something like:
>>>
>>> diff --git a/kernel/trace/ring_buffer.c b/kernel/trace/ring_buffer.c
>>> index 547a3a5ac57b..2171b377bbc1 100644
>>> --- a/kernel/trace/ring_buffer.c
>>> +++ b/kernel/trace/ring_buffer.c
>>> @@ -4295,6 +4295,8 @@ void ring_buffer_reset_cpu(struct ring_buffer *buffer, int cpu)
>>>  	if (!cpumask_test_cpu(cpu, buffer->cpumask))
>>>  		return;
>>>
>>> +	mutex_lock(&buffer->mutex);
>>> +
>>>  	atomic_inc(&buffer->resize_disabled);
>>>  	atomic_inc(&cpu_buffer->record_disabled);
>>>
>>> @@ -4317,6 +4319,8 @@ void ring_buffer_reset_cpu(struct ring_buffer *buffer, int cpu)
>>>
>>>  	atomic_dec(&cpu_buffer->record_disabled);
>>>  	atomic_dec(&buffer->resize_disabled);
>>> +
>>> +	mutex_unlock(&buffer->mutex);
>>>  }
>>>  EXPORT_SYMBOL_GPL(ring_buffer_reset_cpu);
>>>
>>
>> That could possibly work.

Yes, this will work, As i have tested similar patch for internal testing
for kernel branches like 5.4/4.19.

>
> Ok, so what can I do here? Can someone resend this as a backport to the
> other stable kernels in this way so that I can queue it up?
>
> thanks,
>
> greg k-h
>

--
Qualcomm India Private Limited, on behalf of Qualcomm Innovation Center,
Inc. is a member of the Code Aurora Forum, a Linux Foundation Collaborative Project.
On Fri, 22 Jan 2021 16:55:29 +0530
Gaurav Kohli <gkohli@codeaurora.org> wrote:

> >> That could possibly work.
>
> Yes, this will work, As i have tested similar patch for internal testing
> for kernel branches like 5.4/4.19.

Can you or Denis send a proper patch for Greg to backport? I'll review it,
test it and give my ack to it, so Greg can take it without issue.

Thanks!

-- Steve

>
> >
> > Ok, so what can I do here? Can someone resend this as a backport to the
> > other stable kernels in this way so that I can queue it up?
> >
On 1/22/21 5:37 PM, Steven Rostedt wrote:
> On Fri, 22 Jan 2021 16:55:29 +0530
> Gaurav Kohli <gkohli@codeaurora.org> wrote:
>
>>>> That could possibly work.
>>
>> Yes, this will work, As i have tested similar patch for internal testing
>> for kernel branches like 5.4/4.19.
>
> Can you or Denis send a proper patch for Greg to backport? I'll review it,
> test it and give my ack to it, so Greg can take it without issue.
>

I can prepare the patch, but it will be compile-tested only from my side. Honestly,
I think it's better when the patch and its backports have the same author and
commit message. And I can't test the fix by myself as I don't know how to reproduce
conditions for the bug. I think it's better if Gaurav will prepare this backport,
unless he have reasons for me to do it or maybe just don't have enough time nowadays.
Gaurav, if you want to somehow mention me you add my Reported-by:

Thanks,
Denis
On 1/23/2021 4:19 PM, Denis Efremov wrote:
>
>
> On 1/22/21 5:37 PM, Steven Rostedt wrote:
>> On Fri, 22 Jan 2021 16:55:29 +0530
>> Gaurav Kohli <gkohli@codeaurora.org> wrote:
>>
>>>>> That could possibly work.
>>>
>>> Yes, this will work, As i have tested similar patch for internal testing
>>> for kernel branches like 5.4/4.19.
>>
>> Can you or Denis send a proper patch for Greg to backport? I'll review it,
>> test it and give my ack to it, so Greg can take it without issue.
>>
>
> I can prepare the patch, but it will be compile-tested only from my side. Honestly,
> I think it's better when the patch and its backports have the same author and
> commit message. And I can't test the fix by myself as I don't know how to reproduce
> conditions for the bug. I think it's better if Gaurav will prepare this backport,
> unless he have reasons for me to do it or maybe just don't have enough time nowadays.
> Gaurav, if you want to somehow mention me you add my Reported-by:
>
> Thanks,
> Denis
>

Sure I will do, I have never posted on backport branches. Let me check
and post it.
On Sat, 23 Jan 2021 22:03:27 +0530
Gaurav Kohli <gkohli@codeaurora.org> wrote:

> Sure I will do, I have never posted on backport branches. Let me check
> and post it.
>

Basically you take your original patch that was in mainline (as the
subject and commit message), and make it work as if you were doing the
same exact fix for the stable release.

Send it to me (and Cc everyone else), and I'll give it a test too.

Thanks!

-- Steve
On 1/24/2021 8:51 AM, Steven Rostedt wrote:
> On Sat, 23 Jan 2021 22:03:27 +0530
> Gaurav Kohli <gkohli@codeaurora.org> wrote:
>
>
>> Sure I will do, I have never posted on backport branches. Let me check
>> and post it.
>>
>
> Basically you take your original patch that was in mainline (as the
> subject and commit message), and make it work as if you were doing the
> same exact fix for the stable release.
>
> Send it to me (and Cc everyone else), and I'll give it a test too.

Thanks for the guidance.
Just sent and tested it for 5.4 kernel, please review it once.

> Thanks!
>
> -- Steve
>

--
Qualcomm India Private Limited, on behalf of Qualcomm Innovation Center,
Inc. is a member of the Code Aurora Forum, a Linux Foundation Collaborative Project.
On Sun, Jan 24, 2021 at 03:27:25PM +0530, Gaurav Kohli wrote:
>
>
> On 1/24/2021 8:51 AM, Steven Rostedt wrote:
> > On Sat, 23 Jan 2021 22:03:27 +0530
> > Gaurav Kohli <gkohli@codeaurora.org> wrote:
> >
> >
> > > Sure I will do, I have never posted on backport branches. Let me check
> > > and post it.
> > >
> >
> > Basically you take your original patch that was in mainline (as the
> > subject and commit message), and make it work as if you were doing the
> > same exact fix for the stable release.
> >
> > Send it to me (and Cc everyone else), and I'll give it a test too.
>
> Thanks for the guidance.
> Just sent and tested it for 5.4 kernel, please review it once.

<formletter>

This is not the correct way to submit patches for inclusion in the
stable kernel tree. Please read:
    https://www.kernel.org/doc/html/latest/process/stable-kernel-rules.html
for how to do this properly.

</formletter>
diff --git a/kernel/trace/ring_buffer.c b/kernel/trace/ring_buffer.c
index 93ef0ab..15bf28b 100644
--- a/kernel/trace/ring_buffer.c
+++ b/kernel/trace/ring_buffer.c
@@ -4866,6 +4866,9 @@ void ring_buffer_reset_cpu(struct trace_buffer *buffer, int cpu)
 	if (!cpumask_test_cpu(cpu, buffer->cpumask))
 		return;

+	/* prevent another thread from changing buffer sizes */
+	mutex_lock(&buffer->mutex);
+
 	atomic_inc(&cpu_buffer->resize_disabled);
 	atomic_inc(&cpu_buffer->record_disabled);

@@ -4876,6 +4879,8 @@ void ring_buffer_reset_cpu(struct trace_buffer *buffer, int cpu)

 	atomic_dec(&cpu_buffer->record_disabled);
 	atomic_dec(&cpu_buffer->resize_disabled);
+
+	mutex_unlock(&buffer->mutex);
 }
 EXPORT_SYMBOL_GPL(ring_buffer_reset_cpu);

@@ -4889,6 +4894,9 @@ void ring_buffer_reset_online_cpus(struct trace_buffer *buffer)
 	struct ring_buffer_per_cpu *cpu_buffer;
 	int cpu;

+	/* prevent another thread from changing buffer sizes */
+	mutex_lock(&buffer->mutex);
+
 	for_each_online_buffer_cpu(buffer, cpu) {
 		cpu_buffer = buffer->buffers[cpu];

@@ -4907,6 +4915,8 @@ void ring_buffer_reset_online_cpus(struct trace_buffer *buffer)
 		atomic_dec(&cpu_buffer->record_disabled);
 		atomic_dec(&cpu_buffer->resize_disabled);
 	}
+
+	mutex_unlock(&buffer->mutex);
 }

 /**
The race below can occur if trace_open and a resize of the
cpu buffer run in parallel on different CPUs:

CPUX                                CPUY
                                    ring_buffer_resize
                                    atomic_read(&buffer->resize_disabled)
tracing_open
tracing_reset_online_cpus
ring_buffer_reset_cpu
rb_reset_cpu
                                    rb_update_pages
                                    remove/insert pages
resetting pointer

This race can cause a data abort or sometimes an infinite loop in
rb_remove_pages and rb_insert_pages while checking pages
for sanity.

Take the buffer lock to fix this.

Signed-off-by: Gaurav Kohli <gkohli@codeaurora.org>
Cc: stable@vger.kernel.org
---
Changes since v0:
  -Addressed Steven's review comments.
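
For readers without a kernel tree at hand, the locking pattern described in the commit message can be sketched in user space. This is only an illustration under stated assumptions: it uses POSIX threads instead of the kernel mutex API, and `struct toy_buffer`, `toy_buffer_resize()`, `toy_buffer_reset()` and `toy_buffer_stress()` are hypothetical names, not anything from kernel/trace/ring_buffer.c. The resize path reallocates the page array and the reset path rewrites it; both take the same mutex, which is the role `buffer->mutex` plays in the patch.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>
#include <string.h>

#define TOY_ITERATIONS 10000

/* Hypothetical user-space stand-in for a ring buffer, not the kernel struct. */
struct toy_buffer {
	pthread_mutex_t mutex;	/* plays the role of buffer->mutex */
	int *pages;		/* page array that resize may move */
	size_t nr_pages;	/* current number of pages, always >= 1 */
	int *head;		/* pointer into pages[], rewritten by reset */
};

int toy_buffer_init(struct toy_buffer *b, size_t nr_pages)
{
	if (pthread_mutex_init(&b->mutex, NULL) != 0)
		return -1;
	b->pages = calloc(nr_pages, sizeof(*b->pages));
	if (!b->pages) {
		pthread_mutex_destroy(&b->mutex);
		return -1;
	}
	b->nr_pages = nr_pages;
	b->head = b->pages;
	return 0;
}

/* Resize path (CPUY in the diagram): removes/inserts pages under the lock. */
int toy_buffer_resize(struct toy_buffer *b, size_t nr_pages)
{
	int *new_pages;
	int ret = -1;

	pthread_mutex_lock(&b->mutex);
	new_pages = realloc(b->pages, nr_pages * sizeof(*new_pages));
	if (new_pages) {
		b->pages = new_pages;
		b->nr_pages = nr_pages;
		b->head = new_pages;	/* the old head may dangle after realloc */
		ret = 0;
	}
	pthread_mutex_unlock(&b->mutex);
	return ret;
}

/* Reset path (CPUX): without the lock it could touch pages being freed. */
void toy_buffer_reset(struct toy_buffer *b)
{
	pthread_mutex_lock(&b->mutex);
	assert(b->pages != NULL);
	memset(b->pages, 0, b->nr_pages * sizeof(*b->pages));
	b->head = b->pages;
	pthread_mutex_unlock(&b->mutex);
}

static void *resize_worker(void *arg)
{
	struct toy_buffer *b = arg;

	for (size_t i = 0; i < TOY_ITERATIONS; i++)
		toy_buffer_resize(b, 1 + i % 8);	/* sizes 1..8, never 0 */
	return NULL;
}

static void *reset_worker(void *arg)
{
	struct toy_buffer *b = arg;

	for (size_t i = 0; i < TOY_ITERATIONS; i++)
		toy_buffer_reset(b);
	return NULL;
}

/* Runs both paths concurrently; returns 0 if the buffer stays consistent. */
int toy_buffer_stress(void)
{
	struct toy_buffer b;
	pthread_t resizer, resetter;
	int ok;

	if (toy_buffer_init(&b, 4) != 0)
		return -1;
	if (pthread_create(&resizer, NULL, resize_worker, &b) != 0) {
		free(b.pages);
		pthread_mutex_destroy(&b.mutex);
		return -1;
	}
	if (pthread_create(&resetter, NULL, reset_worker, &b) != 0) {
		pthread_join(resizer, NULL);
		free(b.pages);
		pthread_mutex_destroy(&b.mutex);
		return -1;
	}
	pthread_join(resizer, NULL);
	pthread_join(resetter, NULL);

	ok = b.head == b.pages && b.nr_pages >= 1 && b.nr_pages <= 8;
	free(b.pages);
	pthread_mutex_destroy(&b.mutex);
	return ok ? 0 : -1;
}
```

Calling `toy_buffer_stress()` from a small `main()` and building with `-pthread` exercises both paths. Dropping the lock from `toy_buffer_reset()` makes the `memset()` race with `realloc()`, which is loosely analogous to the data abort described above; the sketch does not model `resize_disabled`, `record_disabled` or per-CPU buffers.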