diff mbox series

[RESEND,1/1] Bluetooth: Use chan_list_lock to protect the whole put/destroy invokation

Message ID 20220622082716.478486-1-lee.jones@linaro.org
State New
Headers show
Series [RESEND,1/1] Bluetooth: Use chan_list_lock to protect the whole put/destroy invokation | expand

Commit Message

Lee Jones June 22, 2022, 8:27 a.m. UTC
This change prevents a use-after-free caused by one of the worker
threads starting up (see below) *after* the final channel reference
has been put() during sock_close() but *before* the references to the
channel have been destroyed.

  refcount_t: increment on 0; use-after-free.
  BUG: KASAN: use-after-free in refcount_dec_and_test+0x20/0xd0
  Read of size 4 at addr ffffffc114f5bf18 by task kworker/u17:14/705

  CPU: 4 PID: 705 Comm: kworker/u17:14 Tainted: G S      W       4.14.234-00003-g1fb6d0bd49a4-dirty #28
  Hardware name: Qualcomm Technologies, Inc. SM8150 V2 PM8150 Google Inc. MSM sm8150 Flame DVT (DT)
  Workqueue: hci0 hci_rx_work
  Call trace:
   dump_backtrace+0x0/0x378
   show_stack+0x20/0x2c
   dump_stack+0x124/0x148
   print_address_description+0x80/0x2e8
   __kasan_report+0x168/0x188
   kasan_report+0x10/0x18
   __asan_load4+0x84/0x8c
   refcount_dec_and_test+0x20/0xd0
   l2cap_chan_put+0x48/0x12c
   l2cap_recv_frame+0x4770/0x6550
   l2cap_recv_acldata+0x44c/0x7a4
   hci_acldata_packet+0x100/0x188
   hci_rx_work+0x178/0x23c
   process_one_work+0x35c/0x95c
   worker_thread+0x4cc/0x960
   kthread+0x1a8/0x1c4
   ret_from_fork+0x10/0x18

Cc: stable@kernel.org
Cc: Marcel Holtmann <marcel@holtmann.org>
Cc: Johan Hedberg <johan.hedberg@gmail.com>
Cc: Luiz Augusto von Dentz <luiz.dentz@gmail.com>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Eric Dumazet <edumazet@google.com>
Cc: Jakub Kicinski <kuba@kernel.org>
Cc: Paolo Abeni <pabeni@redhat.com>
Cc: linux-bluetooth@vger.kernel.org
Cc: netdev@vger.kernel.org
Signed-off-by: Lee Jones <lee.jones@linaro.org>
---
 net/bluetooth/l2cap_core.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

Comments

bluez.test.bot@gmail.com June 22, 2022, 9:15 a.m. UTC | #1
This is automated email and please do not reply to this email!

Dear submitter,

Thank you for submitting the patches to the linux bluetooth mailing list.
This is a CI test results with your patch series:
PW Link:https://patchwork.kernel.org/project/bluetooth/list/?series=652672

---Test result---

Test Summary:
CheckPatch                    FAIL      1.22 seconds
GitLint                       FAIL      0.73 seconds
SubjectPrefix                 PASS      0.63 seconds
BuildKernel                   PASS      41.39 seconds
BuildKernel32                 PASS      36.81 seconds
Incremental Build with patchesPASS      47.55 seconds
TestRunner: Setup             PASS      648.42 seconds
TestRunner: l2cap-tester      PASS      15.98 seconds
TestRunner: bnep-tester       PASS      5.24 seconds
TestRunner: mgmt-tester       PASS      96.63 seconds
TestRunner: rfcomm-tester     PASS      8.55 seconds
TestRunner: sco-tester        PASS      8.32 seconds
TestRunner: smp-tester        PASS      8.43 seconds
TestRunner: userchan-tester   PASS      5.41 seconds

Details
##############################
Test: CheckPatch - FAIL - 1.22 seconds
Run checkpatch.pl script with rule in .checkpatch.conf
[RESEND,1/1] Bluetooth: Use chan_list_lock to protect the whole put/destroy invokation\WARNING:COMMIT_LOG_LONG_LINE: Possible unwrapped commit description (prefer a maximum 75 chars per line)
#96: 
  CPU: 4 PID: 705 Comm: kworker/u17:14 Tainted: G S      W       4.14.234-00003-g1fb6d0bd49a4-dirty #28

total: 0 errors, 1 warnings, 0 checks, 18 lines checked

NOTE: For some of the reported defects, checkpatch may be able to
      mechanically convert to the typical style using --fix or --fix-inplace.

/github/workspace/src/12890300.patch has style problems, please review.

NOTE: Ignored message types: UNKNOWN_COMMIT_ID

NOTE: If any of the errors are false positives, please report
      them to the maintainer, see CHECKPATCH in MAINTAINERS.


##############################
Test: GitLint - FAIL - 0.73 seconds
Run gitlint with rule in .gitlint
[RESEND,1/1] Bluetooth: Use chan_list_lock to protect the whole put/destroy invokation
1: T1 Title exceeds max length (86>80): "[RESEND,1/1] Bluetooth: Use chan_list_lock to protect the whole put/destroy invokation"
12: B1 Line exceeds max length (103>80): "  CPU: 4 PID: 705 Comm: kworker/u17:14 Tainted: G S      W       4.14.234-00003-g1fb6d0bd49a4-dirty #28"
13: B1 Line exceeds max length (99>80): "  Hardware name: Qualcomm Technologies, Inc. SM8150 V2 PM8150 Google Inc. MSM sm8150 Flame DVT (DT)"




---
Regards,
Linux Bluetooth
Lee Jones June 27, 2022, 2:17 p.m. UTC | #2
On Wed, 22 Jun 2022, Lee Jones wrote:

> This change prevents a use-after-free caused by one of the worker
> threads starting up (see below) *after* the final channel reference
> has been put() during sock_close() but *before* the references to the
> channel have been destroyed.
> 
>   refcount_t: increment on 0; use-after-free.
>   BUG: KASAN: use-after-free in refcount_dec_and_test+0x20/0xd0
>   Read of size 4 at addr ffffffc114f5bf18 by task kworker/u17:14/705
> 
>   CPU: 4 PID: 705 Comm: kworker/u17:14 Tainted: G S      W       4.14.234-00003-g1fb6d0bd49a4-dirty #28
>   Hardware name: Qualcomm Technologies, Inc. SM8150 V2 PM8150 Google Inc. MSM sm8150 Flame DVT (DT)
>   Workqueue: hci0 hci_rx_work
>   Call trace:
>    dump_backtrace+0x0/0x378
>    show_stack+0x20/0x2c
>    dump_stack+0x124/0x148
>    print_address_description+0x80/0x2e8
>    __kasan_report+0x168/0x188
>    kasan_report+0x10/0x18
>    __asan_load4+0x84/0x8c
>    refcount_dec_and_test+0x20/0xd0
>    l2cap_chan_put+0x48/0x12c
>    l2cap_recv_frame+0x4770/0x6550
>    l2cap_recv_acldata+0x44c/0x7a4
>    hci_acldata_packet+0x100/0x188
>    hci_rx_work+0x178/0x23c
>    process_one_work+0x35c/0x95c
>    worker_thread+0x4cc/0x960
>    kthread+0x1a8/0x1c4
>    ret_from_fork+0x10/0x18
> 
> Cc: stable@kernel.org
> Cc: Marcel Holtmann <marcel@holtmann.org>
> Cc: Johan Hedberg <johan.hedberg@gmail.com>
> Cc: Luiz Augusto von Dentz <luiz.dentz@gmail.com>
> Cc: "David S. Miller" <davem@davemloft.net>
> Cc: Eric Dumazet <edumazet@google.com>
> Cc: Jakub Kicinski <kuba@kernel.org>
> Cc: Paolo Abeni <pabeni@redhat.com>
> Cc: linux-bluetooth@vger.kernel.org
> Cc: netdev@vger.kernel.org
> Signed-off-by: Lee Jones <lee.jones@linaro.org>
> ---
>  net/bluetooth/l2cap_core.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)

This patch now passes all of the CI tests.

Except the check-patch/lint tests which return false positives.

Please consider for inclusion to remedy this serious bug.

TIA.

> diff --git a/net/bluetooth/l2cap_core.c b/net/bluetooth/l2cap_core.c
> index ae78490ecd3d4..82279c5919fd8 100644
> --- a/net/bluetooth/l2cap_core.c
> +++ b/net/bluetooth/l2cap_core.c
> @@ -483,9 +483,7 @@ static void l2cap_chan_destroy(struct kref *kref)
>  
>  	BT_DBG("chan %p", chan);
>  
> -	write_lock(&chan_list_lock);
>  	list_del(&chan->global_l);
> -	write_unlock(&chan_list_lock);
>  
>  	kfree(chan);
>  }
> @@ -501,7 +499,9 @@ void l2cap_chan_put(struct l2cap_chan *c)
>  {
>  	BT_DBG("chan %p orig refcnt %u", c, kref_read(&c->kref));
>  
> +	write_lock(&chan_list_lock);
>  	kref_put(&c->kref, l2cap_chan_destroy);
> +	write_unlock(&chan_list_lock);
>  }
>  EXPORT_SYMBOL_GPL(l2cap_chan_put);
>
Eric Dumazet June 27, 2022, 2:41 p.m. UTC | #3
On Wed, Jun 22, 2022 at 10:27 AM Lee Jones <lee.jones@linaro.org> wrote:
>
> This change prevents a use-after-free caused by one of the worker
> threads starting up (see below) *after* the final channel reference
> has been put() during sock_close() but *before* the references to the
> channel have been destroyed.
>
>   refcount_t: increment on 0; use-after-free.
>   BUG: KASAN: use-after-free in refcount_dec_and_test+0x20/0xd0
>   Read of size 4 at addr ffffffc114f5bf18 by task kworker/u17:14/705
>
>   CPU: 4 PID: 705 Comm: kworker/u17:14 Tainted: G S      W       4.14.234-00003-g1fb6d0bd49a4-dirty #28
>   Hardware name: Qualcomm Technologies, Inc. SM8150 V2 PM8150 Google Inc. MSM sm8150 Flame DVT (DT)
>   Workqueue: hci0 hci_rx_work
>   Call trace:
>    dump_backtrace+0x0/0x378
>    show_stack+0x20/0x2c
>    dump_stack+0x124/0x148
>    print_address_description+0x80/0x2e8
>    __kasan_report+0x168/0x188
>    kasan_report+0x10/0x18
>    __asan_load4+0x84/0x8c
>    refcount_dec_and_test+0x20/0xd0
>    l2cap_chan_put+0x48/0x12c
>    l2cap_recv_frame+0x4770/0x6550
>    l2cap_recv_acldata+0x44c/0x7a4
>    hci_acldata_packet+0x100/0x188
>    hci_rx_work+0x178/0x23c
>    process_one_work+0x35c/0x95c
>    worker_thread+0x4cc/0x960
>    kthread+0x1a8/0x1c4
>    ret_from_fork+0x10/0x18
>
> Cc: stable@kernel.org

When was the bug added ? (Fixes: tag please)

> Cc: Marcel Holtmann <marcel@holtmann.org>
> Cc: Johan Hedberg <johan.hedberg@gmail.com>
> Cc: Luiz Augusto von Dentz <luiz.dentz@gmail.com>
> Cc: "David S. Miller" <davem@davemloft.net>
> Cc: Eric Dumazet <edumazet@google.com>
> Cc: Jakub Kicinski <kuba@kernel.org>
> Cc: Paolo Abeni <pabeni@redhat.com>
> Cc: linux-bluetooth@vger.kernel.org
> Cc: netdev@vger.kernel.org
> Signed-off-by: Lee Jones <lee.jones@linaro.org>
> ---
>  net/bluetooth/l2cap_core.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/net/bluetooth/l2cap_core.c b/net/bluetooth/l2cap_core.c
> index ae78490ecd3d4..82279c5919fd8 100644
> --- a/net/bluetooth/l2cap_core.c
> +++ b/net/bluetooth/l2cap_core.c
> @@ -483,9 +483,7 @@ static void l2cap_chan_destroy(struct kref *kref)
>
>         BT_DBG("chan %p", chan);
>
> -       write_lock(&chan_list_lock);
>         list_del(&chan->global_l);
> -       write_unlock(&chan_list_lock);
>
>         kfree(chan);
>  }
> @@ -501,7 +499,9 @@ void l2cap_chan_put(struct l2cap_chan *c)
>  {
>         BT_DBG("chan %p orig refcnt %u", c, kref_read(&c->kref));
>
> +       write_lock(&chan_list_lock);
>         kref_put(&c->kref, l2cap_chan_destroy);
> +       write_unlock(&chan_list_lock);
>  }
>  EXPORT_SYMBOL_GPL(l2cap_chan_put);
>
> --
> 2.36.1.255.ge46751e96f-goog
>

I do not think this patch is correct.

a kref does not need to be protected by a write lock.

This might shuffle things enough to work around a particular repro you have.

If the patch was correct why not protect kref_get() sides ?

Before the &hdev->rx_work is scheduled (queue_work(hdev->workqueue,
&hdev->rx_work),
a reference must be taken.

Then this reference must be released at the end of hci_rx_work() or
when hdev->workqueue
is canceled.

This refcount is not needed _if_ the workqueue is properly canceled at
device dismantle,
in a synchronous way.

I do not see this hdev->rx_work being canceled, maybe this is the real issue.

There is a call to drain_workqueue() but this is not enough I think,
because hci_recv_frame()
can re-arm
   queue_work(hdev->workqueue, &hdev->rx_work);
Luiz Augusto von Dentz June 27, 2022, 11:39 p.m. UTC | #4
Hi Eric, Lee,

On Mon, Jun 27, 2022 at 7:41 AM Eric Dumazet <edumazet@google.com> wrote:
>
> On Wed, Jun 22, 2022 at 10:27 AM Lee Jones <lee.jones@linaro.org> wrote:
> >
> > This change prevents a use-after-free caused by one of the worker
> > threads starting up (see below) *after* the final channel reference
> > has been put() during sock_close() but *before* the references to the
> > channel have been destroyed.
> >
> >   refcount_t: increment on 0; use-after-free.
> >   BUG: KASAN: use-after-free in refcount_dec_and_test+0x20/0xd0
> >   Read of size 4 at addr ffffffc114f5bf18 by task kworker/u17:14/705
> >
> >   CPU: 4 PID: 705 Comm: kworker/u17:14 Tainted: G S      W       4.14.234-00003-g1fb6d0bd49a4-dirty #28
> >   Hardware name: Qualcomm Technologies, Inc. SM8150 V2 PM8150 Google Inc. MSM sm8150 Flame DVT (DT)
> >   Workqueue: hci0 hci_rx_work
> >   Call trace:
> >    dump_backtrace+0x0/0x378
> >    show_stack+0x20/0x2c
> >    dump_stack+0x124/0x148
> >    print_address_description+0x80/0x2e8
> >    __kasan_report+0x168/0x188
> >    kasan_report+0x10/0x18
> >    __asan_load4+0x84/0x8c
> >    refcount_dec_and_test+0x20/0xd0
> >    l2cap_chan_put+0x48/0x12c
> >    l2cap_recv_frame+0x4770/0x6550
> >    l2cap_recv_acldata+0x44c/0x7a4
> >    hci_acldata_packet+0x100/0x188
> >    hci_rx_work+0x178/0x23c
> >    process_one_work+0x35c/0x95c
> >    worker_thread+0x4cc/0x960
> >    kthread+0x1a8/0x1c4
> >    ret_from_fork+0x10/0x18
> >
> > Cc: stable@kernel.org
>
> When was the bug added ? (Fixes: tag please)
>
> > Cc: Marcel Holtmann <marcel@holtmann.org>
> > Cc: Johan Hedberg <johan.hedberg@gmail.com>
> > Cc: Luiz Augusto von Dentz <luiz.dentz@gmail.com>
> > Cc: "David S. Miller" <davem@davemloft.net>
> > Cc: Eric Dumazet <edumazet@google.com>
> > Cc: Jakub Kicinski <kuba@kernel.org>
> > Cc: Paolo Abeni <pabeni@redhat.com>
> > Cc: linux-bluetooth@vger.kernel.org
> > Cc: netdev@vger.kernel.org
> > Signed-off-by: Lee Jones <lee.jones@linaro.org>
> > ---
> >  net/bluetooth/l2cap_core.c | 4 ++--
> >  1 file changed, 2 insertions(+), 2 deletions(-)
> >
> > diff --git a/net/bluetooth/l2cap_core.c b/net/bluetooth/l2cap_core.c
> > index ae78490ecd3d4..82279c5919fd8 100644
> > --- a/net/bluetooth/l2cap_core.c
> > +++ b/net/bluetooth/l2cap_core.c
> > @@ -483,9 +483,7 @@ static void l2cap_chan_destroy(struct kref *kref)
> >
> >         BT_DBG("chan %p", chan);
> >
> > -       write_lock(&chan_list_lock);
> >         list_del(&chan->global_l);
> > -       write_unlock(&chan_list_lock);
> >
> >         kfree(chan);
> >  }
> > @@ -501,7 +499,9 @@ void l2cap_chan_put(struct l2cap_chan *c)
> >  {
> >         BT_DBG("chan %p orig refcnt %u", c, kref_read(&c->kref));
> >
> > +       write_lock(&chan_list_lock);
> >         kref_put(&c->kref, l2cap_chan_destroy);
> > +       write_unlock(&chan_list_lock);
> >  }
> >  EXPORT_SYMBOL_GPL(l2cap_chan_put);
> >
> > --
> > 2.36.1.255.ge46751e96f-goog
> >
>
> I do not think this patch is correct.
>
> a kref does not need to be protected by a write lock.
>
> This might shuffle things enough to work around a particular repro you have.
>
> If the patch was correct why not protect kref_get() sides ?
>
> Before the &hdev->rx_work is scheduled (queue_work(hdev->workqueue,
> &hdev->rx_work),
> a reference must be taken.
>
> Then this reference must be released at the end of hci_rx_work() or
> when hdev->workqueue
> is canceled.
>
> This refcount is not needed _if_ the workqueue is properly canceled at
> device dismantle,
> in a synchronous way.
>
> I do not see this hdev->rx_work being canceled, maybe this is the real issue.
>
> There is a call to drain_workqueue() but this is not enough I think,
> because hci_recv_frame()
> can re-arm
>    queue_work(hdev->workqueue, &hdev->rx_work);

I suspect this likely a refcount problem, we do l2cap_get_chan_by_scid:

/* Find channel with given SCID.
 * Returns locked channel. */
static struct l2cap_chan *l2cap_get_chan_by_scid(struct l2cap_conn
*conn, u16 cid)

So we return a locked channel but that doesn't prevent another thread
to call l2cap_chan_put which doesn't care about l2cap_chan_lock so
perhaps we actually need to host a reference while we have the lock,
at least we do something like that on l2cap_sock.c:

l2cap_chan_hold(chan);
l2cap_chan_lock(chan);

__clear_chan_timer(chan);
l2cap_chan_close(chan, ECONNRESET);
l2cap_sock_kill(sk);

l2cap_chan_unlock(chan);
l2cap_chan_put(chan);
Luiz Augusto von Dentz June 28, 2022, 6:36 p.m. UTC | #5
Hi Eric, Lee,

On Mon, Jun 27, 2022 at 4:39 PM Luiz Augusto von Dentz
<luiz.dentz@gmail.com> wrote:
>
> Hi Eric, Lee,
>
> On Mon, Jun 27, 2022 at 7:41 AM Eric Dumazet <edumazet@google.com> wrote:
> >
> > On Wed, Jun 22, 2022 at 10:27 AM Lee Jones <lee.jones@linaro.org> wrote:
> > >
> > > This change prevents a use-after-free caused by one of the worker
> > > threads starting up (see below) *after* the final channel reference
> > > has been put() during sock_close() but *before* the references to the
> > > channel have been destroyed.
> > >
> > >   refcount_t: increment on 0; use-after-free.
> > >   BUG: KASAN: use-after-free in refcount_dec_and_test+0x20/0xd0
> > >   Read of size 4 at addr ffffffc114f5bf18 by task kworker/u17:14/705
> > >
> > >   CPU: 4 PID: 705 Comm: kworker/u17:14 Tainted: G S      W       4.14.234-00003-g1fb6d0bd49a4-dirty #28
> > >   Hardware name: Qualcomm Technologies, Inc. SM8150 V2 PM8150 Google Inc. MSM sm8150 Flame DVT (DT)
> > >   Workqueue: hci0 hci_rx_work
> > >   Call trace:
> > >    dump_backtrace+0x0/0x378
> > >    show_stack+0x20/0x2c
> > >    dump_stack+0x124/0x148
> > >    print_address_description+0x80/0x2e8
> > >    __kasan_report+0x168/0x188
> > >    kasan_report+0x10/0x18
> > >    __asan_load4+0x84/0x8c
> > >    refcount_dec_and_test+0x20/0xd0
> > >    l2cap_chan_put+0x48/0x12c
> > >    l2cap_recv_frame+0x4770/0x6550
> > >    l2cap_recv_acldata+0x44c/0x7a4
> > >    hci_acldata_packet+0x100/0x188
> > >    hci_rx_work+0x178/0x23c
> > >    process_one_work+0x35c/0x95c
> > >    worker_thread+0x4cc/0x960
> > >    kthread+0x1a8/0x1c4
> > >    ret_from_fork+0x10/0x18
> > >
> > > Cc: stable@kernel.org
> >
> > When was the bug added ? (Fixes: tag please)
> >
> > > Cc: Marcel Holtmann <marcel@holtmann.org>
> > > Cc: Johan Hedberg <johan.hedberg@gmail.com>
> > > Cc: Luiz Augusto von Dentz <luiz.dentz@gmail.com>
> > > Cc: "David S. Miller" <davem@davemloft.net>
> > > Cc: Eric Dumazet <edumazet@google.com>
> > > Cc: Jakub Kicinski <kuba@kernel.org>
> > > Cc: Paolo Abeni <pabeni@redhat.com>
> > > Cc: linux-bluetooth@vger.kernel.org
> > > Cc: netdev@vger.kernel.org
> > > Signed-off-by: Lee Jones <lee.jones@linaro.org>
> > > ---
> > >  net/bluetooth/l2cap_core.c | 4 ++--
> > >  1 file changed, 2 insertions(+), 2 deletions(-)
> > >
> > > diff --git a/net/bluetooth/l2cap_core.c b/net/bluetooth/l2cap_core.c
> > > index ae78490ecd3d4..82279c5919fd8 100644
> > > --- a/net/bluetooth/l2cap_core.c
> > > +++ b/net/bluetooth/l2cap_core.c
> > > @@ -483,9 +483,7 @@ static void l2cap_chan_destroy(struct kref *kref)
> > >
> > >         BT_DBG("chan %p", chan);
> > >
> > > -       write_lock(&chan_list_lock);
> > >         list_del(&chan->global_l);
> > > -       write_unlock(&chan_list_lock);
> > >
> > >         kfree(chan);
> > >  }
> > > @@ -501,7 +499,9 @@ void l2cap_chan_put(struct l2cap_chan *c)
> > >  {
> > >         BT_DBG("chan %p orig refcnt %u", c, kref_read(&c->kref));
> > >
> > > +       write_lock(&chan_list_lock);
> > >         kref_put(&c->kref, l2cap_chan_destroy);
> > > +       write_unlock(&chan_list_lock);
> > >  }
> > >  EXPORT_SYMBOL_GPL(l2cap_chan_put);
> > >
> > > --
> > > 2.36.1.255.ge46751e96f-goog
> > >
> >
> > I do not think this patch is correct.
> >
> > a kref does not need to be protected by a write lock.
> >
> > This might shuffle things enough to work around a particular repro you have.
> >
> > If the patch was correct why not protect kref_get() sides ?
> >
> > Before the &hdev->rx_work is scheduled (queue_work(hdev->workqueue,
> > &hdev->rx_work),
> > a reference must be taken.
> >
> > Then this reference must be released at the end of hci_rx_work() or
> > when hdev->workqueue
> > is canceled.
> >
> > This refcount is not needed _if_ the workqueue is properly canceled at
> > device dismantle,
> > in a synchronous way.
> >
> > I do not see this hdev->rx_work being canceled, maybe this is the real issue.
> >
> > There is a call to drain_workqueue() but this is not enough I think,
> > because hci_recv_frame()
> > can re-arm
> >    queue_work(hdev->workqueue, &hdev->rx_work);
>
> I suspect this likely a refcount problem, we do l2cap_get_chan_by_scid:
>
> /* Find channel with given SCID.
>  * Returns locked channel. */
> static struct l2cap_chan *l2cap_get_chan_by_scid(struct l2cap_conn
> *conn, u16 cid)
>
> So we return a locked channel but that doesn't prevent another thread
> to call l2cap_chan_put which doesn't care about l2cap_chan_lock so
> perhaps we actually need to host a reference while we have the lock,
> at least we do something like that on l2cap_sock.c:
>
> l2cap_chan_hold(chan);
> l2cap_chan_lock(chan);
>
> __clear_chan_timer(chan);
> l2cap_chan_close(chan, ECONNRESET);
> l2cap_sock_kill(sk);
>
> l2cap_chan_unlock(chan);
> l2cap_chan_put(chan);

Perhaps something like this:

diff --git a/net/bluetooth/l2cap_core.c b/net/bluetooth/l2cap_core.c
index 09ecaf556de5..9050b6af3577 100644
--- a/net/bluetooth/l2cap_core.c
+++ b/net/bluetooth/l2cap_core.c
@@ -111,7 +111,7 @@ static struct l2cap_chan
*__l2cap_get_chan_by_scid(struct l2cap_conn *conn,
 }

 /* Find channel with given SCID.
- * Returns locked channel. */
+ * Returns a reference locked channel. */
 static struct l2cap_chan *l2cap_get_chan_by_scid(struct l2cap_conn *conn,
                                                 u16 cid)
 {
@@ -119,15 +119,17 @@ static struct l2cap_chan
*l2cap_get_chan_by_scid(struct l2cap_conn *conn,

        mutex_lock(&conn->chan_lock);
        c = __l2cap_get_chan_by_scid(conn, cid);
-       if (c)
+       if (c) {
+               l2cap_chan_hold(c);
                l2cap_chan_lock(c);
+       }
        mutex_unlock(&conn->chan_lock);

        return c;
 }

 /* Find channel with given DCID.
- * Returns locked channel.
+ * Returns a reference locked channel.
  */
 static struct l2cap_chan *l2cap_get_chan_by_dcid(struct l2cap_conn *conn,
                                                 u16 cid)
@@ -136,8 +138,10 @@ static struct l2cap_chan
*l2cap_get_chan_by_dcid(struct l2cap_conn *conn,

        mutex_lock(&conn->chan_lock);
        c = __l2cap_get_chan_by_dcid(conn, cid);
-       if (c)
+       if (c) {
+               l2cap_chan_hold(c);
                l2cap_chan_lock(c);
+       }
        mutex_unlock(&conn->chan_lock);

        return c;
@@ -4464,6 +4468,7 @@ static inline int l2cap_config_req(struct
l2cap_conn *conn,

 unlock:
        l2cap_chan_unlock(chan);
+       l2cap_chan_put(chan);
        return err;
 }

@@ -4578,6 +4583,7 @@ static inline int l2cap_config_rsp(struct
l2cap_conn *conn,

 done:
        l2cap_chan_unlock(chan);
+       l2cap_chan_put(chan);
        return err;
 }

@@ -5305,6 +5311,7 @@ static inline int l2cap_move_channel_req(struct
l2cap_conn *conn,
        l2cap_send_move_chan_rsp(chan, result);

        l2cap_chan_unlock(chan);
+       l2cap_chan_put(chan);

        return 0;
 }
@@ -5397,6 +5404,7 @@ static void l2cap_move_continue(struct
l2cap_conn *conn, u16 icid, u16 result)
        }

        l2cap_chan_unlock(chan);
+       l2cap_chan_put(chan);
 }

 static void l2cap_move_fail(struct l2cap_conn *conn, u8 ident, u16 icid,
@@ -5489,6 +5497,7 @@ static int l2cap_move_channel_confirm(struct
l2cap_conn *conn,
        l2cap_send_move_chan_cfm_rsp(conn, cmd->ident, icid);

        l2cap_chan_unlock(chan);
+       l2cap_chan_put(chan);

        return 0;
 }
@@ -5524,6 +5533,7 @@ static inline int
l2cap_move_channel_confirm_rsp(struct l2cap_conn *conn,
        }

        l2cap_chan_unlock(chan);
+       l2cap_chan_put(chan);

        return 0;
 }
@@ -5896,12 +5906,11 @@ static inline int l2cap_le_credits(struct
l2cap_conn *conn,
        if (credits > max_credits) {
                BT_ERR("LE credits overflow");
                l2cap_send_disconn_req(chan, ECONNRESET);
-               l2cap_chan_unlock(chan);

                /* Return 0 so that we don't trigger an unnecessary
                 * command reject packet.
                 */
-               return 0;
+               goto unlock;
        }

        chan->tx_credits += credits;
@@ -5912,7 +5921,9 @@ static inline int l2cap_le_credits(struct
l2cap_conn *conn,
        if (chan->tx_credits)
                chan->ops->resume(chan);

+unlock:
        l2cap_chan_unlock(chan);
+       l2cap_chan_put(chan);

        return 0;
 }
@@ -7598,6 +7609,7 @@ static void l2cap_data_channel(struct l2cap_conn
*conn, u16 cid,

 done:
        l2cap_chan_unlock(chan);
+       l2cap_chan_put(chan);
 }

 static void l2cap_conless_channel(struct l2cap_conn *conn, __le16 psm,


>
> --
> Luiz Augusto von Dentz
Lee Jones June 29, 2022, 3:28 p.m. UTC | #6
On Tue, 28 Jun 2022, Luiz Augusto von Dentz wrote:

> Hi Eric, Lee,
> 
> On Mon, Jun 27, 2022 at 4:39 PM Luiz Augusto von Dentz
> <luiz.dentz@gmail.com> wrote:
> >
> > Hi Eric, Lee,
> >
> > On Mon, Jun 27, 2022 at 7:41 AM Eric Dumazet <edumazet@google.com> wrote:
> > >
> > > On Wed, Jun 22, 2022 at 10:27 AM Lee Jones <lee.jones@linaro.org> wrote:
> > > >
> > > > This change prevents a use-after-free caused by one of the worker
> > > > threads starting up (see below) *after* the final channel reference
> > > > has been put() during sock_close() but *before* the references to the
> > > > channel have been destroyed.
> > > >
> > > >   refcount_t: increment on 0; use-after-free.
> > > >   BUG: KASAN: use-after-free in refcount_dec_and_test+0x20/0xd0
> > > >   Read of size 4 at addr ffffffc114f5bf18 by task kworker/u17:14/705
> > > >
> > > >   CPU: 4 PID: 705 Comm: kworker/u17:14 Tainted: G S      W       4.14.234-00003-g1fb6d0bd49a4-dirty #28
> > > >   Hardware name: Qualcomm Technologies, Inc. SM8150 V2 PM8150 Google Inc. MSM sm8150 Flame DVT (DT)
> > > >   Workqueue: hci0 hci_rx_work
> > > >   Call trace:
> > > >    dump_backtrace+0x0/0x378
> > > >    show_stack+0x20/0x2c
> > > >    dump_stack+0x124/0x148
> > > >    print_address_description+0x80/0x2e8
> > > >    __kasan_report+0x168/0x188
> > > >    kasan_report+0x10/0x18
> > > >    __asan_load4+0x84/0x8c
> > > >    refcount_dec_and_test+0x20/0xd0
> > > >    l2cap_chan_put+0x48/0x12c
> > > >    l2cap_recv_frame+0x4770/0x6550
> > > >    l2cap_recv_acldata+0x44c/0x7a4
> > > >    hci_acldata_packet+0x100/0x188
> > > >    hci_rx_work+0x178/0x23c
> > > >    process_one_work+0x35c/0x95c
> > > >    worker_thread+0x4cc/0x960
> > > >    kthread+0x1a8/0x1c4
> > > >    ret_from_fork+0x10/0x18
> > > >
> > > > Cc: stable@kernel.org
> > >
> > > When was the bug added ? (Fixes: tag please)
> > >
> > > > Cc: Marcel Holtmann <marcel@holtmann.org>
> > > > Cc: Johan Hedberg <johan.hedberg@gmail.com>
> > > > Cc: Luiz Augusto von Dentz <luiz.dentz@gmail.com>
> > > > Cc: "David S. Miller" <davem@davemloft.net>
> > > > Cc: Eric Dumazet <edumazet@google.com>
> > > > Cc: Jakub Kicinski <kuba@kernel.org>
> > > > Cc: Paolo Abeni <pabeni@redhat.com>
> > > > Cc: linux-bluetooth@vger.kernel.org
> > > > Cc: netdev@vger.kernel.org
> > > > Signed-off-by: Lee Jones <lee.jones@linaro.org>
> > > > ---
> > > >  net/bluetooth/l2cap_core.c | 4 ++--
> > > >  1 file changed, 2 insertions(+), 2 deletions(-)
> > > >
> > > > diff --git a/net/bluetooth/l2cap_core.c b/net/bluetooth/l2cap_core.c
> > > > index ae78490ecd3d4..82279c5919fd8 100644
> > > > --- a/net/bluetooth/l2cap_core.c
> > > > +++ b/net/bluetooth/l2cap_core.c
> > > > @@ -483,9 +483,7 @@ static void l2cap_chan_destroy(struct kref *kref)
> > > >
> > > >         BT_DBG("chan %p", chan);
> > > >
> > > > -       write_lock(&chan_list_lock);
> > > >         list_del(&chan->global_l);
> > > > -       write_unlock(&chan_list_lock);
> > > >
> > > >         kfree(chan);
> > > >  }
> > > > @@ -501,7 +499,9 @@ void l2cap_chan_put(struct l2cap_chan *c)
> > > >  {
> > > >         BT_DBG("chan %p orig refcnt %u", c, kref_read(&c->kref));
> > > >
> > > > +       write_lock(&chan_list_lock);
> > > >         kref_put(&c->kref, l2cap_chan_destroy);
> > > > +       write_unlock(&chan_list_lock);
> > > >  }
> > > >  EXPORT_SYMBOL_GPL(l2cap_chan_put);
> > > >
> > > >
> > >
> > > I do not think this patch is correct.
> > >
> > > a kref does not need to be protected by a write lock.
> > >
> > > This might shuffle things enough to work around a particular repro you have.
> > >
> > > If the patch was correct why not protect kref_get() sides ?
> > >
> > > Before the &hdev->rx_work is scheduled (queue_work(hdev->workqueue,
> > > &hdev->rx_work),
> > > a reference must be taken.
> > >
> > > Then this reference must be released at the end of hci_rx_work() or
> > > when hdev->workqueue
> > > is canceled.
> > >
> > > This refcount is not needed _if_ the workqueue is properly canceled at
> > > device dismantle,
> > > in a synchronous way.
> > >
> > > I do not see this hdev->rx_work being canceled, maybe this is the real issue.
> > >
> > > There is a call to drain_workqueue() but this is not enough I think,
> > > because hci_recv_frame()
> > > can re-arm
> > >    queue_work(hdev->workqueue, &hdev->rx_work);
> >
> > I suspect this likely a refcount problem, we do l2cap_get_chan_by_scid:
> >
> > /* Find channel with given SCID.
> >  * Returns locked channel. */
> > static struct l2cap_chan *l2cap_get_chan_by_scid(struct l2cap_conn
> > *conn, u16 cid)
> >
> > So we return a locked channel but that doesn't prevent another thread
> > to call l2cap_chan_put which doesn't care about l2cap_chan_lock so
> > perhaps we actually need to host a reference while we have the lock,
> > at least we do something like that on l2cap_sock.c:
> >
> > l2cap_chan_hold(chan);
> > l2cap_chan_lock(chan);
> >
> > __clear_chan_timer(chan);
> > l2cap_chan_close(chan, ECONNRESET);
> > l2cap_sock_kill(sk);
> >
> > l2cap_chan_unlock(chan);
> > l2cap_chan_put(chan);
> 
> Perhaps something like this:

I'm struggling to apply this for test:

  "error: corrupt patch at line 6"

Please could you attach a `git format-patch`ed patch instead?

> diff --git a/net/bluetooth/l2cap_core.c b/net/bluetooth/l2cap_core.c
> index 09ecaf556de5..9050b6af3577 100644
> --- a/net/bluetooth/l2cap_core.c
> +++ b/net/bluetooth/l2cap_core.c
> @@ -111,7 +111,7 @@ static struct l2cap_chan
> *__l2cap_get_chan_by_scid(struct l2cap_conn *conn,
>  }
> 
>  /* Find channel with given SCID.
> - * Returns locked channel. */
> + * Returns a reference locked channel. */
>  static struct l2cap_chan *l2cap_get_chan_by_scid(struct l2cap_conn *conn,
>                                                  u16 cid)
>  {
> @@ -119,15 +119,17 @@ static struct l2cap_chan
> *l2cap_get_chan_by_scid(struct l2cap_conn *conn,
> 
>         mutex_lock(&conn->chan_lock);
>         c = __l2cap_get_chan_by_scid(conn, cid);
> -       if (c)
> +       if (c) {
> +               l2cap_chan_hold(c);
>                 l2cap_chan_lock(c);
> +       }
>         mutex_unlock(&conn->chan_lock);
> 
>         return c;
>  }
> 
>  /* Find channel with given DCID.
> - * Returns locked channel.
> + * Returns a reference locked channel.
>   */
>  static struct l2cap_chan *l2cap_get_chan_by_dcid(struct l2cap_conn *conn,
>                                                  u16 cid)
> @@ -136,8 +138,10 @@ static struct l2cap_chan
> *l2cap_get_chan_by_dcid(struct l2cap_conn *conn,
> 
>         mutex_lock(&conn->chan_lock);
>         c = __l2cap_get_chan_by_dcid(conn, cid);
> -       if (c)
> +       if (c) {
> +               l2cap_chan_hold(c);
>                 l2cap_chan_lock(c);
> +       }
>         mutex_unlock(&conn->chan_lock);
> 
>         return c;
> @@ -4464,6 +4468,7 @@ static inline int l2cap_config_req(struct
> l2cap_conn *conn,
> 
>  unlock:
>         l2cap_chan_unlock(chan);
> +       l2cap_chan_put(chan);
>         return err;
>  }
> 
> @@ -4578,6 +4583,7 @@ static inline int l2cap_config_rsp(struct
> l2cap_conn *conn,
> 
>  done:
>         l2cap_chan_unlock(chan);
> +       l2cap_chan_put(chan);
>         return err;
>  }
> 
> @@ -5305,6 +5311,7 @@ static inline int l2cap_move_channel_req(struct
> l2cap_conn *conn,
>         l2cap_send_move_chan_rsp(chan, result);
> 
>         l2cap_chan_unlock(chan);
> +       l2cap_chan_put(chan);
> 
>         return 0;
>  }
> @@ -5397,6 +5404,7 @@ static void l2cap_move_continue(struct
> l2cap_conn *conn, u16 icid, u16 result)
>         }
> 
>         l2cap_chan_unlock(chan);
> +       l2cap_chan_put(chan);
>  }
> 
>  static void l2cap_move_fail(struct l2cap_conn *conn, u8 ident, u16 icid,
> @@ -5489,6 +5497,7 @@ static int l2cap_move_channel_confirm(struct
> l2cap_conn *conn,
>         l2cap_send_move_chan_cfm_rsp(conn, cmd->ident, icid);
> 
>         l2cap_chan_unlock(chan);
> +       l2cap_chan_put(chan);
> 
>         return 0;
>  }
> @@ -5524,6 +5533,7 @@ static inline int
> l2cap_move_channel_confirm_rsp(struct l2cap_conn *conn,
>         }
> 
>         l2cap_chan_unlock(chan);
> +       l2cap_chan_put(chan);
> 
>         return 0;
>  }
> @@ -5896,12 +5906,11 @@ static inline int l2cap_le_credits(struct
> l2cap_conn *conn,
>         if (credits > max_credits) {
>                 BT_ERR("LE credits overflow");
>                 l2cap_send_disconn_req(chan, ECONNRESET);
> -               l2cap_chan_unlock(chan);
> 
>                 /* Return 0 so that we don't trigger an unnecessary
>                  * command reject packet.
>                  */
> -               return 0;
> +               goto unlock;
>         }
> 
>         chan->tx_credits += credits;
> @@ -5912,7 +5921,9 @@ static inline int l2cap_le_credits(struct
> l2cap_conn *conn,
>         if (chan->tx_credits)
>                 chan->ops->resume(chan);
> 
> +unlock:
>         l2cap_chan_unlock(chan);
> +       l2cap_chan_put(chan);
> 
>         return 0;
>  }
> @@ -7598,6 +7609,7 @@ static void l2cap_data_channel(struct l2cap_conn
> *conn, u16 cid,
> 
>  done:
>         l2cap_chan_unlock(chan);
> +       l2cap_chan_put(chan);
>  }
> 
>  static void l2cap_conless_channel(struct l2cap_conn *conn, __le16 psm,
> 
> 
> >
> 
> 
>
Luiz Augusto von Dentz July 5, 2022, 5:21 p.m. UTC | #7
Hi Lee,

On Wed, Jun 29, 2022 at 8:28 AM Lee Jones <lee.jones@linaro.org> wrote:
>
> On Tue, 28 Jun 2022, Luiz Augusto von Dentz wrote:
>
> > Hi Eric, Lee,
> >
> > On Mon, Jun 27, 2022 at 4:39 PM Luiz Augusto von Dentz
> > <luiz.dentz@gmail.com> wrote:
> > >
> > > Hi Eric, Lee,
> > >
> > > On Mon, Jun 27, 2022 at 7:41 AM Eric Dumazet <edumazet@google.com> wrote:
> > > >
> > > > On Wed, Jun 22, 2022 at 10:27 AM Lee Jones <lee.jones@linaro.org> wrote:
> > > > >
> > > > > This change prevents a use-after-free caused by one of the worker
> > > > > threads starting up (see below) *after* the final channel reference
> > > > > has been put() during sock_close() but *before* the references to the
> > > > > channel have been destroyed.
> > > > >
> > > > >   refcount_t: increment on 0; use-after-free.
> > > > >   BUG: KASAN: use-after-free in refcount_dec_and_test+0x20/0xd0
> > > > >   Read of size 4 at addr ffffffc114f5bf18 by task kworker/u17:14/705
> > > > >
> > > > >   CPU: 4 PID: 705 Comm: kworker/u17:14 Tainted: G S      W       4.14.234-00003-g1fb6d0bd49a4-dirty #28
> > > > >   Hardware name: Qualcomm Technologies, Inc. SM8150 V2 PM8150 Google Inc. MSM sm8150 Flame DVT (DT)
> > > > >   Workqueue: hci0 hci_rx_work
> > > > >   Call trace:
> > > > >    dump_backtrace+0x0/0x378
> > > > >    show_stack+0x20/0x2c
> > > > >    dump_stack+0x124/0x148
> > > > >    print_address_description+0x80/0x2e8
> > > > >    __kasan_report+0x168/0x188
> > > > >    kasan_report+0x10/0x18
> > > > >    __asan_load4+0x84/0x8c
> > > > >    refcount_dec_and_test+0x20/0xd0
> > > > >    l2cap_chan_put+0x48/0x12c
> > > > >    l2cap_recv_frame+0x4770/0x6550
> > > > >    l2cap_recv_acldata+0x44c/0x7a4
> > > > >    hci_acldata_packet+0x100/0x188
> > > > >    hci_rx_work+0x178/0x23c
> > > > >    process_one_work+0x35c/0x95c
> > > > >    worker_thread+0x4cc/0x960
> > > > >    kthread+0x1a8/0x1c4
> > > > >    ret_from_fork+0x10/0x18
> > > > >
> > > > > Cc: stable@kernel.org
> > > >
> > > > When was the bug added ? (Fixes: tag please)
> > > >
> > > > > Cc: Marcel Holtmann <marcel@holtmann.org>
> > > > > Cc: Johan Hedberg <johan.hedberg@gmail.com>
> > > > > Cc: Luiz Augusto von Dentz <luiz.dentz@gmail.com>
> > > > > Cc: "David S. Miller" <davem@davemloft.net>
> > > > > Cc: Eric Dumazet <edumazet@google.com>
> > > > > Cc: Jakub Kicinski <kuba@kernel.org>
> > > > > Cc: Paolo Abeni <pabeni@redhat.com>
> > > > > Cc: linux-bluetooth@vger.kernel.org
> > > > > Cc: netdev@vger.kernel.org
> > > > > Signed-off-by: Lee Jones <lee.jones@linaro.org>
> > > > > ---
> > > > >  net/bluetooth/l2cap_core.c | 4 ++--
> > > > >  1 file changed, 2 insertions(+), 2 deletions(-)
> > > > >
> > > > > diff --git a/net/bluetooth/l2cap_core.c b/net/bluetooth/l2cap_core.c
> > > > > index ae78490ecd3d4..82279c5919fd8 100644
> > > > > --- a/net/bluetooth/l2cap_core.c
> > > > > +++ b/net/bluetooth/l2cap_core.c
> > > > > @@ -483,9 +483,7 @@ static void l2cap_chan_destroy(struct kref *kref)
> > > > >
> > > > >         BT_DBG("chan %p", chan);
> > > > >
> > > > > -       write_lock(&chan_list_lock);
> > > > >         list_del(&chan->global_l);
> > > > > -       write_unlock(&chan_list_lock);
> > > > >
> > > > >         kfree(chan);
> > > > >  }
> > > > > @@ -501,7 +499,9 @@ void l2cap_chan_put(struct l2cap_chan *c)
> > > > >  {
> > > > >         BT_DBG("chan %p orig refcnt %u", c, kref_read(&c->kref));
> > > > >
> > > > > +       write_lock(&chan_list_lock);
> > > > >         kref_put(&c->kref, l2cap_chan_destroy);
> > > > > +       write_unlock(&chan_list_lock);
> > > > >  }
> > > > >  EXPORT_SYMBOL_GPL(l2cap_chan_put);
> > > > >
> > > > >
> > > >
> > > > I do not think this patch is correct.
> > > >
> > > > a kref does not need to be protected by a write lock.
> > > >
> > > > This might shuffle things enough to work around a particular repro you have.
> > > >
> > > > If the patch was correct why not protect kref_get() sides ?
> > > >
> > > > Before the &hdev->rx_work is scheduled (queue_work(hdev->workqueue,
> > > > &hdev->rx_work),
> > > > a reference must be taken.
> > > >
> > > > Then this reference must be released at the end of hci_rx_work() or
> > > > when hdev->workqueue
> > > > is canceled.
> > > >
> > > > This refcount is not needed _if_ the workqueue is properly canceled at
> > > > device dismantle,
> > > > in a synchronous way.
> > > >
> > > > I do not see this hdev->rx_work being canceled, maybe this is the real issue.
> > > >
> > > > There is a call to drain_workqueue() but this is not enough I think,
> > > > because hci_recv_frame()
> > > > can re-arm
> > > >    queue_work(hdev->workqueue, &hdev->rx_work);
> > >
> > > I suspect this likely a refcount problem, we do l2cap_get_chan_by_scid:
> > >
> > > /* Find channel with given SCID.
> > >  * Returns locked channel. */
> > > static struct l2cap_chan *l2cap_get_chan_by_scid(struct l2cap_conn
> > > *conn, u16 cid)
> > >
> > > So we return a locked channel but that doesn't prevent another thread
> > > to call l2cap_chan_put which doesn't care about l2cap_chan_lock so
> > > perhaps we actually need to host a reference while we have the lock,
> > > at least we do something like that on l2cap_sock.c:
> > >
> > > l2cap_chan_hold(chan);
> > > l2cap_chan_lock(chan);
> > >
> > > __clear_chan_timer(chan);
> > > l2cap_chan_close(chan, ECONNRESET);
> > > l2cap_sock_kill(sk);
> > >
> > > l2cap_chan_unlock(chan);
> > > l2cap_chan_put(chan);
> >
> > Perhaps something like this:
>
> I'm struggling to apply this for test:
>
>   "error: corrupt patch at line 6"

Check with the attached patch.
Lee Jones July 6, 2022, 10:53 a.m. UTC | #8
On Tue, 05 Jul 2022, Luiz Augusto von Dentz wrote:

> Hi Lee,
> 
> On Wed, Jun 29, 2022 at 8:28 AM Lee Jones <lee.jones@linaro.org> wrote:
> >
> > On Tue, 28 Jun 2022, Luiz Augusto von Dentz wrote:
> >
> > > Hi Eric, Lee,
> > >
> > > On Mon, Jun 27, 2022 at 4:39 PM Luiz Augusto von Dentz
> > > <luiz.dentz@gmail.com> wrote:
> > > >
> > > > Hi Eric, Lee,
> > > >
> > > > On Mon, Jun 27, 2022 at 7:41 AM Eric Dumazet <edumazet@google.com> wrote:
> > > > >
> > > > > On Wed, Jun 22, 2022 at 10:27 AM Lee Jones <lee.jones@linaro.org> wrote:
> > > > > >
> > > > > > This change prevents a use-after-free caused by one of the worker
> > > > > > threads starting up (see below) *after* the final channel reference
> > > > > > has been put() during sock_close() but *before* the references to the
> > > > > > channel have been destroyed.
> > > > > >
> > > > > >   refcount_t: increment on 0; use-after-free.
> > > > > >   BUG: KASAN: use-after-free in refcount_dec_and_test+0x20/0xd0
> > > > > >   Read of size 4 at addr ffffffc114f5bf18 by task kworker/u17:14/705
> > > > > >
> > > > > >   CPU: 4 PID: 705 Comm: kworker/u17:14 Tainted: G S      W       4.14.234-00003-g1fb6d0bd49a4-dirty #28
> > > > > >   Hardware name: Qualcomm Technologies, Inc. SM8150 V2 PM8150 Google Inc. MSM sm8150 Flame DVT (DT)
> > > > > >   Workqueue: hci0 hci_rx_work
> > > > > >   Call trace:
> > > > > >    dump_backtrace+0x0/0x378
> > > > > >    show_stack+0x20/0x2c
> > > > > >    dump_stack+0x124/0x148
> > > > > >    print_address_description+0x80/0x2e8
> > > > > >    __kasan_report+0x168/0x188
> > > > > >    kasan_report+0x10/0x18
> > > > > >    __asan_load4+0x84/0x8c
> > > > > >    refcount_dec_and_test+0x20/0xd0
> > > > > >    l2cap_chan_put+0x48/0x12c
> > > > > >    l2cap_recv_frame+0x4770/0x6550
> > > > > >    l2cap_recv_acldata+0x44c/0x7a4
> > > > > >    hci_acldata_packet+0x100/0x188
> > > > > >    hci_rx_work+0x178/0x23c
> > > > > >    process_one_work+0x35c/0x95c
> > > > > >    worker_thread+0x4cc/0x960
> > > > > >    kthread+0x1a8/0x1c4
> > > > > >    ret_from_fork+0x10/0x18
> > > > > >
> > > > > > Cc: stable@kernel.org
> > > > >
> > > > > When was the bug added ? (Fixes: tag please)
> > > > >
> > > > > > Cc: Marcel Holtmann <marcel@holtmann.org>
> > > > > > Cc: Johan Hedberg <johan.hedberg@gmail.com>
> > > > > > Cc: Luiz Augusto von Dentz <luiz.dentz@gmail.com>
> > > > > > Cc: "David S. Miller" <davem@davemloft.net>
> > > > > > Cc: Eric Dumazet <edumazet@google.com>
> > > > > > Cc: Jakub Kicinski <kuba@kernel.org>
> > > > > > Cc: Paolo Abeni <pabeni@redhat.com>
> > > > > > Cc: linux-bluetooth@vger.kernel.org
> > > > > > Cc: netdev@vger.kernel.org
> > > > > > Signed-off-by: Lee Jones <lee.jones@linaro.org>
> > > > > > ---
> > > > > >  net/bluetooth/l2cap_core.c | 4 ++--
> > > > > >  1 file changed, 2 insertions(+), 2 deletions(-)
> > > > > >
> > > > > > diff --git a/net/bluetooth/l2cap_core.c b/net/bluetooth/l2cap_core.c
> > > > > > index ae78490ecd3d4..82279c5919fd8 100644
> > > > > > --- a/net/bluetooth/l2cap_core.c
> > > > > > +++ b/net/bluetooth/l2cap_core.c
> > > > > > @@ -483,9 +483,7 @@ static void l2cap_chan_destroy(struct kref *kref)
> > > > > >
> > > > > >         BT_DBG("chan %p", chan);
> > > > > >
> > > > > > -       write_lock(&chan_list_lock);
> > > > > >         list_del(&chan->global_l);
> > > > > > -       write_unlock(&chan_list_lock);
> > > > > >
> > > > > >         kfree(chan);
> > > > > >  }
> > > > > > @@ -501,7 +499,9 @@ void l2cap_chan_put(struct l2cap_chan *c)
> > > > > >  {
> > > > > >         BT_DBG("chan %p orig refcnt %u", c, kref_read(&c->kref));
> > > > > >
> > > > > > +       write_lock(&chan_list_lock);
> > > > > >         kref_put(&c->kref, l2cap_chan_destroy);
> > > > > > +       write_unlock(&chan_list_lock);
> > > > > >  }
> > > > > >  EXPORT_SYMBOL_GPL(l2cap_chan_put);
> > > > > >
> > > > > >
> > > > >
> > > > > I do not think this patch is correct.
> > > > >
> > > > > a kref does not need to be protected by a write lock.
> > > > >
> > > > > This might shuffle things enough to work around a particular repro you have.
> > > > >
> > > > > If the patch was correct why not protect kref_get() sides ?
> > > > >
> > > > > Before the &hdev->rx_work is scheduled (queue_work(hdev->workqueue,
> > > > > &hdev->rx_work),
> > > > > a reference must be taken.
> > > > >
> > > > > Then this reference must be released at the end of hci_rx_work() or
> > > > > when hdev->workqueue
> > > > > is canceled.
> > > > >
> > > > > This refcount is not needed _if_ the workqueue is properly canceled at
> > > > > device dismantle,
> > > > > in a synchronous way.
> > > > >
> > > > > I do not see this hdev->rx_work being canceled, maybe this is the real issue.
> > > > >
> > > > > There is a call to drain_workqueue() but this is not enough I think,
> > > > > because hci_recv_frame()
> > > > > can re-arm
> > > > >    queue_work(hdev->workqueue, &hdev->rx_work);
> > > >
> > > > I suspect this likely a refcount problem, we do l2cap_get_chan_by_scid:
> > > >
> > > > /* Find channel with given SCID.
> > > >  * Returns locked channel. */
> > > > static struct l2cap_chan *l2cap_get_chan_by_scid(struct l2cap_conn
> > > > *conn, u16 cid)
> > > >
> > > > So we return a locked channel but that doesn't prevent another thread
> > > > to call l2cap_chan_put which doesn't care about l2cap_chan_lock so
> > > > perhaps we actually need to host a reference while we have the lock,
> > > > at least we do something like that on l2cap_sock.c:
> > > >
> > > > l2cap_chan_hold(chan);
> > > > l2cap_chan_lock(chan);
> > > >
> > > > __clear_chan_timer(chan);
> > > > l2cap_chan_close(chan, ECONNRESET);
> > > > l2cap_sock_kill(sk);
> > > >
> > > > l2cap_chan_unlock(chan);
> > > > l2cap_chan_put(chan);
> > >
> > > Perhaps something like this:
> >
> > I'm struggling to apply this for test:
> >
> >   "error: corrupt patch at line 6"
> 
> Check with the attached patch.

With the patch applied:

[  188.825418][   T75] refcount_t: addition on 0; use-after-free.
[  188.825418][   T75] refcount_t: addition on 0; use-after-free.
[  188.840629][   T75] WARNING: CPU: 5 PID: 75 at lib/refcount.c:25 refcount_warn_saturate+0x147/0x1b0
[  188.840629][   T75] WARNING: CPU: 5 PID: 75 at lib/refcount.c:25 refcount_warn_saturate+0x147/0x1b0
[  188.862642][   T75] Modules linked in:
[  188.862642][   T75] Modules linked in:
[  188.871686][   T75] CPU: 5 PID: 75 Comm: kworker/u17:0 Not tainted 5.18.0-00005-gc3401a7ad65f #8
[  188.871686][   T75] CPU: 5 PID: 75 Comm: kworker/u17:0 Not tainted 5.18.0-00005-gc3401a7ad65f #8
[  188.892806][   T75] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.16.0-debian-1.16.0-4 04/01/2014
[  188.892806][   T75] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.16.0-debian-1.16.0-4 04/01/2014
[  188.917398][   T75] Workqueue: hci0 hci_rx_work
[  188.917398][   T75] Workqueue: hci0 hci_rx_work
[  188.928515][   T75] RIP: 0010:refcount_warn_saturate+0x147/0x1b0
[  188.928515][   T75] RIP: 0010:refcount_warn_saturate+0x147/0x1b0
[  188.943176][   T75] Code: c7 e0 a1 70 85 31 c0 e8 d7 c2 e8 fe 0f 0b eb a1 e8 fe 33 15 ff c6 05 f9 e2 a5 04 01 48 c7 c7 80 a2 70 80
[  188.943176][   T75] Code: c7 e0 a1 70 85 31 c0 e8 d7 c2 e8 fe 0f 0b eb a1 e8 fe 33 15 ff c6 05 f9 e2 a5 04 01 48 c7 c7 80 a2 70 80
[  188.990053][   T75] RSP: 0018:ffffc9000156f800 EFLAGS: 00010246
[  188.990053][   T75] RSP: 0018:ffffc9000156f800 EFLAGS: 00010246
[  189.004337][   T75] RAX: 118d918bf1a47e00 RBX: 0000000000000002 RCX: ffff8881130ce000
[  189.004337][   T75] RAX: 118d918bf1a47e00 RBX: 0000000000000002 RCX: ffff8881130ce000
[  189.023131][   T75] RDX: 0000000000000000 RSI: 0000000000000001 RDI: 0000000000000000
[  189.023131][   T75] RDX: 0000000000000000 RSI: 0000000000000001 RDI: 0000000000000000
[  189.042044][   T75] RBP: ffffc9000156f810 R08: ffffffff81574218 R09: ffffed107de165d1
[  189.042044][   T75] RBP: ffffc9000156f810 R08: ffffffff81574218 R09: ffffed107de165d1
[  189.060967][   T75] R10: ffffed107de165d1 R11: 0000000000000000 R12: 1ffff920002adf10
[  189.060967][   T75] R10: ffffed107de165d1 R11: 0000000000000000 R12: 1ffff920002adf10
[  189.079650][   T75] R13: dffffc0000000000 R14: 0000000000000002 R15: ffff888139224818
[  189.079650][   T75] R13: dffffc0000000000 R14: 0000000000000002 R15: ffff888139224818
[  189.098573][   T75] FS:  0000000000000000(0000) GS:ffff8883ef080000(0000) knlGS:0000000000000000
[  189.098573][   T75] FS:  0000000000000000(0000) GS:ffff8883ef080000(0000) knlGS:0000000000000000
[  189.119604][   T75] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  189.119604][   T75] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  189.135313][   T75] CR2: 00000000004a2e98 CR3: 000000000680f000 CR4: 0000000000350ea0
[  189.135313][   T75] CR2: 00000000004a2e98 CR3: 000000000680f000 CR4: 0000000000350ea0
[  189.154165][   T75] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[  189.154165][   T75] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[  189.173465][   T75] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[  189.173465][   T75] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[  189.192075][   T75] Call Trace:
[  189.192075][   T75] Call Trace:
[  189.199853][   T75]  <TASK>
[  189.199853][   T75]  <TASK>
[  189.206820][   T75]  l2cap_global_chan_by_psm+0x55a/0x5a0
[  189.206820][   T75]  l2cap_global_chan_by_psm+0x55a/0x5a0
[  189.219891][   T75]  ? l2cap_connect+0x12c0/0x12c0
[  189.219891][   T75]  ? l2cap_connect+0x12c0/0x12c0
[  189.231602][   T75]  ? __kfree_skb+0x13e/0x1c0
[  189.231602][   T75]  ? __kfree_skb+0x13e/0x1c0
[  189.242612][   T75]  ? l2cap_recv_frame+0xf5c/0x95c0
[  189.242612][   T75]  ? l2cap_recv_frame+0xf5c/0x95c0
[  189.254714][   T75]  ? skb_pull+0xde/0x150
[  189.254714][   T75]  ? skb_pull+0xde/0x150
[  189.264776][   T75]  l2cap_recv_frame+0x5bd/0x95c0
[  189.264776][   T75]  l2cap_recv_frame+0x5bd/0x95c0
[  189.276329][   T75]  ? debug_smp_processor_id+0x1c/0x20
[  189.276329][   T75]  ? debug_smp_processor_id+0x1c/0x20
[  189.288985][   T75]  ? update_cfs_rq_load_avg+0x412/0x4f0
[  189.288985][   T75]  ? update_cfs_rq_load_avg+0x412/0x4f0
[  189.302202][   T75]  ? l2cap_recv_acldata+0x1a60/0x1a60
[  189.302202][   T75]  ? l2cap_recv_acldata+0x1a60/0x1a60
[  189.314998][   T75]  ? __kasan_check_write+0x14/0x20
[  189.314998][   T75]  ? __kasan_check_write+0x14/0x20
[  189.326877][   T75]  ? __switch_to+0x617/0x1060
[  189.326877][   T75]  ? __switch_to+0x617/0x1060
[  189.337858][   T75]  ? __kasan_check_write+0x14/0x20
[  189.337858][   T75]  ? __kasan_check_write+0x14/0x20
[  189.349712][   T75]  ? _raw_spin_lock_irqsave+0xdc/0x1f0
[  189.349712][   T75]  ? _raw_spin_lock_irqsave+0xdc/0x1f0
[  189.362793][   T75]  ? compat_start_thread+0x20/0x20
[  189.362793][   T75]  ? compat_start_thread+0x20/0x20
[  189.375316][   T75]  l2cap_recv_acldata+0x5c9/0x1a60
[  189.375316][   T75]  l2cap_recv_acldata+0x5c9/0x1a60
[  189.387262][   T75]  ? hci_connect_sco+0x9b0/0x9b0
[  189.387262][   T75]  ? hci_connect_sco+0x9b0/0x9b0
[  189.398857][   T75]  hci_rx_work+0x54b/0x750
[  189.398857][   T75]  hci_rx_work+0x54b/0x750
[  189.409172][   T75]  process_one_work+0x6eb/0x1080
[  189.409172][   T75]  process_one_work+0x6eb/0x1080
[  189.420829][   T75]  worker_thread+0xb2b/0x13d0
[  189.420829][   T75]  worker_thread+0xb2b/0x13d0
[  189.431922][   T75]  kthread+0x2b1/0x2d0
[  189.431922][   T75]  kthread+0x2b1/0x2d0
[  189.441724][   T75]  ? process_one_work+0x1080/0x1080
[  189.441724][   T75]  ? process_one_work+0x1080/0x1080
[  189.454559][   T75]  ? kthread_blkcg+0xd0/0xd0
[  189.454559][   T75]  ? kthread_blkcg+0xd0/0xd0
[  189.465457][   T75]  ret_from_fork+0x1f/0x30
[  189.465457][   T75]  ret_from_fork+0x1f/0x30
[  189.475777][   T75]  </TASK>
[  189.475777][   T75]  </TASK>
[  189.482907][   T75] ---[ end trace 0000000000000000 ]---
[  189.482907][   T75] ---[ end trace 0000000000000000 ]---
[  189.496567][   T75] ------------[ cut here ]------------
[  189.496567][   T75] ------------[ cut here ]------------
[  189.510250][   T75] refcount_t: underflow; use-after-free.
[  189.510250][   T75] refcount_t: underflow; use-after-free.

> From 88cf6b4f2b0c9ed0bd7ef3b0d38574b412264609 Mon Sep 17 00:00:00 2001
> From: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
> Date: Tue, 28 Jun 2022 15:46:04 -0700
> Subject: [PATCH] Bluetooth: L2CAP: WIP
> 
> Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
> ---
>  net/bluetooth/l2cap_core.c | 25 +++++++++++++++++++------
>  1 file changed, 19 insertions(+), 6 deletions(-)
> 
> diff --git a/net/bluetooth/l2cap_core.c b/net/bluetooth/l2cap_core.c
> index 09ecaf556de5..359fb1ce4372 100644
> --- a/net/bluetooth/l2cap_core.c
> +++ b/net/bluetooth/l2cap_core.c
> @@ -111,7 +111,8 @@ static struct l2cap_chan *__l2cap_get_chan_by_scid(struct l2cap_conn *conn,
>  }
>  
>  /* Find channel with given SCID.
> - * Returns locked channel. */
> + * Returns a reference locked channel.
> + */
>  static struct l2cap_chan *l2cap_get_chan_by_scid(struct l2cap_conn *conn,
>  						 u16 cid)
>  {
> @@ -119,15 +120,17 @@ static struct l2cap_chan *l2cap_get_chan_by_scid(struct l2cap_conn *conn,
>  
>  	mutex_lock(&conn->chan_lock);
>  	c = __l2cap_get_chan_by_scid(conn, cid);
> -	if (c)
> +	if (c) {
> +		l2cap_chan_hold(c);
>  		l2cap_chan_lock(c);
> +	}
>  	mutex_unlock(&conn->chan_lock);
>  
>  	return c;
>  }
>  
>  /* Find channel with given DCID.
> - * Returns locked channel.
> + * Returns a reference locked channel.
>   */
>  static struct l2cap_chan *l2cap_get_chan_by_dcid(struct l2cap_conn *conn,
>  						 u16 cid)
> @@ -136,8 +139,10 @@ static struct l2cap_chan *l2cap_get_chan_by_dcid(struct l2cap_conn *conn,
>  
>  	mutex_lock(&conn->chan_lock);
>  	c = __l2cap_get_chan_by_dcid(conn, cid);
> -	if (c)
> +	if (c) {
> +		l2cap_chan_hold(c);
>  		l2cap_chan_lock(c);
> +	}
>  	mutex_unlock(&conn->chan_lock);
>  
>  	return c;
> @@ -4464,6 +4469,7 @@ static inline int l2cap_config_req(struct l2cap_conn *conn,
>  
>  unlock:
>  	l2cap_chan_unlock(chan);
> +	l2cap_chan_put(chan);
>  	return err;
>  }
>  
> @@ -4578,6 +4584,7 @@ static inline int l2cap_config_rsp(struct l2cap_conn *conn,
>  
>  done:
>  	l2cap_chan_unlock(chan);
> +	l2cap_chan_put(chan);
>  	return err;
>  }
>  
> @@ -5305,6 +5312,7 @@ static inline int l2cap_move_channel_req(struct l2cap_conn *conn,
>  	l2cap_send_move_chan_rsp(chan, result);
>  
>  	l2cap_chan_unlock(chan);
> +	l2cap_chan_put(chan);
>  
>  	return 0;
>  }
> @@ -5397,6 +5405,7 @@ static void l2cap_move_continue(struct l2cap_conn *conn, u16 icid, u16 result)
>  	}
>  
>  	l2cap_chan_unlock(chan);
> +	l2cap_chan_put(chan);
>  }
>  
>  static void l2cap_move_fail(struct l2cap_conn *conn, u8 ident, u16 icid,
> @@ -5489,6 +5498,7 @@ static int l2cap_move_channel_confirm(struct l2cap_conn *conn,
>  	l2cap_send_move_chan_cfm_rsp(conn, cmd->ident, icid);
>  
>  	l2cap_chan_unlock(chan);
> +	l2cap_chan_put(chan);
>  
>  	return 0;
>  }
> @@ -5524,6 +5534,7 @@ static inline int l2cap_move_channel_confirm_rsp(struct l2cap_conn *conn,
>  	}
>  
>  	l2cap_chan_unlock(chan);
> +	l2cap_chan_put(chan);
>  
>  	return 0;
>  }
> @@ -5896,12 +5907,11 @@ static inline int l2cap_le_credits(struct l2cap_conn *conn,
>  	if (credits > max_credits) {
>  		BT_ERR("LE credits overflow");
>  		l2cap_send_disconn_req(chan, ECONNRESET);
> -		l2cap_chan_unlock(chan);
>  
>  		/* Return 0 so that we don't trigger an unnecessary
>  		 * command reject packet.
>  		 */
> -		return 0;
> +		goto unlock;
>  	}
>  
>  	chan->tx_credits += credits;
> @@ -5912,7 +5922,9 @@ static inline int l2cap_le_credits(struct l2cap_conn *conn,
>  	if (chan->tx_credits)
>  		chan->ops->resume(chan);
>  
> +unlock:
>  	l2cap_chan_unlock(chan);
> +	l2cap_chan_put(chan);
>  
>  	return 0;
>  }
> @@ -7598,6 +7610,7 @@ static void l2cap_data_channel(struct l2cap_conn *conn, u16 cid,
>  
>  done:
>  	l2cap_chan_unlock(chan);
> +	l2cap_chan_put(chan);
>  }
>  
>  static void l2cap_conless_channel(struct l2cap_conn *conn, __le16 psm,
Luiz Augusto von Dentz July 6, 2022, 8:36 p.m. UTC | #9
Hi Lee,

On Wed, Jul 6, 2022 at 3:53 AM Lee Jones <lee.jones@linaro.org> wrote:
>
> On Tue, 05 Jul 2022, Luiz Augusto von Dentz wrote:
>
> > Hi Lee,
> >
> > On Wed, Jun 29, 2022 at 8:28 AM Lee Jones <lee.jones@linaro.org> wrote:
> > >
> > > On Tue, 28 Jun 2022, Luiz Augusto von Dentz wrote:
> > >
> > > > Hi Eric, Lee,
> > > >
> > > > On Mon, Jun 27, 2022 at 4:39 PM Luiz Augusto von Dentz
> > > > <luiz.dentz@gmail.com> wrote:
> > > > >
> > > > > Hi Eric, Lee,
> > > > >
> > > > > On Mon, Jun 27, 2022 at 7:41 AM Eric Dumazet <edumazet@google.com> wrote:
> > > > > >
> > > > > > On Wed, Jun 22, 2022 at 10:27 AM Lee Jones <lee.jones@linaro.org> wrote:
> > > > > > >
> > > > > > > This change prevents a use-after-free caused by one of the worker
> > > > > > > threads starting up (see below) *after* the final channel reference
> > > > > > > has been put() during sock_close() but *before* the references to the
> > > > > > > channel have been destroyed.
> > > > > > >
> > > > > > >   refcount_t: increment on 0; use-after-free.
> > > > > > >   BUG: KASAN: use-after-free in refcount_dec_and_test+0x20/0xd0
> > > > > > >   Read of size 4 at addr ffffffc114f5bf18 by task kworker/u17:14/705
> > > > > > >
> > > > > > >   CPU: 4 PID: 705 Comm: kworker/u17:14 Tainted: G S      W       4.14.234-00003-g1fb6d0bd49a4-dirty #28
> > > > > > >   Hardware name: Qualcomm Technologies, Inc. SM8150 V2 PM8150 Google Inc. MSM sm8150 Flame DVT (DT)
> > > > > > >   Workqueue: hci0 hci_rx_work
> > > > > > >   Call trace:
> > > > > > >    dump_backtrace+0x0/0x378
> > > > > > >    show_stack+0x20/0x2c
> > > > > > >    dump_stack+0x124/0x148
> > > > > > >    print_address_description+0x80/0x2e8
> > > > > > >    __kasan_report+0x168/0x188
> > > > > > >    kasan_report+0x10/0x18
> > > > > > >    __asan_load4+0x84/0x8c
> > > > > > >    refcount_dec_and_test+0x20/0xd0
> > > > > > >    l2cap_chan_put+0x48/0x12c
> > > > > > >    l2cap_recv_frame+0x4770/0x6550
> > > > > > >    l2cap_recv_acldata+0x44c/0x7a4
> > > > > > >    hci_acldata_packet+0x100/0x188
> > > > > > >    hci_rx_work+0x178/0x23c
> > > > > > >    process_one_work+0x35c/0x95c
> > > > > > >    worker_thread+0x4cc/0x960
> > > > > > >    kthread+0x1a8/0x1c4
> > > > > > >    ret_from_fork+0x10/0x18
> > > > > > >
> > > > > > > Cc: stable@kernel.org
> > > > > >
> > > > > > When was the bug added ? (Fixes: tag please)
> > > > > >
> > > > > > > Cc: Marcel Holtmann <marcel@holtmann.org>
> > > > > > > Cc: Johan Hedberg <johan.hedberg@gmail.com>
> > > > > > > Cc: Luiz Augusto von Dentz <luiz.dentz@gmail.com>
> > > > > > > Cc: "David S. Miller" <davem@davemloft.net>
> > > > > > > Cc: Eric Dumazet <edumazet@google.com>
> > > > > > > Cc: Jakub Kicinski <kuba@kernel.org>
> > > > > > > Cc: Paolo Abeni <pabeni@redhat.com>
> > > > > > > Cc: linux-bluetooth@vger.kernel.org
> > > > > > > Cc: netdev@vger.kernel.org
> > > > > > > Signed-off-by: Lee Jones <lee.jones@linaro.org>
> > > > > > > ---
> > > > > > >  net/bluetooth/l2cap_core.c | 4 ++--
> > > > > > >  1 file changed, 2 insertions(+), 2 deletions(-)
> > > > > > >
> > > > > > > diff --git a/net/bluetooth/l2cap_core.c b/net/bluetooth/l2cap_core.c
> > > > > > > index ae78490ecd3d4..82279c5919fd8 100644
> > > > > > > --- a/net/bluetooth/l2cap_core.c
> > > > > > > +++ b/net/bluetooth/l2cap_core.c
> > > > > > > @@ -483,9 +483,7 @@ static void l2cap_chan_destroy(struct kref *kref)
> > > > > > >
> > > > > > >         BT_DBG("chan %p", chan);
> > > > > > >
> > > > > > > -       write_lock(&chan_list_lock);
> > > > > > >         list_del(&chan->global_l);
> > > > > > > -       write_unlock(&chan_list_lock);
> > > > > > >
> > > > > > >         kfree(chan);
> > > > > > >  }
> > > > > > > @@ -501,7 +499,9 @@ void l2cap_chan_put(struct l2cap_chan *c)
> > > > > > >  {
> > > > > > >         BT_DBG("chan %p orig refcnt %u", c, kref_read(&c->kref));
> > > > > > >
> > > > > > > +       write_lock(&chan_list_lock);
> > > > > > >         kref_put(&c->kref, l2cap_chan_destroy);
> > > > > > > +       write_unlock(&chan_list_lock);
> > > > > > >  }
> > > > > > >  EXPORT_SYMBOL_GPL(l2cap_chan_put);
> > > > > > >
> > > > > > >
> > > > > >
> > > > > > I do not think this patch is correct.
> > > > > >
> > > > > > a kref does not need to be protected by a write lock.
> > > > > >
> > > > > > This might shuffle things enough to work around a particular repro you have.
> > > > > >
> > > > > > If the patch was correct why not protect kref_get() sides ?
> > > > > >
> > > > > > Before the &hdev->rx_work is scheduled (queue_work(hdev->workqueue,
> > > > > > &hdev->rx_work),
> > > > > > a reference must be taken.
> > > > > >
> > > > > > Then this reference must be released at the end of hci_rx_work() or
> > > > > > when hdev->workqueue
> > > > > > is canceled.
> > > > > >
> > > > > > This refcount is not needed _if_ the workqueue is properly canceled at
> > > > > > device dismantle,
> > > > > > in a synchronous way.
> > > > > >
> > > > > > I do not see this hdev->rx_work being canceled, maybe this is the real issue.
> > > > > >
> > > > > > There is a call to drain_workqueue() but this is not enough I think,
> > > > > > because hci_recv_frame()
> > > > > > can re-arm
> > > > > >    queue_work(hdev->workqueue, &hdev->rx_work);
> > > > >
> > > > > I suspect this likely a refcount problem, we do l2cap_get_chan_by_scid:
> > > > >
> > > > > /* Find channel with given SCID.
> > > > >  * Returns locked channel. */
> > > > > static struct l2cap_chan *l2cap_get_chan_by_scid(struct l2cap_conn
> > > > > *conn, u16 cid)
> > > > >
> > > > > So we return a locked channel but that doesn't prevent another thread
> > > > > to call l2cap_chan_put which doesn't care about l2cap_chan_lock so
> > > > > perhaps we actually need to host a reference while we have the lock,
> > > > > at least we do something like that on l2cap_sock.c:
> > > > >
> > > > > l2cap_chan_hold(chan);
> > > > > l2cap_chan_lock(chan);
> > > > >
> > > > > __clear_chan_timer(chan);
> > > > > l2cap_chan_close(chan, ECONNRESET);
> > > > > l2cap_sock_kill(sk);
> > > > >
> > > > > l2cap_chan_unlock(chan);
> > > > > l2cap_chan_put(chan);
> > > >
> > > > Perhaps something like this:
> > >
> > > I'm struggling to apply this for test:
> > >
> > >   "error: corrupt patch at line 6"
> >
> > Check with the attached patch.
>
> With the patch applied:
>
> [  188.825418][   T75] refcount_t: addition on 0; use-after-free.
> [  188.825418][   T75] refcount_t: addition on 0; use-after-free.

Looks like the changes just make the issue more visible since we are
trying to add a refcount when it is already 0 so this proves the
design is not quite right since it is removing the object from the
list only when destroying it while we probably need to do it before.

How about we use kref_get_unless_zero as it appears it was introduced
exactly for such cases (patch attached.)

Luiz Augusto von Dentz
Luiz Augusto von Dentz July 6, 2022, 8:58 p.m. UTC | #10
Hi,

On Wed, Jul 6, 2022 at 1:36 PM Luiz Augusto von Dentz
<luiz.dentz@gmail.com> wrote:
>
> Hi Lee,
>
> On Wed, Jul 6, 2022 at 3:53 AM Lee Jones <lee.jones@linaro.org> wrote:
> >
> > On Tue, 05 Jul 2022, Luiz Augusto von Dentz wrote:
> >
> > > Hi Lee,
> > >
> > > On Wed, Jun 29, 2022 at 8:28 AM Lee Jones <lee.jones@linaro.org> wrote:
> > > >
> > > > On Tue, 28 Jun 2022, Luiz Augusto von Dentz wrote:
> > > >
> > > > > Hi Eric, Lee,
> > > > >
> > > > > On Mon, Jun 27, 2022 at 4:39 PM Luiz Augusto von Dentz
> > > > > <luiz.dentz@gmail.com> wrote:
> > > > > >
> > > > > > Hi Eric, Lee,
> > > > > >
> > > > > > On Mon, Jun 27, 2022 at 7:41 AM Eric Dumazet <edumazet@google.com> wrote:
> > > > > > >
> > > > > > > On Wed, Jun 22, 2022 at 10:27 AM Lee Jones <lee.jones@linaro.org> wrote:
> > > > > > > >
> > > > > > > > This change prevents a use-after-free caused by one of the worker
> > > > > > > > threads starting up (see below) *after* the final channel reference
> > > > > > > > has been put() during sock_close() but *before* the references to the
> > > > > > > > channel have been destroyed.
> > > > > > > >
> > > > > > > >   refcount_t: increment on 0; use-after-free.
> > > > > > > >   BUG: KASAN: use-after-free in refcount_dec_and_test+0x20/0xd0
> > > > > > > >   Read of size 4 at addr ffffffc114f5bf18 by task kworker/u17:14/705
> > > > > > > >
> > > > > > > >   CPU: 4 PID: 705 Comm: kworker/u17:14 Tainted: G S      W       4.14.234-00003-g1fb6d0bd49a4-dirty #28
> > > > > > > >   Hardware name: Qualcomm Technologies, Inc. SM8150 V2 PM8150 Google Inc. MSM sm8150 Flame DVT (DT)
> > > > > > > >   Workqueue: hci0 hci_rx_work
> > > > > > > >   Call trace:
> > > > > > > >    dump_backtrace+0x0/0x378
> > > > > > > >    show_stack+0x20/0x2c
> > > > > > > >    dump_stack+0x124/0x148
> > > > > > > >    print_address_description+0x80/0x2e8
> > > > > > > >    __kasan_report+0x168/0x188
> > > > > > > >    kasan_report+0x10/0x18
> > > > > > > >    __asan_load4+0x84/0x8c
> > > > > > > >    refcount_dec_and_test+0x20/0xd0
> > > > > > > >    l2cap_chan_put+0x48/0x12c
> > > > > > > >    l2cap_recv_frame+0x4770/0x6550
> > > > > > > >    l2cap_recv_acldata+0x44c/0x7a4
> > > > > > > >    hci_acldata_packet+0x100/0x188
> > > > > > > >    hci_rx_work+0x178/0x23c
> > > > > > > >    process_one_work+0x35c/0x95c
> > > > > > > >    worker_thread+0x4cc/0x960
> > > > > > > >    kthread+0x1a8/0x1c4
> > > > > > > >    ret_from_fork+0x10/0x18
> > > > > > > >
> > > > > > > > Cc: stable@kernel.org
> > > > > > >
> > > > > > > When was the bug added ? (Fixes: tag please)
> > > > > > >
> > > > > > > > Cc: Marcel Holtmann <marcel@holtmann.org>
> > > > > > > > Cc: Johan Hedberg <johan.hedberg@gmail.com>
> > > > > > > > Cc: Luiz Augusto von Dentz <luiz.dentz@gmail.com>
> > > > > > > > Cc: "David S. Miller" <davem@davemloft.net>
> > > > > > > > Cc: Eric Dumazet <edumazet@google.com>
> > > > > > > > Cc: Jakub Kicinski <kuba@kernel.org>
> > > > > > > > Cc: Paolo Abeni <pabeni@redhat.com>
> > > > > > > > Cc: linux-bluetooth@vger.kernel.org
> > > > > > > > Cc: netdev@vger.kernel.org
> > > > > > > > Signed-off-by: Lee Jones <lee.jones@linaro.org>
> > > > > > > > ---
> > > > > > > >  net/bluetooth/l2cap_core.c | 4 ++--
> > > > > > > >  1 file changed, 2 insertions(+), 2 deletions(-)
> > > > > > > >
> > > > > > > > diff --git a/net/bluetooth/l2cap_core.c b/net/bluetooth/l2cap_core.c
> > > > > > > > index ae78490ecd3d4..82279c5919fd8 100644
> > > > > > > > --- a/net/bluetooth/l2cap_core.c
> > > > > > > > +++ b/net/bluetooth/l2cap_core.c
> > > > > > > > @@ -483,9 +483,7 @@ static void l2cap_chan_destroy(struct kref *kref)
> > > > > > > >
> > > > > > > >         BT_DBG("chan %p", chan);
> > > > > > > >
> > > > > > > > -       write_lock(&chan_list_lock);
> > > > > > > >         list_del(&chan->global_l);
> > > > > > > > -       write_unlock(&chan_list_lock);
> > > > > > > >
> > > > > > > >         kfree(chan);
> > > > > > > >  }
> > > > > > > > @@ -501,7 +499,9 @@ void l2cap_chan_put(struct l2cap_chan *c)
> > > > > > > >  {
> > > > > > > >         BT_DBG("chan %p orig refcnt %u", c, kref_read(&c->kref));
> > > > > > > >
> > > > > > > > +       write_lock(&chan_list_lock);
> > > > > > > >         kref_put(&c->kref, l2cap_chan_destroy);
> > > > > > > > +       write_unlock(&chan_list_lock);
> > > > > > > >  }
> > > > > > > >  EXPORT_SYMBOL_GPL(l2cap_chan_put);
> > > > > > > >
> > > > > > > >
> > > > > > >
> > > > > > > I do not think this patch is correct.
> > > > > > >
> > > > > > > a kref does not need to be protected by a write lock.
> > > > > > >
> > > > > > > This might shuffle things enough to work around a particular repro you have.
> > > > > > >
> > > > > > > If the patch was correct why not protect kref_get() sides ?
> > > > > > >
> > > > > > > Before the &hdev->rx_work is scheduled (queue_work(hdev->workqueue,
> > > > > > > &hdev->rx_work),
> > > > > > > a reference must be taken.
> > > > > > >
> > > > > > > Then this reference must be released at the end of hci_rx_work() or
> > > > > > > when hdev->workqueue
> > > > > > > is canceled.
> > > > > > >
> > > > > > > This refcount is not needed _if_ the workqueue is properly canceled at
> > > > > > > device dismantle,
> > > > > > > in a synchronous way.
> > > > > > >
> > > > > > > I do not see this hdev->rx_work being canceled, maybe this is the real issue.
> > > > > > >
> > > > > > > There is a call to drain_workqueue() but this is not enough I think,
> > > > > > > because hci_recv_frame()
> > > > > > > can re-arm
> > > > > > >    queue_work(hdev->workqueue, &hdev->rx_work);
> > > > > >
> > > > > > I suspect this likely a refcount problem, we do l2cap_get_chan_by_scid:
> > > > > >
> > > > > > /* Find channel with given SCID.
> > > > > >  * Returns locked channel. */
> > > > > > static struct l2cap_chan *l2cap_get_chan_by_scid(struct l2cap_conn
> > > > > > *conn, u16 cid)
> > > > > >
> > > > > > So we return a locked channel but that doesn't prevent another thread
> > > > > > to call l2cap_chan_put which doesn't care about l2cap_chan_lock so
> > > > > > perhaps we actually need to host a reference while we have the lock,
> > > > > > at least we do something like that on l2cap_sock.c:
> > > > > >
> > > > > > l2cap_chan_hold(chan);
> > > > > > l2cap_chan_lock(chan);
> > > > > >
> > > > > > __clear_chan_timer(chan);
> > > > > > l2cap_chan_close(chan, ECONNRESET);
> > > > > > l2cap_sock_kill(sk);
> > > > > >
> > > > > > l2cap_chan_unlock(chan);
> > > > > > l2cap_chan_put(chan);
> > > > >
> > > > > Perhaps something like this:
> > > >
> > > > I'm struggling to apply this for test:
> > > >
> > > >   "error: corrupt patch at line 6"
> > >
> > > Check with the attached patch.
> >
> > With the patch applied:
> >
> > [  188.825418][   T75] refcount_t: addition on 0; use-after-free.
> > [  188.825418][   T75] refcount_t: addition on 0; use-after-free.
>
> Looks like the changes just make the issue more visible since we are
> trying to add a refcount when it is already 0 so this proves the
> design is not quite right since it is removing the object from the
> list only when destroying it while we probably need to do it before.
>
> How about we use kref_get_unless_zero as it appears it was introduced
> exactly for such cases (patch attached.)

Looks like I missed a few places like l2cap_global_chan_by_psm so here
is another version.

> Luiz Augusto von Dentz
Luiz Augusto von Dentz July 14, 2022, 5:46 p.m. UTC | #11
Hi Lee,

On Wed, Jul 6, 2022 at 1:58 PM Luiz Augusto von Dentz
<luiz.dentz@gmail.com> wrote:
>
> Hi,
>
> On Wed, Jul 6, 2022 at 1:36 PM Luiz Augusto von Dentz
> <luiz.dentz@gmail.com> wrote:
> >
> > Hi Lee,
> >
> > On Wed, Jul 6, 2022 at 3:53 AM Lee Jones <lee.jones@linaro.org> wrote:
> > >
> > > On Tue, 05 Jul 2022, Luiz Augusto von Dentz wrote:
> > >
> > > > Hi Lee,
> > > >
> > > > On Wed, Jun 29, 2022 at 8:28 AM Lee Jones <lee.jones@linaro.org> wrote:
> > > > >
> > > > > On Tue, 28 Jun 2022, Luiz Augusto von Dentz wrote:
> > > > >
> > > > > > Hi Eric, Lee,
> > > > > >
> > > > > > On Mon, Jun 27, 2022 at 4:39 PM Luiz Augusto von Dentz
> > > > > > <luiz.dentz@gmail.com> wrote:
> > > > > > >
> > > > > > > Hi Eric, Lee,
> > > > > > >
> > > > > > > On Mon, Jun 27, 2022 at 7:41 AM Eric Dumazet <edumazet@google.com> wrote:
> > > > > > > >
> > > > > > > > On Wed, Jun 22, 2022 at 10:27 AM Lee Jones <lee.jones@linaro.org> wrote:
> > > > > > > > >
> > > > > > > > > This change prevents a use-after-free caused by one of the worker
> > > > > > > > > threads starting up (see below) *after* the final channel reference
> > > > > > > > > has been put() during sock_close() but *before* the references to the
> > > > > > > > > channel have been destroyed.
> > > > > > > > >
> > > > > > > > >   refcount_t: increment on 0; use-after-free.
> > > > > > > > >   BUG: KASAN: use-after-free in refcount_dec_and_test+0x20/0xd0
> > > > > > > > >   Read of size 4 at addr ffffffc114f5bf18 by task kworker/u17:14/705
> > > > > > > > >
> > > > > > > > >   CPU: 4 PID: 705 Comm: kworker/u17:14 Tainted: G S      W       4.14.234-00003-g1fb6d0bd49a4-dirty #28
> > > > > > > > >   Hardware name: Qualcomm Technologies, Inc. SM8150 V2 PM8150 Google Inc. MSM sm8150 Flame DVT (DT)
> > > > > > > > >   Workqueue: hci0 hci_rx_work
> > > > > > > > >   Call trace:
> > > > > > > > >    dump_backtrace+0x0/0x378
> > > > > > > > >    show_stack+0x20/0x2c
> > > > > > > > >    dump_stack+0x124/0x148
> > > > > > > > >    print_address_description+0x80/0x2e8
> > > > > > > > >    __kasan_report+0x168/0x188
> > > > > > > > >    kasan_report+0x10/0x18
> > > > > > > > >    __asan_load4+0x84/0x8c
> > > > > > > > >    refcount_dec_and_test+0x20/0xd0
> > > > > > > > >    l2cap_chan_put+0x48/0x12c
> > > > > > > > >    l2cap_recv_frame+0x4770/0x6550
> > > > > > > > >    l2cap_recv_acldata+0x44c/0x7a4
> > > > > > > > >    hci_acldata_packet+0x100/0x188
> > > > > > > > >    hci_rx_work+0x178/0x23c
> > > > > > > > >    process_one_work+0x35c/0x95c
> > > > > > > > >    worker_thread+0x4cc/0x960
> > > > > > > > >    kthread+0x1a8/0x1c4
> > > > > > > > >    ret_from_fork+0x10/0x18
> > > > > > > > >
> > > > > > > > > Cc: stable@kernel.org
> > > > > > > >
> > > > > > > > When was the bug added ? (Fixes: tag please)
> > > > > > > >
> > > > > > > > > Cc: Marcel Holtmann <marcel@holtmann.org>
> > > > > > > > > Cc: Johan Hedberg <johan.hedberg@gmail.com>
> > > > > > > > > Cc: Luiz Augusto von Dentz <luiz.dentz@gmail.com>
> > > > > > > > > Cc: "David S. Miller" <davem@davemloft.net>
> > > > > > > > > Cc: Eric Dumazet <edumazet@google.com>
> > > > > > > > > Cc: Jakub Kicinski <kuba@kernel.org>
> > > > > > > > > Cc: Paolo Abeni <pabeni@redhat.com>
> > > > > > > > > Cc: linux-bluetooth@vger.kernel.org
> > > > > > > > > Cc: netdev@vger.kernel.org
> > > > > > > > > Signed-off-by: Lee Jones <lee.jones@linaro.org>
> > > > > > > > > ---
> > > > > > > > >  net/bluetooth/l2cap_core.c | 4 ++--
> > > > > > > > >  1 file changed, 2 insertions(+), 2 deletions(-)
> > > > > > > > >
> > > > > > > > > diff --git a/net/bluetooth/l2cap_core.c b/net/bluetooth/l2cap_core.c
> > > > > > > > > index ae78490ecd3d4..82279c5919fd8 100644
> > > > > > > > > --- a/net/bluetooth/l2cap_core.c
> > > > > > > > > +++ b/net/bluetooth/l2cap_core.c
> > > > > > > > > @@ -483,9 +483,7 @@ static void l2cap_chan_destroy(struct kref *kref)
> > > > > > > > >
> > > > > > > > >         BT_DBG("chan %p", chan);
> > > > > > > > >
> > > > > > > > > -       write_lock(&chan_list_lock);
> > > > > > > > >         list_del(&chan->global_l);
> > > > > > > > > -       write_unlock(&chan_list_lock);
> > > > > > > > >
> > > > > > > > >         kfree(chan);
> > > > > > > > >  }
> > > > > > > > > @@ -501,7 +499,9 @@ void l2cap_chan_put(struct l2cap_chan *c)
> > > > > > > > >  {
> > > > > > > > >         BT_DBG("chan %p orig refcnt %u", c, kref_read(&c->kref));
> > > > > > > > >
> > > > > > > > > +       write_lock(&chan_list_lock);
> > > > > > > > >         kref_put(&c->kref, l2cap_chan_destroy);
> > > > > > > > > +       write_unlock(&chan_list_lock);
> > > > > > > > >  }
> > > > > > > > >  EXPORT_SYMBOL_GPL(l2cap_chan_put);
> > > > > > > > >
> > > > > > > > >
> > > > > > > >
> > > > > > > > I do not think this patch is correct.
> > > > > > > >
> > > > > > > > a kref does not need to be protected by a write lock.
> > > > > > > >
> > > > > > > > This might shuffle things enough to work around a particular repro you have.
> > > > > > > >
> > > > > > > > If the patch was correct why not protect kref_get() sides ?
> > > > > > > >
> > > > > > > > Before the &hdev->rx_work is scheduled (queue_work(hdev->workqueue,
> > > > > > > > &hdev->rx_work),
> > > > > > > > a reference must be taken.
> > > > > > > >
> > > > > > > > Then this reference must be released at the end of hci_rx_work() or
> > > > > > > > when hdev->workqueue
> > > > > > > > is canceled.
> > > > > > > >
> > > > > > > > This refcount is not needed _if_ the workqueue is properly canceled at
> > > > > > > > device dismantle,
> > > > > > > > in a synchronous way.
> > > > > > > >
> > > > > > > > I do not see this hdev->rx_work being canceled, maybe this is the real issue.
> > > > > > > >
> > > > > > > > There is a call to drain_workqueue() but this is not enough I think,
> > > > > > > > because hci_recv_frame()
> > > > > > > > can re-arm
> > > > > > > >    queue_work(hdev->workqueue, &hdev->rx_work);
> > > > > > >
> > > > > > > I suspect this likely a refcount problem, we do l2cap_get_chan_by_scid:
> > > > > > >
> > > > > > > /* Find channel with given SCID.
> > > > > > >  * Returns locked channel. */
> > > > > > > static struct l2cap_chan *l2cap_get_chan_by_scid(struct l2cap_conn
> > > > > > > *conn, u16 cid)
> > > > > > >
> > > > > > > So we return a locked channel but that doesn't prevent another thread
> > > > > > > to call l2cap_chan_put which doesn't care about l2cap_chan_lock so
> > > > > > > perhaps we actually need to host a reference while we have the lock,
> > > > > > > at least we do something like that on l2cap_sock.c:
> > > > > > >
> > > > > > > l2cap_chan_hold(chan);
> > > > > > > l2cap_chan_lock(chan);
> > > > > > >
> > > > > > > __clear_chan_timer(chan);
> > > > > > > l2cap_chan_close(chan, ECONNRESET);
> > > > > > > l2cap_sock_kill(sk);
> > > > > > >
> > > > > > > l2cap_chan_unlock(chan);
> > > > > > > l2cap_chan_put(chan);
> > > > > >
> > > > > > Perhaps something like this:
> > > > >
> > > > > I'm struggling to apply this for test:
> > > > >
> > > > >   "error: corrupt patch at line 6"
> > > >
> > > > Check with the attached patch.
> > >
> > > With the patch applied:
> > >
> > > [  188.825418][   T75] refcount_t: addition on 0; use-after-free.
> > > [  188.825418][   T75] refcount_t: addition on 0; use-after-free.
> >
> > Looks like the changes just make the issue more visible since we are
> > trying to add a refcount when it is already 0 so this proves the
> > design is not quite right since it is removing the object from the
> > list only when destroying it while we probably need to do it before.
> >
> > How about we use kref_get_unless_zero as it appears it was introduced
> > exactly for such cases (patch attached.)
>
> Looks like I missed a few places like l2cap_global_chan_by_psm so here
> is another version.

Any feedback regarding these changes?

> > Luiz Augusto von Dentz
>
>
>
> --
> Luiz Augusto von Dentz
Lee Jones July 15, 2022, 7:28 a.m. UTC | #12
> Hi Lee,
> > > > > > I'm struggling to apply this for test:
> > > > > >
> > > > > >   "error: corrupt patch at line 6"
> > > > >
> > > > > Check with the attached patch.
> > > >
> > > > With the patch applied:
> > > >
> > > > [  188.825418][   T75] refcount_t: addition on 0; use-after-free.
> > > > [  188.825418][   T75] refcount_t: addition on 0; use-after-free.
> > >
> > > Looks like the changes just make the issue more visible since we are
> > > trying to add a refcount when it is already 0 so this proves the
> > > design is not quite right since it is removing the object from the
> > > list only when destroying it while we probably need to do it before.
> > >
> > > How about we use kref_get_unless_zero as it appears it was introduced
> > > exactly for such cases (patch attached.)
> >
> > Looks like I missed a few places like l2cap_global_chan_by_psm so here
> > is another version.
> 
> Any feedback regarding these changes?

Not yet.  I'll have time to test this next week.

Things really stacked up this week, apologies.
Lee Jones July 20, 2022, 11:52 a.m. UTC | #13
On Wed, 06 Jul 2022, Luiz Augusto von Dentz wrote:
> > > > > > Perhaps something like this:
> > > > >
> > > > > I'm struggling to apply this for test:
> > > > >
> > > > >   "error: corrupt patch at line 6"
> > > >
> > > > Check with the attached patch.
> > >
> > > With the patch applied:
> > >
> > > [  188.825418][   T75] refcount_t: addition on 0; use-after-free.
> > > [  188.825418][   T75] refcount_t: addition on 0; use-after-free.
> >
> > Looks like the changes just make the issue more visible since we are
> > trying to add a refcount when it is already 0 so this proves the
> > design is not quite right since it is removing the object from the
> > list only when destroying it while we probably need to do it before.
> >
> > How about we use kref_get_unless_zero as it appears it was introduced
> > exactly for such cases (patch attached.)
> 
> Looks like I missed a few places like l2cap_global_chan_by_psm so here
> is another version.

Okay, with the patch below the kernel doesn't produce a back-trace.

Only this, which I assume is expected?

[  535.398255][  T495] Bluetooth: hci0: unexpected cc 0x0c03 length: 249 > 1                                                          
[  535.398255][  T495] Bluetooth: hci0: unexpected cc 0x0c03 length: 249 > 1                                                          
[  535.417007][  T495] Bluetooth: hci0: unexpected cc 0x1003 length: 249 > 9                                                          
[  535.417007][  T495] Bluetooth: hci0: unexpected cc 0x1003 length: 249 > 9                                                          
[  535.434810][  T495] Bluetooth: hci0: unexpected cc 0x1001 length: 249 > 9                                                          
[  535.434810][  T495] Bluetooth: hci0: unexpected cc 0x1001 length: 249 > 9                                                          
[  535.452886][  T495] Bluetooth: hci0: unexpected cc 0x0c23 length: 249 > 4                                                          
[  535.452886][  T495] Bluetooth: hci0: unexpected cc 0x0c23 length: 249 > 4                                                          
[  535.470574][  T495] Bluetooth: hci0: unexpected cc 0x0c25 length: 249 > 3                                                          
[  535.470574][  T495] Bluetooth: hci0: unexpected cc 0x0c25 length: 249 > 3                                                          
[  535.488009][  T495] Bluetooth: hci0: unexpected cc 0x0c38 length: 249 > 2                                                          
[  535.488009][  T495] Bluetooth: hci0: unexpected cc 0x0c38 length: 249 > 2                                                          
[  537.551677][   T74] Bluetooth: hci0: command 0x0409 tx timeout                                                                     
[  537.551677][   T74] Bluetooth: hci0: command 0x0409 tx timeout                                                                     
[  539.641362][  T373] Bluetooth: hci0: command 0x041b tx timeout                                                                     
[  539.641362][  T373] Bluetooth: hci0: command 0x041b tx timeout                                                                     
[  541.711056][  T274] Bluetooth: hci0: command 0x040f tx timeout                                                                     
[  541.711056][  T274] Bluetooth: hci0: command 0x040f tx timeout                                                                     
[  543.790939][   T66] Bluetooth: hci0: command 0x0419 tx timeout                                                                     
[  543.790939][   T66] Bluetooth: hci0: command 0x0419 tx timeout

> From 235937ac7a39d16e5dabbfca0ac1d58e4cc814d9 Mon Sep 17 00:00:00 2001
> From: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
> Date: Tue, 28 Jun 2022 15:46:04 -0700
> Subject: [PATCH] Bluetooth: L2CAP: WIP
> 
> Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
> ---
>  include/net/bluetooth/l2cap.h |  1 +
>  net/bluetooth/l2cap_core.c    | 58 +++++++++++++++++++++++++++--------
>  2 files changed, 46 insertions(+), 13 deletions(-)
> 
> diff --git a/include/net/bluetooth/l2cap.h b/include/net/bluetooth/l2cap.h
> index 3c4f550e5a8b..2f766e3437ce 100644
> --- a/include/net/bluetooth/l2cap.h
> +++ b/include/net/bluetooth/l2cap.h
> @@ -847,6 +847,7 @@ enum {
>  };
>  
>  void l2cap_chan_hold(struct l2cap_chan *c);
> +struct l2cap_chan *l2cap_chan_hold_unless_zero(struct l2cap_chan *c);
>  void l2cap_chan_put(struct l2cap_chan *c);
>  
>  static inline void l2cap_chan_lock(struct l2cap_chan *chan)
> diff --git a/net/bluetooth/l2cap_core.c b/net/bluetooth/l2cap_core.c
> index 09ecaf556de5..3e5d81e971cc 100644
> --- a/net/bluetooth/l2cap_core.c
> +++ b/net/bluetooth/l2cap_core.c
> @@ -111,7 +111,8 @@ static struct l2cap_chan *__l2cap_get_chan_by_scid(struct l2cap_conn *conn,
>  }
>  
>  /* Find channel with given SCID.
> - * Returns locked channel. */
> + * Returns a reference locked channel.
> + */
>  static struct l2cap_chan *l2cap_get_chan_by_scid(struct l2cap_conn *conn,
>  						 u16 cid)
>  {
> @@ -119,15 +120,19 @@ static struct l2cap_chan *l2cap_get_chan_by_scid(struct l2cap_conn *conn,
>  
>  	mutex_lock(&conn->chan_lock);
>  	c = __l2cap_get_chan_by_scid(conn, cid);
> -	if (c)
> -		l2cap_chan_lock(c);
> +	if (c) {
> +		/* Only lock if chan reference is not 0 */
> +		c = l2cap_chan_hold_unless_zero(c);
> +		if (c)
> +			l2cap_chan_lock(c);
> +	}
>  	mutex_unlock(&conn->chan_lock);
>  
>  	return c;
>  }
>  
>  /* Find channel with given DCID.
> - * Returns locked channel.
> + * Returns a reference locked channel.
>   */
>  static struct l2cap_chan *l2cap_get_chan_by_dcid(struct l2cap_conn *conn,
>  						 u16 cid)
> @@ -136,8 +141,12 @@ static struct l2cap_chan *l2cap_get_chan_by_dcid(struct l2cap_conn *conn,
>  
>  	mutex_lock(&conn->chan_lock);
>  	c = __l2cap_get_chan_by_dcid(conn, cid);
> -	if (c)
> -		l2cap_chan_lock(c);
> +	if (c) {
> +		/* Only lock if chan reference is not 0 */
> +		c = l2cap_chan_hold_unless_zero(c);
> +		if (c)
> +			l2cap_chan_lock(c);
> +	}
>  	mutex_unlock(&conn->chan_lock);
>  
>  	return c;
> @@ -162,8 +171,12 @@ static struct l2cap_chan *l2cap_get_chan_by_ident(struct l2cap_conn *conn,
>  
>  	mutex_lock(&conn->chan_lock);
>  	c = __l2cap_get_chan_by_ident(conn, ident);
> -	if (c)
> -		l2cap_chan_lock(c);
> +	if (c) {
> +		/* Only lock if chan reference is not 0 */
> +		c = l2cap_chan_hold_unless_zero(c);
> +		if (c)
> +			l2cap_chan_lock(c);
> +	}
>  	mutex_unlock(&conn->chan_lock);
>  
>  	return c;
> @@ -497,6 +510,16 @@ void l2cap_chan_hold(struct l2cap_chan *c)
>  	kref_get(&c->kref);
>  }
>  
> +struct l2cap_chan *l2cap_chan_hold_unless_zero(struct l2cap_chan *c)
> +{
> +	BT_DBG("chan %p orig refcnt %u", c, kref_read(&c->kref));
> +
> +	if (!kref_get_unless_zero(&c->kref))
> +		return NULL;
> +
> +	return c;
> +}
> +
>  void l2cap_chan_put(struct l2cap_chan *c)
>  {
>  	BT_DBG("chan %p orig refcnt %u", c, kref_read(&c->kref));
> @@ -1969,7 +1992,7 @@ static struct l2cap_chan *l2cap_global_chan_by_psm(int state, __le16 psm,
>  			src_match = !bacmp(&c->src, src);
>  			dst_match = !bacmp(&c->dst, dst);
>  			if (src_match && dst_match) {
> -				l2cap_chan_hold(c);
> +				c = l2cap_chan_hold_unless_zero(c);
>  				read_unlock(&chan_list_lock);
>  				return c;
>  			}
> @@ -1984,7 +2007,7 @@ static struct l2cap_chan *l2cap_global_chan_by_psm(int state, __le16 psm,
>  	}
>  
>  	if (c1)
> -		l2cap_chan_hold(c1);
> +		c1 = l2cap_chan_hold_unless_zero(c1);
>  
>  	read_unlock(&chan_list_lock);
>  
> @@ -4464,6 +4487,7 @@ static inline int l2cap_config_req(struct l2cap_conn *conn,
>  
>  unlock:
>  	l2cap_chan_unlock(chan);
> +	l2cap_chan_put(chan);
>  	return err;
>  }
>  
> @@ -4578,6 +4602,7 @@ static inline int l2cap_config_rsp(struct l2cap_conn *conn,
>  
>  done:
>  	l2cap_chan_unlock(chan);
> +	l2cap_chan_put(chan);
>  	return err;
>  }
>  
> @@ -5305,6 +5330,7 @@ static inline int l2cap_move_channel_req(struct l2cap_conn *conn,
>  	l2cap_send_move_chan_rsp(chan, result);
>  
>  	l2cap_chan_unlock(chan);
> +	l2cap_chan_put(chan);
>  
>  	return 0;
>  }
> @@ -5397,6 +5423,7 @@ static void l2cap_move_continue(struct l2cap_conn *conn, u16 icid, u16 result)
>  	}
>  
>  	l2cap_chan_unlock(chan);
> +	l2cap_chan_put(chan);
>  }
>  
>  static void l2cap_move_fail(struct l2cap_conn *conn, u8 ident, u16 icid,
> @@ -5426,6 +5453,7 @@ static void l2cap_move_fail(struct l2cap_conn *conn, u8 ident, u16 icid,
>  	l2cap_send_move_chan_cfm(chan, L2CAP_MC_UNCONFIRMED);
>  
>  	l2cap_chan_unlock(chan);
> +	l2cap_chan_put(chan);
>  }
>  
>  static int l2cap_move_channel_rsp(struct l2cap_conn *conn,
> @@ -5489,6 +5517,7 @@ static int l2cap_move_channel_confirm(struct l2cap_conn *conn,
>  	l2cap_send_move_chan_cfm_rsp(conn, cmd->ident, icid);
>  
>  	l2cap_chan_unlock(chan);
> +	l2cap_chan_put(chan);
>  
>  	return 0;
>  }
> @@ -5524,6 +5553,7 @@ static inline int l2cap_move_channel_confirm_rsp(struct l2cap_conn *conn,
>  	}
>  
>  	l2cap_chan_unlock(chan);
> +	l2cap_chan_put(chan);
>  
>  	return 0;
>  }
> @@ -5896,12 +5926,11 @@ static inline int l2cap_le_credits(struct l2cap_conn *conn,
>  	if (credits > max_credits) {
>  		BT_ERR("LE credits overflow");
>  		l2cap_send_disconn_req(chan, ECONNRESET);
> -		l2cap_chan_unlock(chan);
>  
>  		/* Return 0 so that we don't trigger an unnecessary
>  		 * command reject packet.
>  		 */
> -		return 0;
> +		goto unlock;
>  	}
>  
>  	chan->tx_credits += credits;
> @@ -5912,7 +5941,9 @@ static inline int l2cap_le_credits(struct l2cap_conn *conn,
>  	if (chan->tx_credits)
>  		chan->ops->resume(chan);
>  
> +unlock:
>  	l2cap_chan_unlock(chan);
> +	l2cap_chan_put(chan);
>  
>  	return 0;
>  }
> @@ -7598,6 +7629,7 @@ static void l2cap_data_channel(struct l2cap_conn *conn, u16 cid,
>  
>  done:
>  	l2cap_chan_unlock(chan);
> +	l2cap_chan_put(chan);
>  }
>  
>  static void l2cap_conless_channel(struct l2cap_conn *conn, __le16 psm,
> @@ -8086,7 +8118,7 @@ static struct l2cap_chan *l2cap_global_fixed_chan(struct l2cap_chan *c,
>  		if (src_type != c->src_type)
>  			continue;
>  
> -		l2cap_chan_hold(c);
> +		c = l2cap_chan_hold_unless_zero(c);
>  		read_unlock(&chan_list_lock);
>  		return c;
>  	}
Luiz Augusto von Dentz July 20, 2022, 5:10 p.m. UTC | #14
Hi Lee,

On Wed, Jul 20, 2022 at 4:52 AM Lee Jones <lee.jones@linaro.org> wrote:
>
> On Wed, 06 Jul 2022, Luiz Augusto von Dentz wrote:
> > > > > > > Perhaps something like this:
> > > > > >
> > > > > > I'm struggling to apply this for test:
> > > > > >
> > > > > >   "error: corrupt patch at line 6"
> > > > >
> > > > > Check with the attached patch.
> > > >
> > > > With the patch applied:
> > > >
> > > > [  188.825418][   T75] refcount_t: addition on 0; use-after-free.
> > > > [  188.825418][   T75] refcount_t: addition on 0; use-after-free.
> > >
> > > Looks like the changes just make the issue more visible since we are
> > > trying to add a refcount when it is already 0 so this proves the
> > > design is not quite right since it is removing the object from the
> > > list only when destroying it while we probably need to do it before.
> > >
> > > How about we use kref_get_unless_zero as it appears it was introduced
> > > exactly for such cases (patch attached.)
> >
> > Looks like I missed a few places like l2cap_global_chan_by_psm so here
> > is another version.
>
> Okay, with the patch below the kernel doesn't produce a back-trace.

Great, Ive send a patch with these changes.

> Only this, which I assume is expected?
>
> [  535.398255][  T495] Bluetooth: hci0: unexpected cc 0x0c03 length: 249 > 1
> [  535.398255][  T495] Bluetooth: hci0: unexpected cc 0x0c03 length: 249 > 1
> [  535.417007][  T495] Bluetooth: hci0: unexpected cc 0x1003 length: 249 > 9
> [  535.417007][  T495] Bluetooth: hci0: unexpected cc 0x1003 length: 249 > 9
> [  535.434810][  T495] Bluetooth: hci0: unexpected cc 0x1001 length: 249 > 9
> [  535.434810][  T495] Bluetooth: hci0: unexpected cc 0x1001 length: 249 > 9
> [  535.452886][  T495] Bluetooth: hci0: unexpected cc 0x0c23 length: 249 > 4
> [  535.452886][  T495] Bluetooth: hci0: unexpected cc 0x0c23 length: 249 > 4
> [  535.470574][  T495] Bluetooth: hci0: unexpected cc 0x0c25 length: 249 > 3
> [  535.470574][  T495] Bluetooth: hci0: unexpected cc 0x0c25 length: 249 > 3
> [  535.488009][  T495] Bluetooth: hci0: unexpected cc 0x0c38 length: 249 > 2
> [  535.488009][  T495] Bluetooth: hci0: unexpected cc 0x0c38 length: 249 > 2
> [  537.551677][   T74] Bluetooth: hci0: command 0x0409 tx timeout
> [  537.551677][   T74] Bluetooth: hci0: command 0x0409 tx timeout
> [  539.641362][  T373] Bluetooth: hci0: command 0x041b tx timeout
> [  539.641362][  T373] Bluetooth: hci0: command 0x041b tx timeout
> [  541.711056][  T274] Bluetooth: hci0: command 0x040f tx timeout
> [  541.711056][  T274] Bluetooth: hci0: command 0x040f tx timeout
> [  543.790939][   T66] Bluetooth: hci0: command 0x0419 tx timeout
> [  543.790939][   T66] Bluetooth: hci0: command 0x0419 tx timeout

Yep, if the test doesn't act according to the BT spec, those are expected.

> > From 235937ac7a39d16e5dabbfca0ac1d58e4cc814d9 Mon Sep 17 00:00:00 2001
> > From: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
> > Date: Tue, 28 Jun 2022 15:46:04 -0700
> > Subject: [PATCH] Bluetooth: L2CAP: WIP
> >
> > Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
> > ---
> >  include/net/bluetooth/l2cap.h |  1 +
> >  net/bluetooth/l2cap_core.c    | 58 +++++++++++++++++++++++++++--------
> >  2 files changed, 46 insertions(+), 13 deletions(-)
> >
> > diff --git a/include/net/bluetooth/l2cap.h b/include/net/bluetooth/l2cap.h
> > index 3c4f550e5a8b..2f766e3437ce 100644
> > --- a/include/net/bluetooth/l2cap.h
> > +++ b/include/net/bluetooth/l2cap.h
> > @@ -847,6 +847,7 @@ enum {
> >  };
> >
> >  void l2cap_chan_hold(struct l2cap_chan *c);
> > +struct l2cap_chan *l2cap_chan_hold_unless_zero(struct l2cap_chan *c);
> >  void l2cap_chan_put(struct l2cap_chan *c);
> >
> >  static inline void l2cap_chan_lock(struct l2cap_chan *chan)
> > diff --git a/net/bluetooth/l2cap_core.c b/net/bluetooth/l2cap_core.c
> > index 09ecaf556de5..3e5d81e971cc 100644
> > --- a/net/bluetooth/l2cap_core.c
> > +++ b/net/bluetooth/l2cap_core.c
> > @@ -111,7 +111,8 @@ static struct l2cap_chan *__l2cap_get_chan_by_scid(struct l2cap_conn *conn,
> >  }
> >
> >  /* Find channel with given SCID.
> > - * Returns locked channel. */
> > + * Returns a reference locked channel.
> > + */
> >  static struct l2cap_chan *l2cap_get_chan_by_scid(struct l2cap_conn *conn,
> >                                                u16 cid)
> >  {
> > @@ -119,15 +120,19 @@ static struct l2cap_chan *l2cap_get_chan_by_scid(struct l2cap_conn *conn,
> >
> >       mutex_lock(&conn->chan_lock);
> >       c = __l2cap_get_chan_by_scid(conn, cid);
> > -     if (c)
> > -             l2cap_chan_lock(c);
> > +     if (c) {
> > +             /* Only lock if chan reference is not 0 */
> > +             c = l2cap_chan_hold_unless_zero(c);
> > +             if (c)
> > +                     l2cap_chan_lock(c);
> > +     }
> >       mutex_unlock(&conn->chan_lock);
> >
> >       return c;
> >  }
> >
> >  /* Find channel with given DCID.
> > - * Returns locked channel.
> > + * Returns a reference locked channel.
> >   */
> >  static struct l2cap_chan *l2cap_get_chan_by_dcid(struct l2cap_conn *conn,
> >                                                u16 cid)
> > @@ -136,8 +141,12 @@ static struct l2cap_chan *l2cap_get_chan_by_dcid(struct l2cap_conn *conn,
> >
> >       mutex_lock(&conn->chan_lock);
> >       c = __l2cap_get_chan_by_dcid(conn, cid);
> > -     if (c)
> > -             l2cap_chan_lock(c);
> > +     if (c) {
> > +             /* Only lock if chan reference is not 0 */
> > +             c = l2cap_chan_hold_unless_zero(c);
> > +             if (c)
> > +                     l2cap_chan_lock(c);
> > +     }
> >       mutex_unlock(&conn->chan_lock);
> >
> >       return c;
> > @@ -162,8 +171,12 @@ static struct l2cap_chan *l2cap_get_chan_by_ident(struct l2cap_conn *conn,
> >
> >       mutex_lock(&conn->chan_lock);
> >       c = __l2cap_get_chan_by_ident(conn, ident);
> > -     if (c)
> > -             l2cap_chan_lock(c);
> > +     if (c) {
> > +             /* Only lock if chan reference is not 0 */
> > +             c = l2cap_chan_hold_unless_zero(c);
> > +             if (c)
> > +                     l2cap_chan_lock(c);
> > +     }
> >       mutex_unlock(&conn->chan_lock);
> >
> >       return c;
> > @@ -497,6 +510,16 @@ void l2cap_chan_hold(struct l2cap_chan *c)
> >       kref_get(&c->kref);
> >  }
> >
> > +struct l2cap_chan *l2cap_chan_hold_unless_zero(struct l2cap_chan *c)
> > +{
> > +     BT_DBG("chan %p orig refcnt %u", c, kref_read(&c->kref));
> > +
> > +     if (!kref_get_unless_zero(&c->kref))
> > +             return NULL;
> > +
> > +     return c;
> > +}
> > +
> >  void l2cap_chan_put(struct l2cap_chan *c)
> >  {
> >       BT_DBG("chan %p orig refcnt %u", c, kref_read(&c->kref));
> > @@ -1969,7 +1992,7 @@ static struct l2cap_chan *l2cap_global_chan_by_psm(int state, __le16 psm,
> >                       src_match = !bacmp(&c->src, src);
> >                       dst_match = !bacmp(&c->dst, dst);
> >                       if (src_match && dst_match) {
> > -                             l2cap_chan_hold(c);
> > +                             c = l2cap_chan_hold_unless_zero(c);
> >                               read_unlock(&chan_list_lock);
> >                               return c;
> >                       }
> > @@ -1984,7 +2007,7 @@ static struct l2cap_chan *l2cap_global_chan_by_psm(int state, __le16 psm,
> >       }
> >
> >       if (c1)
> > -             l2cap_chan_hold(c1);
> > +             c1 = l2cap_chan_hold_unless_zero(c1);
> >
> >       read_unlock(&chan_list_lock);
> >
> > @@ -4464,6 +4487,7 @@ static inline int l2cap_config_req(struct l2cap_conn *conn,
> >
> >  unlock:
> >       l2cap_chan_unlock(chan);
> > +     l2cap_chan_put(chan);
> >       return err;
> >  }
> >
> > @@ -4578,6 +4602,7 @@ static inline int l2cap_config_rsp(struct l2cap_conn *conn,
> >
> >  done:
> >       l2cap_chan_unlock(chan);
> > +     l2cap_chan_put(chan);
> >       return err;
> >  }
> >
> > @@ -5305,6 +5330,7 @@ static inline int l2cap_move_channel_req(struct l2cap_conn *conn,
> >       l2cap_send_move_chan_rsp(chan, result);
> >
> >       l2cap_chan_unlock(chan);
> > +     l2cap_chan_put(chan);
> >
> >       return 0;
> >  }
> > @@ -5397,6 +5423,7 @@ static void l2cap_move_continue(struct l2cap_conn *conn, u16 icid, u16 result)
> >       }
> >
> >       l2cap_chan_unlock(chan);
> > +     l2cap_chan_put(chan);
> >  }
> >
> >  static void l2cap_move_fail(struct l2cap_conn *conn, u8 ident, u16 icid,
> > @@ -5426,6 +5453,7 @@ static void l2cap_move_fail(struct l2cap_conn *conn, u8 ident, u16 icid,
> >       l2cap_send_move_chan_cfm(chan, L2CAP_MC_UNCONFIRMED);
> >
> >       l2cap_chan_unlock(chan);
> > +     l2cap_chan_put(chan);
> >  }
> >
> >  static int l2cap_move_channel_rsp(struct l2cap_conn *conn,
> > @@ -5489,6 +5517,7 @@ static int l2cap_move_channel_confirm(struct l2cap_conn *conn,
> >       l2cap_send_move_chan_cfm_rsp(conn, cmd->ident, icid);
> >
> >       l2cap_chan_unlock(chan);
> > +     l2cap_chan_put(chan);
> >
> >       return 0;
> >  }
> > @@ -5524,6 +5553,7 @@ static inline int l2cap_move_channel_confirm_rsp(struct l2cap_conn *conn,
> >       }
> >
> >       l2cap_chan_unlock(chan);
> > +     l2cap_chan_put(chan);
> >
> >       return 0;
> >  }
> > @@ -5896,12 +5926,11 @@ static inline int l2cap_le_credits(struct l2cap_conn *conn,
> >       if (credits > max_credits) {
> >               BT_ERR("LE credits overflow");
> >               l2cap_send_disconn_req(chan, ECONNRESET);
> > -             l2cap_chan_unlock(chan);
> >
> >               /* Return 0 so that we don't trigger an unnecessary
> >                * command reject packet.
> >                */
> > -             return 0;
> > +             goto unlock;
> >       }
> >
> >       chan->tx_credits += credits;
> > @@ -5912,7 +5941,9 @@ static inline int l2cap_le_credits(struct l2cap_conn *conn,
> >       if (chan->tx_credits)
> >               chan->ops->resume(chan);
> >
> > +unlock:
> >       l2cap_chan_unlock(chan);
> > +     l2cap_chan_put(chan);
> >
> >       return 0;
> >  }
> > @@ -7598,6 +7629,7 @@ static void l2cap_data_channel(struct l2cap_conn *conn, u16 cid,
> >
> >  done:
> >       l2cap_chan_unlock(chan);
> > +     l2cap_chan_put(chan);
> >  }
> >
> >  static void l2cap_conless_channel(struct l2cap_conn *conn, __le16 psm,
> > @@ -8086,7 +8118,7 @@ static struct l2cap_chan *l2cap_global_fixed_chan(struct l2cap_chan *c,
> >               if (src_type != c->src_type)
> >                       continue;
> >
> > -             l2cap_chan_hold(c);
> > +             c = l2cap_chan_hold_unless_zero(c);
> >               read_unlock(&chan_list_lock);
> >               return c;
> >       }
>
>
> --
> Lee Jones [李琼斯]
> Principal Technical Lead - Developer Services
> Linaro.org │ Open source software for Arm SoCs
> Follow Linaro: Facebook | Twitter | Blog
diff mbox series

Patch

diff --git a/net/bluetooth/l2cap_core.c b/net/bluetooth/l2cap_core.c
index ae78490ecd3d4..82279c5919fd8 100644
--- a/net/bluetooth/l2cap_core.c
+++ b/net/bluetooth/l2cap_core.c
@@ -483,9 +483,7 @@  static void l2cap_chan_destroy(struct kref *kref)
 
 	BT_DBG("chan %p", chan);
 
-	write_lock(&chan_list_lock);
 	list_del(&chan->global_l);
-	write_unlock(&chan_list_lock);
 
 	kfree(chan);
 }
@@ -501,7 +499,9 @@  void l2cap_chan_put(struct l2cap_chan *c)
 {
 	BT_DBG("chan %p orig refcnt %u", c, kref_read(&c->kref));
 
+	write_lock(&chan_list_lock);
 	kref_put(&c->kref, l2cap_chan_destroy);
+	write_unlock(&chan_list_lock);
 }
 EXPORT_SYMBOL_GPL(l2cap_chan_put);