Help needed in patching CVE-2021-3640

Message ID 15f5a46.b79d9.17ba6802ccd.Coremail.linma@zju.edu.cn
State New

Commit Message

Lin Ma Sept. 2, 2021, 12:33 p.m. UTC
Hello there,

There is one bug (CVE-2021-3640: https://www.openwall.com/lists/oss-security/2021/07/22/1) that is similar to the recently fixed CVE-2021-3573.

The key point is that sco_conn_del() can be called while syscalls such as sco_sendmsg() are still in progress.
I think the easiest fix is to make sco_conn_del() wait by taking lock_sock(), as in the patch below.


This makes sure the kfree() waits for the sock held by sco_sendmsg(). However, the patch triggers a WARNING report like the one below. (I don't really know whether this report is correct.)

[   75.147515] ======================================================
[   75.149955] WARNING: possible circular locking dependency detected
[   75.150546] 5.11.11+ #58 Not tainted
[   75.150895] ------------------------------------------------------
[   75.151485] poc.sco/127 is trying to acquire lock:
[   75.151947] ffff888012212120 (sk_lock-AF_BLUETOOTH-BTPROTO_SCO){+.+.}-{0:0}, at: sco_conn_del+0xf6/0x0
[   75.152863]
[   75.152863] but task is already holding lock:
[   75.153420] ffffffff85b43948 (hci_cb_list_lock){+.+.}-{3:3}, at: hci_conn_hash_flush+0xb3/0x1f0
[   75.154256]
[   75.154256] which lock already depends on the new lock.
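
For readers unfamiliar with this kind of report: lockdep records the order in which lock classes are taken and warns when a new acquisition would close a cycle. The following is a minimal userspace sketch of that idea, not lockdep itself. The enum values and function names are hypothetical, and it assumes, as the line "which lock already depends on the new lock" implies, that some earlier path acquired hci_cb_list_lock (directly or through intermediate locks) while holding the socket lock.

```c
#include <stdbool.h>
#include <string.h>

/* Hypothetical lock classes for this model: only the two named in the report. */
enum lock_class { SK_LOCK, HCI_CB_LIST_LOCK, NR_CLASSES };

/* dep[a][b] is true once some path has acquired b while holding a. */
static bool dep[NR_CLASSES][NR_CLASSES];

/* Depth-first search: is 'to' reachable from 'from' along recorded edges? */
static bool reachable(enum lock_class from, enum lock_class to, bool *seen)
{
	if (from == to)
		return true;
	seen[from] = true;
	for (int next = 0; next < NR_CLASSES; next++) {
		if (dep[from][next] && !seen[next] &&
		    reachable((enum lock_class)next, to, seen))
			return true;
	}
	return false;
}

/*
 * Record "acquire 'wanted' while holding 'held'".
 * Returns false (and records nothing) if that edge would close a cycle,
 * i.e. if 'held' is already reachable from 'wanted'.
 */
static bool acquire_while_holding(enum lock_class held, enum lock_class wanted)
{
	bool seen[NR_CLASSES];

	memset(seen, 0, sizeof(seen));
	if (reachable(wanted, held, seen))
		return false;
	dep[held][wanted] = true;
	return true;
}
```

In this model, recording SK_LOCK followed by HCI_CB_LIST_LOCK succeeds. The reverse order, which is what calling lock_sock() inside sco_conn_del() amounts to, is then rejected as circular.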

P.S. The PoC code can be found in the openwall report.

Given the lesson I learnt from my last bad patch, e305509e678b ("Bluetooth: use correct lock to prevent UAF of hdev object"), I don't really expect this to be the final correct patch.

I then tried to use the technique from e04480920d1e ("Bluetooth: defer cleanup of resources in hci_unregister_dev()"). That is, I want to defer the kfree of the sco_conn object. However, the sco connection/disconnection mechanism is somewhat weird and I don't really understand it yet.
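
To illustrate the "defer the free" idea in isolation, here is a minimal single-threaded userspace sketch, not kernel code and not a proposed patch. struct conn_model and its helpers are hypothetical names. In the kernel this role is usually played by struct kref or refcount_t with atomic operations, which the plain int counter here does not provide.

```c
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical stand-in for a connection object; not the kernel's struct sco_conn. */
struct conn_model {
	int refs;     /* number of outstanding users (not atomic in this sketch) */
	bool *freed;  /* observer hook: set to true when the object is released */
};

/* Allocate an object holding one initial reference. */
static struct conn_model *conn_model_new(bool *freed)
{
	struct conn_model *c = malloc(sizeof(*c));

	if (!c)
		return NULL;
	c->refs = 1;
	c->freed = freed;
	*freed = false;
	return c;
}

/* Take an extra reference, e.g. for the duration of a send path. */
static void conn_model_get(struct conn_model *c)
{
	c->refs++;
}

/* Drop one reference; free only when the last one goes. Returns true if freed. */
static bool conn_model_put(struct conn_model *c)
{
	if (--c->refs > 0)
		return false;
	*c->freed = true;
	free(c);
	return true;
}
```

With this shape, a teardown path that calls conn_model_put() while a sender still holds its own reference does not free the object; the actual free is deferred until the sender drops its reference too.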

Let's look at the __sco_sock_close() function, which is called from sco_sock_release().

static void __sco_sock_close(struct sock *sk)
{
	BT_DBG("sk %p state %d socket %p", sk, sk->sk_state, sk->sk_socket);

	switch (sk->sk_state) {
	case BT_LISTEN:
		sco_sock_cleanup_listen(sk);
		break;

	case BT_CONNECTED:
	case BT_CONFIG:
		if (sco_pi(sk)->conn->hcon) {
			sk->sk_state = BT_DISCONN;
			sco_sock_set_timer(sk, SCO_DISCONN_TIMEOUT);
			sco_conn_lock(sco_pi(sk)->conn);
			hci_conn_drop(sco_pi(sk)->conn->hcon);
			sco_pi(sk)->conn->hcon = NULL;
			sco_conn_unlock(sco_pi(sk)->conn);
		} else
			sco_chan_del(sk, ECONNRESET);
		break;

	case BT_CONNECT2:
	case BT_CONNECT:
	case BT_DISCONN:
		sco_chan_del(sk, ECONNRESET);
		break;

	default:
		sock_set_flag(sk, SOCK_ZAPPED);
		break;
	}
}

As you can see, even when a socket is in the BT_CONNECTED state, this function just drops the reference on sco_pi(sk)->conn->hcon and does nothing with the sco_pi(sk)->conn object. How is this conn object released, then? Where should I defer the deallocation to?

I think I need help and discussion to settle on a solution for this. T_T

Best Wishes
Lin Ma

Comments

Tetsuo Handa Sept. 2, 2021, 3:32 p.m. UTC | #1
On 2021/09/02 21:33, LinMa wrote:
> Hello there,
> 
> There is one bug (CVE-2021-3640: https://www.openwall.com/lists/oss-security/2021/07/22/1) that is similar to the recently fixed CVE-2021-3573.
> 
> The key point is that sco_conn_del() can be called while syscalls such as sco_sendmsg() are still in progress.

Since hdev->lock is held when sco_conn_del() is called,

 3 locks held by poc/6686:
  #0: ffff8880158690e0 (&hdev->req_lock){+.+.}-{3:3}, at: hci_dev_do_close+0x44/0x6a0 [bluetooth]
  #1: ffff888015868080 (&hdev->lock){+.+.}-{3:3}, at: hci_dev_do_close+0x1ac/0x6a0 [bluetooth]
  #2: ffffffffa0630030 (hci_cb_list_lock){+.+.}-{3:3}, at: hci_conn_hash_flush+0x6f/0x140 [bluetooth]

I guess that holding hdev->lock when sco_send_frame() is called would avoid use-after-free.

diff --git a/net/bluetooth/sco.c b/net/bluetooth/sco.c
index d9a4e88dacbb..f5339bfba4a5 100644
--- a/net/bluetooth/sco.c
+++ b/net/bluetooth/sco.c
@@ -727,10 +727,17 @@ static int sco_sock_sendmsg(struct socket *sock, struct msghdr *msg,
 
 	lock_sock(sk);
 
-	if (sk->sk_state == BT_CONNECTED)
-		err = sco_send_frame(sk, msg, len);
-	else
-		err = -ENOTCONN;
+	err = -ENOTCONN;
+	if (sk->sk_state == BT_CONNECTED) {
+		struct hci_dev *hdev = hci_get_route(&sco_pi(sk)->dst, &sco_pi(sk)->src, BDADDR_BREDR);
+
+		if (hdev) {
+			hci_dev_lock(hdev);
+			err = sco_send_frame(sk, msg, len);
+			hci_dev_unlock(hdev);
+			hci_dev_put(hdev);
+		}
+	}
 
 	release_sock(sk);
 	return err;

But I'm not happy with calling hci_get_route() every time.
Can we cache the hdev found in sco_connect()?

Luiz Augusto von Dentz Sept. 3, 2021, 3:48 a.m. UTC | #2
Hi Tetsuo,

On Thu, Sep 2, 2021 at 7:44 PM Tetsuo Handa
<penguin-kernel@i-love.sakura.ne.jp> wrote:
>
> Since userfaultfd mechanism allows sleeping with kernel lock held,
> avoiding page fault with kernel lock held where possible will make
> the module more robust. This patch just brings memcpy_from_msg() calls
> to out of sock lock.
>
> This patch is an instant mitigation for CVE-2021-3640. To fully close
> the race window for this use-after-free problem, we need more changes.
>
> Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
> ---
>  net/bluetooth/sco.c | 21 ++++++++++++++-------
>  1 file changed, 14 insertions(+), 7 deletions(-)
>
> diff --git a/net/bluetooth/sco.c b/net/bluetooth/sco.c
> index d9a4e88dacbb..e4b079b31ce9 100644
> --- a/net/bluetooth/sco.c
> +++ b/net/bluetooth/sco.c
> @@ -273,7 +273,7 @@ static int sco_connect(struct sock *sk)
>         return err;
>  }
>
> -static int sco_send_frame(struct sock *sk, struct msghdr *msg, int len)
> +static int sco_send_frame(struct sock *sk, const void *buf, int len, int flags)
>  {
>         struct sco_conn *conn = sco_pi(sk)->conn;
>         struct sk_buff *skb;
> @@ -285,14 +285,11 @@ static int sco_send_frame(struct sock *sk, struct msghdr *msg, int len)
>
>         BT_DBG("sk %p len %d", sk, len);
>
> -       skb = bt_skb_send_alloc(sk, len, msg->msg_flags & MSG_DONTWAIT, &err);
> +       skb = bt_skb_send_alloc(sk, len, flags & MSG_DONTWAIT, &err);
>         if (!skb)
>                 return err;
>
> -       if (memcpy_from_msg(skb_put(skb, len), msg, len)) {
> -               kfree_skb(skb);
> -               return -EFAULT;
> -       }
> +       memcpy(skb_put(skb, len), buf, len);
>
>         hci_send_sco(conn->hcon, skb);
>
> @@ -714,6 +711,7 @@ static int sco_sock_sendmsg(struct socket *sock, struct msghdr *msg,
>                             size_t len)
>  {
>         struct sock *sk = sock->sk;
> +       void *buf;
>         int err;
>
>         BT_DBG("sock %p, sk %p", sock, sk);
> @@ -725,14 +723,23 @@ static int sco_sock_sendmsg(struct socket *sock, struct msghdr *msg,
>         if (msg->msg_flags & MSG_OOB)
>                 return -EOPNOTSUPP;
>
> +       buf = kmalloc(len, GFP_KERNEL | __GFP_NOWARN);
> +       if (!buf)
> +               return -ENOMEM;
> +       if (memcpy_from_msg(buf, msg, len)) {
> +               kfree(buf);
> +               return -EFAULT;
> +       }

There is a set already handling this sort of problem:

https://patchwork.kernel.org/project/bluetooth/patch/20210901002621.414016-3-luiz.dentz@gmail.com/

>         lock_sock(sk);
>
>         if (sk->sk_state == BT_CONNECTED)
> -               err = sco_send_frame(sk, msg, len);
> +               err = sco_send_frame(sk, buf, len, msg->msg_flags);
>         else
>                 err = -ENOTCONN;
>
>         release_sock(sk);
> +       kfree(buf);
>         return err;
>  }
>
> --
> 2.30.2
>
>

-- 
Luiz Augusto von Dentz
Tetsuo Handa Sept. 3, 2021, 4:40 a.m. UTC | #3
On 2021/09/03 12:48, Luiz Augusto von Dentz wrote:
> There is a set already handling this sort of problem:
>
> https://patchwork.kernel.org/project/bluetooth/patch/20210901002621.414016-3-luiz.dentz@gmail.com/

OK, I didn't know that. (I'm not subscribed to the bluetooth ML.)

But can we please keep the fix minimal? Multiple distributors have been
waiting for the fix (which can be backported) for more than a month.

  https://security-tracker.debian.org/tracker/CVE-2021-3640
  https://access.redhat.com/security/cve/cve-2021-3640

And it looks to me that your
"[3/4] Bluetooth: SCO: Replace use of memcpy_from_msg with bt_skb_sendmsg"
contains a new use-after-free or memory corruption bug...   :-(
Tetsuo Handa Sept. 4, 2021, 2:02 a.m. UTC | #4
Commit 99c23da0eed4fd20 ("Bluetooth: sco: Fix lock_sock() blockage by memcpy_from_msg()") in linux-next.git should be sent to linux.git now as a mitigation for CVE-2021-3640.

But I think "[PATCH v3 3/4] Bluetooth: SCO: Replace use of memcpy_from_msg with bt_skb_sendmsg" still contains a bug.

Salvatore Bonaccorso Oct. 11, 2021, 7 a.m. UTC | #5
Hi,

On Sat, Sep 04, 2021 at 11:02:58AM +0900, Tetsuo Handa wrote:
> Commit 99c23da0eed4fd20 ("Bluetooth: sco: Fix lock_sock() blockage
> by memcpy_from_msg()") in linux-next.git should be sent to linux.git
> now as a mitigation for CVE-2021-3640.
>
> But I think "[PATCH v3 3/4] Bluetooth: SCO: Replace use of
> memcpy_from_msg with bt_skb_sendmsg" still contains a bug.

Did this one fall through the cracks? I'm confused about the statement
in https://bugzilla.suse.com/show_bug.cgi?id=1188172#c8 so Cc'ing
Takashi Iwai as well.

Regards,
Salvatore
Takashi Iwai Oct. 11, 2021, 7:13 a.m. UTC | #6
On Mon, 11 Oct 2021 09:00:00 +0200,
Salvatore Bonaccorso wrote:
>
> Hi,
>
> On Sat, Sep 04, 2021 at 11:02:58AM +0900, Tetsuo Handa wrote:
> > Commit 99c23da0eed4fd20 ("Bluetooth: sco: Fix lock_sock() blockage
> > by memcpy_from_msg()") in linux-next.git should be sent to linux.git
> > now as a mitigation for CVE-2021-3640.
> >
> > But I think "[PATCH v3 3/4] Bluetooth: SCO: Replace use of
> > memcpy_from_msg with bt_skb_sendmsg" still contains a bug.
>
> Did this one fall through the cracks? I'm confused about the statement
> in https://bugzilla.suse.com/show_bug.cgi?id=1188172#c8 so Cc'ing
> Takashi Iwai as well.

A quite similar fix is already in the subsystem tree:
commit 99c23da0eed4 ("Bluetooth: sco: Fix lock_sock() blockage by
memcpy_from_msg()").  This particular CVE should be covered by that and
its prerequisite patches.


Takashi

Patch

diff --git a/net/bluetooth/sco.c b/net/bluetooth/sco.c
index d9a4e88dacbb..3da1ad441463 100644
--- a/net/bluetooth/sco.c
+++ b/net/bluetooth/sco.c
@@ -173,10 +173,10 @@  static void sco_conn_del(struct hci_conn *hcon, int err)

        if (sk) {
                sock_hold(sk);
-               bh_lock_sock(sk);
+               lock_sock(sk);
                sco_sock_clear_timer(sk);
                sco_chan_del(sk, err);
-               bh_unlock_sock(sk);
+               release_sock(sk);
                sco_sock_kill(sk);
                sock_put(sk);
        }