diff mbox series

[net,v3,2/2] skmsg: bugfix for sk_msg sge iteration

Message ID 56d8ec28df901432e7bde4953795166ce2edd472.1719553101.git.tanggeliang@kylinos.cn
State New
Headers show
Series bugfixes for skmsg | expand

Commit Message

Geliang Tang June 28, 2024, 5:47 a.m. UTC
From: Geliang Tang <tanggeliang@kylinos.cn>

Every time run this BPF selftests (./test_sockmap) on a Loongarch platform,
a Kernel panic occurs:

'''
 Oops[#1]:
 CPU: 20 PID: 23245 Comm: test_sockmap Tainted: G     OE 6.10.0-rc2+ #32
 Hardware name: LOONGSON Dabieshan/Loongson-TC542F0, BIOS Loongson-UDK2018
 ... ...
    ra: 90000000043a315c tcp_bpf_sendmsg+0x23c/0x420
   ERA: 900000000426cd1c sk_msg_memcopy_from_iter+0xbc/0x220
  CRMD: 000000b0 (PLV0 -IE -DA +PG DACF=CC DACM=CC -WE)
  PRMD: 0000000c (PPLV0 +PIE +PWE)
  EUEN: 00000007 (+FPE +SXE +ASXE -BTE)
  ECFG: 00071c1d (LIE=0,2-4,10-12 VS=7)
 ESTAT: 00010000 [PIL] (IS= ECode=1 EsubCode=0)
  BADV: 0000000000000040
  PRID: 0014c011 (Loongson-64bit, Loongson-3C5000)
 Modules linked in: tls xt_CHECKSUM xt_MASQUERADE xt_conntrack ipt_REJECT
 Process test_sockmap (pid: 23245, threadinfo=00000000aeb68043, task=...)
 Stack : ... ...
         ...
 Call Trace:
 [<900000000426cd1c>] sk_msg_memcopy_from_iter+0xbc/0x220
 [<90000000043a315c>] tcp_bpf_sendmsg+0x23c/0x420
 [<90000000041cafc8>] __sock_sendmsg+0x68/0xe0
 [<90000000041cc4bc>] ____sys_sendmsg+0x2bc/0x360
 [<90000000041cea18>] ___sys_sendmsg+0xb8/0x120
 [<90000000041cf1f8>] __sys_sendmsg+0x98/0x100
 [<90000000045b76ec>] do_syscall+0x8c/0xc0
 [<90000000030e1da4>] handle_syscall+0xc4/0x160

 Code: ...

 ---[ end trace 0000000000000000 ]---
'''

This crash is because a NULL pointer is passed to page_address() in
sk_msg_memcopy_from_iter(). Due to the difference in architecture,
page_address(0) will not trigger a panic on the X86 platform but will panic
on the Loogarch platform. So this bug was hidden on the x86 platform, but
now it is exposed on the Loogarch platform.

This bug is a logic error indeed. In sk_msg_memcopy_from_iter(), an invalid
"sge" is always used:

	if (msg->sg.copybreak >= sge->length) {
		msg->sg.copybreak = 0;
		sk_msg_iter_var_next(i);
		if (i == msg->sg.end)
			break;
		sge = sk_msg_elem(msg, i);
	}

If the value of i is 2, msg->sg.end is also 2 when entering this if block.
sk_msg_iter_var_next() increases i by 1, and now i is 3, which is no longer
equal to msg->sg.end. The break will not be triggered, and the next sge
obtained by sk_msg_elem(3) will be an invalid one.

The correct approach is to check (i == msg->sg.end) first, and then invoke
sk_msg_iter_var_next() if they are not equal.

Fixes: 604326b41a6f ("bpf, sockmap: convert to generic sk_msg interface")
Signed-off-by: Geliang Tang <tanggeliang@kylinos.cn>
---
 net/core/skmsg.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

Comments

D. Wythe July 1, 2024, 9 a.m. UTC | #1
On 6/28/24 1:47 PM, Geliang Tang wrote:
> From: Geliang Tang <tanggeliang@kylinos.cn>
>
> Every time run this BPF selftests (./test_sockmap) on a Loongarch platform,
> a Kernel panic occurs:
>
> '''
>   Oops[#1]:
>   CPU: 20 PID: 23245 Comm: test_sockmap Tainted: G     OE 6.10.0-rc2+ #32
>   Hardware name: LOONGSON Dabieshan/Loongson-TC542F0, BIOS Loongson-UDK2018
>   ... ...
>      ra: 90000000043a315c tcp_bpf_sendmsg+0x23c/0x420
>     ERA: 900000000426cd1c sk_msg_memcopy_from_iter+0xbc/0x220
>    CRMD: 000000b0 (PLV0 -IE -DA +PG DACF=CC DACM=CC -WE)
>    PRMD: 0000000c (PPLV0 +PIE +PWE)
>    EUEN: 00000007 (+FPE +SXE +ASXE -BTE)
>    ECFG: 00071c1d (LIE=0,2-4,10-12 VS=7)
>   ESTAT: 00010000 [PIL] (IS= ECode=1 EsubCode=0)
>    BADV: 0000000000000040
>    PRID: 0014c011 (Loongson-64bit, Loongson-3C5000)
>   Modules linked in: tls xt_CHECKSUM xt_MASQUERADE xt_conntrack ipt_REJECT
>   Process test_sockmap (pid: 23245, threadinfo=00000000aeb68043, task=...)
>   Stack : ... ...
>           ...
>   Call Trace:
>   [<900000000426cd1c>] sk_msg_memcopy_from_iter+0xbc/0x220
>   [<90000000043a315c>] tcp_bpf_sendmsg+0x23c/0x420
>   [<90000000041cafc8>] __sock_sendmsg+0x68/0xe0
>   [<90000000041cc4bc>] ____sys_sendmsg+0x2bc/0x360
>   [<90000000041cea18>] ___sys_sendmsg+0xb8/0x120
>   [<90000000041cf1f8>] __sys_sendmsg+0x98/0x100
>   [<90000000045b76ec>] do_syscall+0x8c/0xc0
>   [<90000000030e1da4>] handle_syscall+0xc4/0x160
>
>   Code: ...
>
>   ---[ end trace 0000000000000000 ]---
> '''
>
> This crash is because a NULL pointer is passed to page_address() in
> sk_msg_memcopy_from_iter(). Due to the difference in architecture,
> page_address(0) will not trigger a panic on the X86 platform but will panic
> on the Loogarch platform. So this bug was hidden on the x86 platform, but
> now it is exposed on the Loogarch platform.
>
> This bug is a logic error indeed. In sk_msg_memcopy_from_iter(), an invalid
> "sge" is always used:
>
> 	if (msg->sg.copybreak >= sge->length) {
> 		msg->sg.copybreak = 0;
> 		sk_msg_iter_var_next(i);
> 		if (i == msg->sg.end)
> 			break;
> 		sge = sk_msg_elem(msg, i);
> 	}
>
> If the value of i is 2, msg->sg.end is also 2 when entering this if block.
> sk_msg_iter_var_next() increases i by 1, and now i is 3, which is no longer
> equal to msg->sg.end. The break will not be triggered, and the next sge
> obtained by sk_msg_elem(3) will be an invalid one.
>
> The correct approach is to check (i == msg->sg.end) first, and then invoke
> sk_msg_iter_var_next() if they are not equal.
>
> Fixes: 604326b41a6f ("bpf, sockmap: convert to generic sk_msg interface")
> Signed-off-by: Geliang Tang <tanggeliang@kylinos.cn>
> ---
>   net/core/skmsg.c | 2 +-
>   1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/net/core/skmsg.c b/net/core/skmsg.c
> index 44952cdd1425..1906d0d0eeac 100644
> --- a/net/core/skmsg.c
> +++ b/net/core/skmsg.c
> @@ -378,9 +378,9 @@ int sk_msg_memcopy_from_iter(struct sock *sk, struct iov_iter *from,
>   		/* This is possible if a trim operation shrunk the buffer */
>   		if (msg->sg.copybreak >= sge->length) {
>   			msg->sg.copybreak = 0;
> -			sk_msg_iter_var_next(i);
>   			if (i == msg->sg.end)
>   				break;
> +			sk_msg_iter_var_next(i);
Reviewed-by: D. Wythe <alibuda@linux.alibaba.com>
>   			sge = sk_msg_elem(msg, i);
>   		}
>
Geliang Tang July 1, 2024, 10:29 a.m. UTC | #2
Hello,

On Mon, 2024-07-01 at 17:00 +0800, D. Wythe wrote:
> 
> 
> On 6/28/24 1:47 PM, Geliang Tang wrote:
> > From: Geliang Tang <tanggeliang@kylinos.cn>
> > 
> > Every time run this BPF selftests (./test_sockmap) on a Loongarch
> > platform,
> > a Kernel panic occurs:
> > 
> > '''
> >   Oops[#1]:
> >   CPU: 20 PID: 23245 Comm: test_sockmap Tainted: G     OE 6.10.0-
> > rc2+ #32
> >   Hardware name: LOONGSON Dabieshan/Loongson-TC542F0, BIOS
> > Loongson-UDK2018
> >   ... ...
> >      ra: 90000000043a315c tcp_bpf_sendmsg+0x23c/0x420
> >     ERA: 900000000426cd1c sk_msg_memcopy_from_iter+0xbc/0x220
> >    CRMD: 000000b0 (PLV0 -IE -DA +PG DACF=CC DACM=CC -WE)
> >    PRMD: 0000000c (PPLV0 +PIE +PWE)
> >    EUEN: 00000007 (+FPE +SXE +ASXE -BTE)
> >    ECFG: 00071c1d (LIE=0,2-4,10-12 VS=7)
> >   ESTAT: 00010000 [PIL] (IS= ECode=1 EsubCode=0)
> >    BADV: 0000000000000040
> >    PRID: 0014c011 (Loongson-64bit, Loongson-3C5000)
> >   Modules linked in: tls xt_CHECKSUM xt_MASQUERADE xt_conntrack
> > ipt_REJECT
> >   Process test_sockmap (pid: 23245, threadinfo=00000000aeb68043,
> > task=...)
> >   Stack : ... ...
> >           ...
> >   Call Trace:
> >   [<900000000426cd1c>] sk_msg_memcopy_from_iter+0xbc/0x220
> >   [<90000000043a315c>] tcp_bpf_sendmsg+0x23c/0x420
> >   [<90000000041cafc8>] __sock_sendmsg+0x68/0xe0
> >   [<90000000041cc4bc>] ____sys_sendmsg+0x2bc/0x360
> >   [<90000000041cea18>] ___sys_sendmsg+0xb8/0x120
> >   [<90000000041cf1f8>] __sys_sendmsg+0x98/0x100
> >   [<90000000045b76ec>] do_syscall+0x8c/0xc0
> >   [<90000000030e1da4>] handle_syscall+0xc4/0x160
> > 
> >   Code: ...
> > 
> >   ---[ end trace 0000000000000000 ]---
> > '''
> > 
> > This crash is because a NULL pointer is passed to page_address() in
> > sk_msg_memcopy_from_iter(). Due to the difference in architecture,
> > page_address(0) will not trigger a panic on the X86 platform but
> > will panic
> > on the Loogarch platform. So this bug was hidden on the x86
> > platform, but
> > now it is exposed on the Loogarch platform.
> > 
> > This bug is a logic error indeed. In sk_msg_memcopy_from_iter(), an
> > invalid
> > "sge" is always used:
> > 
> > 	if (msg->sg.copybreak >= sge->length) {
> > 		msg->sg.copybreak = 0;
> > 		sk_msg_iter_var_next(i);
> > 		if (i == msg->sg.end)
> > 			break;
> > 		sge = sk_msg_elem(msg, i);
> > 	}
> > 
> > If the value of i is 2, msg->sg.end is also 2 when entering this if
> > block.
> > sk_msg_iter_var_next() increases i by 1, and now i is 3, which is
> > no longer
> > equal to msg->sg.end. The break will not be triggered, and the next
> > sge
> > obtained by sk_msg_elem(3) will be an invalid one.
> > 
> > The correct approach is to check (i == msg->sg.end) first, and then
> > invoke
> > sk_msg_iter_var_next() if they are not equal.
> > 
> > Fixes: 604326b41a6f ("bpf, sockmap: convert to generic sk_msg
> > interface")
> > Signed-off-by: Geliang Tang <tanggeliang@kylinos.cn>
> > ---
> >   net/core/skmsg.c | 2 +-
> >   1 file changed, 1 insertion(+), 1 deletion(-)
> > 
> > diff --git a/net/core/skmsg.c b/net/core/skmsg.c
> > index 44952cdd1425..1906d0d0eeac 100644
> > --- a/net/core/skmsg.c
> > +++ b/net/core/skmsg.c
> > @@ -378,9 +378,9 @@ int sk_msg_memcopy_from_iter(struct sock *sk,
> > struct iov_iter *from,
> >   		/* This is possible if a trim operation shrunk the
> > buffer */
> >   		if (msg->sg.copybreak >= sge->length) {
> >   			msg->sg.copybreak = 0;
> > -			sk_msg_iter_var_next(i);
> >   			if (i == msg->sg.end)
> >   				break;
> > +			sk_msg_iter_var_next(i);
> Reviewed-by: D. Wythe <alibuda@linux.alibaba.com>

Thanks for your review.

But this change breaks test_sockmap. Will send a v4 to fix this.

Changes Requested.

-Geliang

> >   			sge = sk_msg_elem(msg, i);
> >   		}
> >   
> 
> 
>
diff mbox series

Patch

diff --git a/net/core/skmsg.c b/net/core/skmsg.c
index 44952cdd1425..1906d0d0eeac 100644
--- a/net/core/skmsg.c
+++ b/net/core/skmsg.c
@@ -378,9 +378,9 @@  int sk_msg_memcopy_from_iter(struct sock *sk, struct iov_iter *from,
 		/* This is possible if a trim operation shrunk the buffer */
 		if (msg->sg.copybreak >= sge->length) {
 			msg->sg.copybreak = 0;
-			sk_msg_iter_var_next(i);
 			if (i == msg->sg.end)
 				break;
+			sk_msg_iter_var_next(i);
 			sge = sk_msg_elem(msg, i);
 		}