Message ID | 20210212232214.2869897-2-eric.dumazet@gmail.com |
---|---|
State | New |
Headers | show |
Series | tcp: mem pressure vs SO_RCVLOWAT | expand |
On Fri, Feb 12, 2021 at 3:22 PM Eric Dumazet <eric.dumazet@gmail.com> wrote: > > From: Eric Dumazet <edumazet@google.com> > > While commit 24adbc1676af ("tcp: fix SO_RCVLOWAT hangs with fat skbs") > fixed an issue vs too small sk_rcvbuf for given sk_rcvlowat constraint, > it missed to address issue caused by memory pressure. > > 1) If we are under memory pressure and socket receive queue is empty. > First incoming packet is allowed to be queued, after commit > 76dfa6082032 ("tcp: allow one skb to be received per socket under memory pressure") > > But we do not send EPOLLIN yet, in case tcp_data_ready() sees sk_rcvlowat > is bigger than skb length. > > 2) Then, when next packet comes, it is dropped, and we directly > call sk->sk_data_ready(). > > 3) If application is using poll(), tcp_poll() will then use > tcp_stream_is_readable() and decide the socket receive queue is > not yet filled, so nothing will happen. > > Even when sender retransmits packets, phases 2) & 3) repeat > and flow is effectively frozen, until memory pressure is off. > > Fix is to consider tcp_under_memory_pressure() to take care > of global memory pressure or memcg pressure. > > Fixes: 24adbc1676af ("tcp: fix SO_RCVLOWAT hangs with fat skbs") > Signed-off-by: Eric Dumazet <edumazet@google.com> > Reported-by: Arjun Roy <arjunroy@google.com> > Suggested-by: Wei Wang <weiwan@google.com> > --- Nice description in the commit msg! Reviewed-by: Wei Wang <weiwan@google.com> > include/net/tcp.h | 9 +++++++-- > 1 file changed, 7 insertions(+), 2 deletions(-) > > diff --git a/include/net/tcp.h b/include/net/tcp.h > index 25bbada379c46add16fb7239733bd6571f10f680..244208f6f6c2ace87920b633e469421f557427a6 100644 > --- a/include/net/tcp.h > +++ b/include/net/tcp.h > @@ -1431,8 +1431,13 @@ void tcp_cleanup_rbuf(struct sock *sk, int copied); > */ > static inline bool tcp_rmem_pressure(const struct sock *sk) > { > - int rcvbuf = READ_ONCE(sk->sk_rcvbuf); > - int threshold = rcvbuf - (rcvbuf >> 3); > + int rcvbuf, threshold; > + > + if (tcp_under_memory_pressure(sk)) > + return true; > + > + rcvbuf = READ_ONCE(sk->sk_rcvbuf); > + threshold = rcvbuf - (rcvbuf >> 3); > > return atomic_read(&sk->sk_rmem_alloc) > threshold; > } > -- > 2.30.0.478.g8a0d178c01-goog >
diff --git a/include/net/tcp.h b/include/net/tcp.h index 25bbada379c46add16fb7239733bd6571f10f680..244208f6f6c2ace87920b633e469421f557427a6 100644 --- a/include/net/tcp.h +++ b/include/net/tcp.h @@ -1431,8 +1431,13 @@ void tcp_cleanup_rbuf(struct sock *sk, int copied); */ static inline bool tcp_rmem_pressure(const struct sock *sk) { - int rcvbuf = READ_ONCE(sk->sk_rcvbuf); - int threshold = rcvbuf - (rcvbuf >> 3); + int rcvbuf, threshold; + + if (tcp_under_memory_pressure(sk)) + return true; + + rcvbuf = READ_ONCE(sk->sk_rcvbuf); + threshold = rcvbuf - (rcvbuf >> 3); return atomic_read(&sk->sk_rmem_alloc) > threshold; }