[net-next,v3,0/8] Perf. optimizations for TCP Recv. Zerocopy


Message ID

20201202225349.935284-1-arjunroy.kdev@gmail.com

Headers

From: Arjun Roy <arjunroy.kdev@gmail.com>
To: davem@davemloft.net, netdev@vger.kernel.org
Cc: arjunroy@google.com, edumazet@google.com, soheil@google.com
Subject: [net-next v3 0/8] Perf. optimizations for TCP Recv. Zerocopy
Date: Wed,  2 Dec 2020 14:53:41 -0800
Message-Id: <20201202225349.935284-1-arjunroy.kdev@gmail.com>
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
Precedence: bulk

Series

Perf. optimizations for TCP Recv. Zerocopy

Message

Arjun Roy Dec. 2, 2020, 10:53 p.m. UTC

From: Arjun Roy <arjunroy@google.com>

This patchset contains several optimizations for TCP Recv. Zerocopy.

v3: Fixes 32-bit compilation, stylistic issues and re-adds signoffs.

Summarized:
1. It is possible that a read payload is not exactly page aligned -
that there may exist "straggler" bytes that we cannot map into the
caller's address space cleanly. For this, we allow the caller to
provide as argument a "hybrid copy buffer", turning
getsockopt(TCP_ZEROCOPY_RECEIVE) into a "hybrid" operation that allows
the caller to avoid a subsequent recvmsg() call to read the
stragglers.

2. Similarly, for "small" read payloads that are either below the size
of a page, or small enough that remapping pages is not a performance
win - we allow the user to short-circuit the remapping operations
entirely and simply copy into the buffer provided.

Some of the patches in the middle of this set are refactors to support
this "short-circuiting" optimization.

3. We allow the user to provide a hint that performing a page zap
operation (and the accompanying TLB shootdown) may not be necessary,
for the provided region that the kernel will attempt to map pages
into. This allows us to avoid this expensive operation while holding
the socket lock, which provides a significant performance advantage.
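
As a rough illustration of points 1 and 2 above (not code from the patches),
the split between remapped pages and copied bytes can be sketched as plain
arithmetic. The names zc_plan and zc_plan_read are hypothetical. The sketch
assumes the payload starts on a page boundary and that a read is
short-circuited whenever the whole payload fits in the caller's copy buffer;
the actual kernel policy may differ.

```c
#include <stddef.h>

/* Hypothetical illustration type; not a kernel or uapi definition. */
struct zc_plan {
    size_t mapped_bytes;  /* whole pages that could be remapped */
    size_t copied_bytes;  /* bytes delivered through the copy buffer */
};

/*
 * Decide how a pending payload of `inq` bytes might be delivered.
 * Assumptions (for illustration only):
 *   - the payload starts page aligned, so only the tail can be a straggler;
 *   - a payload that fits entirely in the copy buffer is copied outright.
 */
static struct zc_plan zc_plan_read(size_t inq, size_t page_size,
                                   size_t copybuf_len)
{
    struct zc_plan plan = {0, 0};
    size_t stragglers;

    if (page_size == 0 || inq == 0)
        return plan;

    /* Short-circuit: skip remapping entirely and just copy. */
    if (inq <= copybuf_len) {
        plan.copied_bytes = inq;
        return plan;
    }

    /* Hybrid case: remap whole pages, copy the unaligned tail. */
    stragglers = inq % page_size;
    plan.mapped_bytes = inq - stragglers;
    plan.copied_bytes = stragglers < copybuf_len ? stragglers : copybuf_len;
    return plan;
}
```

For example, with 4096-byte pages and a 1024-byte copy buffer, a 32868-byte
payload would be planned as 32768 mapped bytes plus 100 copied bytes, while a
500-byte payload would be copied in full with nothing mapped. Straggler bytes
that do not fit in the copy buffer would still need a later recvmsg() call.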

With all of these changes combined, "medium" sized receive traffic
(multiple tens to a few hundred KB) sees significant efficiency gains
when using TCP receive zerocopy instead of regular recvmsg(). For
example, with RPC-style traffic with 32KB messages, there is a roughly
15% efficiency improvement when using zerocopy. Without these changes,
there is a roughly 60-70% efficiency reduction with such messages when
employing zerocopy.

Arjun Roy (8):
  net-zerocopy: Copy straggler unaligned data for TCP Rx. zerocopy.
  net-tcp: Introduce tcp_recvmsg_locked().
  net-zerocopy: Refactor skb frag fast-forward op.
  net-zerocopy: Refactor frag-is-remappable test.
  net-zerocopy: Fast return if inq < PAGE_SIZE
  net-zerocopy: Introduce short-circuit small reads.
  net-zerocopy: Set zerocopy hint when data is copied
  net-zerocopy: Defer vm zap unless actually needed.

 include/uapi/linux/tcp.h |   4 +
 net/ipv4/tcp.c           | 446 +++++++++++++++++++++++++++++----------
 2 files changed, 343 insertions(+), 107 deletions(-)

Comments

Jakub Kicinski Dec. 4, 2020, 10:38 p.m. UTC | #1

On Wed,  2 Dec 2020 14:53:41 -0800 Arjun Roy wrote:
> Summarized:
> 1. It is possible that a read payload is not exactly page aligned -
> that there may exist "straggler" bytes that we cannot map into the
> caller's address space cleanly. For this, we allow the caller to
> provide as argument a "hybrid copy buffer", turning
> getsockopt(TCP_ZEROCOPY_RECEIVE) into a "hybrid" operation that allows
> the caller to avoid a subsequent recvmsg() call to read the
> stragglers.
>
> 2. Similarly, for "small" read payloads that are either below the size
> of a page, or small enough that remapping pages is not a performance
> win - we allow the user to short-circuit the remapping operations
> entirely and simply copy into the buffer provided.
>
> Some of the patches in the middle of this set are refactors to support
> this "short-circuiting" optimization.
>
> 3. We allow the user to provide a hint that performing a page zap
> operation (and the accompanying TLB shootdown) may not be necessary,
> for the provided region that the kernel will attempt to map pages
> into. This allows us to avoid this expensive operation while holding
> the socket lock, which provides a significant performance advantage.
>
> With all of these changes combined, "medium" sized receive traffic
> (multiple tens to few hundreds of KB) see significant efficiency gains
> when using TCP receive zerocopy instead of regular recvmsg(). For
> example, with RPC-style traffic with 32KB messages, there is a roughly
> 15% efficiency improvement when using zerocopy. Without these changes,
> there is a roughly 60-70% efficiency reduction with such messages when
> employing zerocopy.

Applied, thank you!