Message ID | 161313782625.1008639.6000589679659428869.stgit@firesoul |
---|---|
State | New |
Headers | show |
Series | [net-next,V1] net: followup adjust net_device layout for cacheline usage | expand |
On 2/12/21 2:50 PM, Jesper Dangaard Brouer wrote: > As Eric pointed out in response to commit 28af22c6c8df ("net: adjust > net_device layout for cacheline usage") the netdev_features_t members > wanted_features and hw_features are only used in control path. > > Thus, this patch reorder the netdev_features_t to let more members that > are used in fast path into the 3rd cacheline. Whether these members are > read depend on SKB properties, which are hinted as comments. The member > mpls_features could not fit in the cacheline, but it was the least > commonly used (depend on CONFIG_NET_MPLS_GSO). > > In the future we should consider relocating member gso_partial_features > to be closer to member gso_max_segs. (see usage in gso_features_check()). > > Suggested-by: Eric Dumazet <edumazet@google.com> > Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com> > --- > include/linux/netdevice.h | 11 +++++++---- > 1 file changed, 7 insertions(+), 4 deletions(-) > > diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h > index bfadf3b82f9c..3898bb167579 100644 > --- a/include/linux/netdevice.h > +++ b/include/linux/netdevice.h > @@ -1890,13 +1890,16 @@ struct net_device { > unsigned short needed_headroom; > unsigned short needed_tailroom; > > + /* Fast path features - via netif_skb_features */ > netdev_features_t features; > + netdev_features_t vlan_features; /* if skb_vlan_tagged */ > + netdev_features_t hw_enc_features; /* if skb->encapsulation */ > + netdev_features_t gso_partial_features;/* if skb_is_gso */ > + netdev_features_t mpls_features; /* if eth_p_mpls+NET_MPLS_GSO */ > + > + /* Control path features */ > netdev_features_t hw_features; > netdev_features_t wanted_features; > - netdev_features_t vlan_features; > - netdev_features_t hw_enc_features; > - netdev_features_t mpls_features; > - netdev_features_t gso_partial_features; > > unsigned int min_mtu; > unsigned int max_mtu; > > Please also note we currently have at least 3 distinct blocks for tx path. Presumably netdev_features_t are only used in TX, so should be grouped with the other TX sections. /* --- cacheline 3 boundary (192 bytes) --- */ ... netdev_features_t features; /* 0xe0 0x8 */ ... Lots of ctrl stuff.... /* --- cacheline 14 boundary (896 bytes) --- */ struct netdev_queue * _tx __attribute__((__aligned__(64))); /* 0x380 0x8 */ .... /* Mix of unrelated control stuff like rtnl_link_ops /* --- cacheline 31 boundary (1984 bytes) --- */ unsigned int gso_max_size; /* 0x7c0 0x4 */ u16 gso_max_segs; /* 0x7c4 0x2 */ Ideally we should move _all_ control/slow_path stuff at the very end of the structure, in order to not pollute the cache lines we need for data path, to keep them as small and packed as possible. This could be done one field at a time, to ease code review. We should have something like this /* section used in RX (fast) path */ /* section used in both RX/TX (fast) path */ /* section used in TX (fast) path */ /* section used for slow path, and control path */
On Fri, 12 Feb 2021 18:03:44 +0100 Eric Dumazet <eric.dumazet@gmail.com> wrote: > On 2/12/21 2:50 PM, Jesper Dangaard Brouer wrote: > > As Eric pointed out in response to commit 28af22c6c8df ("net: adjust > > net_device layout for cacheline usage") the netdev_features_t members > > wanted_features and hw_features are only used in control path. > > > > Thus, this patch reorder the netdev_features_t to let more members that > > are used in fast path into the 3rd cacheline. Whether these members are > > read depend on SKB properties, which are hinted as comments. The member > > mpls_features could not fit in the cacheline, but it was the least > > commonly used (depend on CONFIG_NET_MPLS_GSO). > > > > In the future we should consider relocating member gso_partial_features > > to be closer to member gso_max_segs. (see usage in gso_features_check()). > > > > Suggested-by: Eric Dumazet <edumazet@google.com> > > Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com> > > --- > > include/linux/netdevice.h | 11 +++++++---- > > 1 file changed, 7 insertions(+), 4 deletions(-) > > > > diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h > > index bfadf3b82f9c..3898bb167579 100644 > > --- a/include/linux/netdevice.h > > +++ b/include/linux/netdevice.h > > @@ -1890,13 +1890,16 @@ struct net_device { > > unsigned short needed_headroom; > > unsigned short needed_tailroom; > > > > + /* Fast path features - via netif_skb_features */ > > netdev_features_t features; > > + netdev_features_t vlan_features; /* if skb_vlan_tagged */ > > + netdev_features_t hw_enc_features; /* if skb->encapsulation */ > > + netdev_features_t gso_partial_features;/* if skb_is_gso */ > > + netdev_features_t mpls_features; /* if eth_p_mpls+NET_MPLS_GSO */ > > + > > + /* Control path features */ > > netdev_features_t hw_features; > > netdev_features_t wanted_features; > > - netdev_features_t vlan_features; > > - netdev_features_t hw_enc_features; > > - netdev_features_t mpls_features; > > - netdev_features_t gso_partial_features; > > > > unsigned int min_mtu; > > unsigned int max_mtu; > > > > > > > Please also note we currently have at least 3 distinct blocks for tx path. > > Presumably netdev_features_t are only used in TX, so should be grouped with the other TX > sections. > > > /* --- cacheline 3 boundary (192 bytes) --- */ > ... > netdev_features_t features; /* 0xe0 0x8 */ > > ... Lots of ctrl stuff.... > > > /* --- cacheline 14 boundary (896 bytes) --- */ > struct netdev_queue * _tx __attribute__((__aligned__(64))); /* 0x380 0x8 */ > > > .... > > /* Mix of unrelated control stuff like rtnl_link_ops > > /* --- cacheline 31 boundary (1984 bytes) --- */ > unsigned int gso_max_size; /* 0x7c0 0x4 */ > u16 gso_max_segs; /* 0x7c4 0x2 */ > > > > Ideally we should move _all_ control/slow_path stuff at the very end of the structure, > in order to not pollute the cache lines we need for data path, to keep them as small > and packed as possible. > > This could be done one field at a time, to ease code review. > > We should have something like this > > /* section used in RX (fast) path */ > /* section used in both RX/TX (fast) path */ > /* section used in TX (fast) path */ > /* section used for slow path, and control path */ I fully agree with above, but this is the long term plan, that I have added to my TODO list. This patch is a followup to commit 28af22c6c8df ("net: adjust net_device layout for cacheline usage") for fixing that the feature members got partitioned into two cache-lines. -- Best regards, Jesper Dangaard Brouer MSc.CS, Principal Kernel Engineer at Red Hat LinkedIn: http://www.linkedin.com/in/brouer
diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h index bfadf3b82f9c..3898bb167579 100644 --- a/include/linux/netdevice.h +++ b/include/linux/netdevice.h @@ -1890,13 +1890,16 @@ struct net_device { unsigned short needed_headroom; unsigned short needed_tailroom; + /* Fast path features - via netif_skb_features */ netdev_features_t features; + netdev_features_t vlan_features; /* if skb_vlan_tagged */ + netdev_features_t hw_enc_features; /* if skb->encapsulation */ + netdev_features_t gso_partial_features;/* if skb_is_gso */ + netdev_features_t mpls_features; /* if eth_p_mpls+NET_MPLS_GSO */ + + /* Control path features */ netdev_features_t hw_features; netdev_features_t wanted_features; - netdev_features_t vlan_features; - netdev_features_t hw_enc_features; - netdev_features_t mpls_features; - netdev_features_t gso_partial_features; unsigned int min_mtu; unsigned int max_mtu;
As Eric pointed out in response to commit 28af22c6c8df ("net: adjust net_device layout for cacheline usage") the netdev_features_t members wanted_features and hw_features are only used in control path. Thus, this patch reorder the netdev_features_t to let more members that are used in fast path into the 3rd cacheline. Whether these members are read depend on SKB properties, which are hinted as comments. The member mpls_features could not fit in the cacheline, but it was the least commonly used (depend on CONFIG_NET_MPLS_GSO). In the future we should consider relocating member gso_partial_features to be closer to member gso_max_segs. (see usage in gso_features_check()). Suggested-by: Eric Dumazet <edumazet@google.com> Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com> --- include/linux/netdevice.h | 11 +++++++---- 1 file changed, 7 insertions(+), 4 deletions(-)