Message ID | 1599826106-19020-1-git-send-email-magnus.karlsson@gmail.com |
---|---|
State | New |
Series | [net-next] i40e: allow VMDQs to be used with AF_XDP zero-copy |
On Fri, Sep 11, 2020 at 2:11 PM Maciej Fijalkowski
<maciej.fijalkowski@intel.com> wrote:
>
> On Fri, Sep 11, 2020 at 02:08:26PM +0200, Magnus Karlsson wrote:
> > From: Magnus Karlsson <magnus.karlsson@intel.com>
> >
> > Allow VMDQs to be used with AF_XDP sockets in zero-copy mode. For some
> > reason, we only allowed main VSIs to be used with zero-copy, but
> > there is now reason to not allow VMDQs also.
>
> You meant 'to allow' I suppose. And what reason? :)

Yes, sorry. Should be "not to allow". I was too trigger happy ;-).

I have gotten requests from users that they want to use VMDQs in
conjunction with containers. Basically small slices of the i40e
portioned out as netdevs. Do you see any problems with using a VMDQ
with zero-copy?

/Magnus

> >
> > Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com>
> > ---
> >  drivers/net/ethernet/intel/i40e/i40e_xsk.c | 2 +-
> >  1 file changed, 1 insertion(+), 1 deletion(-)
> >
> > diff --git a/drivers/net/ethernet/intel/i40e/i40e_xsk.c b/drivers/net/ethernet/intel/i40e/i40e_xsk.c
> > index 2a1153d..ebe15ca 100644
> > --- a/drivers/net/ethernet/intel/i40e/i40e_xsk.c
> > +++ b/drivers/net/ethernet/intel/i40e/i40e_xsk.c
> > @@ -45,7 +45,7 @@ static int i40e_xsk_pool_enable(struct i40e_vsi *vsi,
> >  	bool if_running;
> >  	int err;
> >
> > -	if (vsi->type != I40E_VSI_MAIN)
> > +	if (!(vsi->type == I40E_VSI_MAIN || vsi->type == I40E_VSI_VMDQ2))
> >  		return -EINVAL;
> >
> >  	if (qid >= vsi->num_queue_pairs)
> > --
> > 2.7.4
> >

On Fri, Sep 11, 2020 at 02:29:50PM +0200, Magnus Karlsson wrote:
> On Fri, Sep 11, 2020 at 2:11 PM Maciej Fijalkowski
> <maciej.fijalkowski@intel.com> wrote:
> >
> > On Fri, Sep 11, 2020 at 02:08:26PM +0200, Magnus Karlsson wrote:
> > > From: Magnus Karlsson <magnus.karlsson@intel.com>
> > >
> > > Allow VMDQs to be used with AF_XDP sockets in zero-copy mode. For some
> > > reason, we only allowed main VSIs to be used with zero-copy, but
> > > there is now reason to not allow VMDQs also.
> >
> > You meant 'to allow' I suppose. And what reason? :)
>
> Yes, sorry. Should be "not to allow". I was too trigger happy ;-).
>
> I have gotten requests from users that they want to use VMDQs in
> conjunction with containers. Basically small slices of the i40e
> portioned out as netdevs. Do you see any problems with using a VMDQ
> with zero-copy?

No, I only meant to provide the actual reason (what you wrote above) in
the commit message.

>
> /Magnus
>
> > >
> > > Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com>
> > > ---
> > >  drivers/net/ethernet/intel/i40e/i40e_xsk.c | 2 +-
> > >  1 file changed, 1 insertion(+), 1 deletion(-)
> > >
> > > diff --git a/drivers/net/ethernet/intel/i40e/i40e_xsk.c b/drivers/net/ethernet/intel/i40e/i40e_xsk.c
> > > index 2a1153d..ebe15ca 100644
> > > --- a/drivers/net/ethernet/intel/i40e/i40e_xsk.c
> > > +++ b/drivers/net/ethernet/intel/i40e/i40e_xsk.c
> > > @@ -45,7 +45,7 @@ static int i40e_xsk_pool_enable(struct i40e_vsi *vsi,
> > >  	bool if_running;
> > >  	int err;
> > >
> > > -	if (vsi->type != I40E_VSI_MAIN)
> > > +	if (!(vsi->type == I40E_VSI_MAIN || vsi->type == I40E_VSI_VMDQ2))
> > >  		return -EINVAL;
> > >
> > >  	if (qid >= vsi->num_queue_pairs)
> > > --
> > > 2.7.4
> > >

On Fri, Sep 11, 2020 at 11:05 AM Samudrala, Sridhar
<sridhar.samudrala@intel.com> wrote:
>
>
>
> On 9/11/2020 6:10 AM, Maciej Fijalkowski wrote:
> > On Fri, Sep 11, 2020 at 02:29:50PM +0200, Magnus Karlsson wrote:
> >> On Fri, Sep 11, 2020 at 2:11 PM Maciej Fijalkowski
> >> <maciej.fijalkowski@intel.com> wrote:
> >>>
> >>> On Fri, Sep 11, 2020 at 02:08:26PM +0200, Magnus Karlsson wrote:
> >>>> From: Magnus Karlsson <magnus.karlsson@intel.com>
> >>>>
> >>>> Allow VMDQs to be used with AF_XDP sockets in zero-copy mode. For some
> >>>> reason, we only allowed main VSIs to be used with zero-copy, but
> >>>> there is now reason to not allow VMDQs also.
> >>>
> >>> You meant 'to allow' I suppose. And what reason? :)
> >>
> >> Yes, sorry. Should be "not to allow". I was too trigger happy ;-).
> >>
> >> I have gotten requests from users that they want to use VMDQs in
> >> conjunction with containers. Basically small slices of the i40e
> >> portioned out as netdevs. Do you see any problems with using a VMDQ
> >> with zero-copy?
>
> Today VMDQ VSIs are used when a macvlan interface is created on top of an
> i40e PF with l2-fwd-offload on. But I don't think we can create an
> AF_XDP zerocopy socket on top of a macvlan netdev as it doesn't support
> ndo_bpf or ndo_xdp_xxx apis or expose hw queues directly.
>
> We need to expose VMDQ VSI as a native netdev that can expose its own
> queues and support ndo_ ops in order to enable AF_XDP zerocopy on a
> VMDQ. We talked about this approach at the recent netdev conference to
> expose VMDQ VSI as a subdevice with its own netdev.
>
> https://netdevconf.info/0x14/session.html?talk-hardware-acceleration-of-container-networking-interfaces

I still hold the opinion that macvlan is still the best way to go
about addressing most of these needs. The problem with doing isolation
as separate netdevs is the fact that east/west traffic starts to
essentially swamp the PCIe bus on the device as you have to deal with
broadcast/multicast replication and east/west traffic. Leaving that
replication and east/west traffic up to software to handle while
allowing the unicast traffic to be directed is the best way to go in
my opinion.

The problem with just spawning netdevs is that each vendor can do it
differently and what you get varies in functionality. If anything we
would need to come up with a standardized interface to define what
features can be used and exposed. That was one of the motivations
behind using macvlan. So if anything it seems like it might make more
sense to look at extending the macvlan interface to enable offloading
additional features to the lower level device.

With that said I am not certain VMDq is even the right kind of
interface to use for containers. I would be more interested in
something like what we did in fm10k for macvlan offload where we used
resource tags to identify traffic that belonged to a given interface
and just dedicated that to it rather than queues and interrupts. The
problem with dedicating queues and interrupts is that those are a
limited resource so scaling will become an issue when you get to any
decent count of containers.

- Alex

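Sridhar's objection quoted above (a macvlan netdev does not implement ndo_bpf, so a zero-copy socket cannot be created on top of it) can be illustrated with a small user-space model. This is only a sketch under stated assumptions: the member names ndo_bpf and ndo_xsk_wakeup mirror real net_device_ops members, but the struct, the helper names, and the choice of -EOPNOTSUPP as the error are hypothetical and are not taken from this thread or copied from the kernel.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical, heavily simplified model of a netdev's ops table.
 * Only the member names mirror real net_device_ops members. */
struct model_netdev_ops {
	int (*ndo_bpf)(void);
	int (*ndo_xsk_wakeup)(void);
};

/* Placeholder callback standing in for a real driver hook. */
static int model_noop(void)
{
	return 0;
}

/* Models a device that implements the XDP hooks, such as a PF netdev. */
static const struct model_netdev_ops model_pf_ops = {
	.ndo_bpf = model_noop,
	.ndo_xsk_wakeup = model_noop,
};

/* Models a macvlan-like netdev that provides neither hook. */
static const struct model_netdev_ops model_macvlan_ops = {
	.ndo_bpf = NULL,
	.ndo_xsk_wakeup = NULL,
};

/* Assumption: a zero-copy bind is refused when either hook is missing. */
static int model_zc_bind(const struct model_netdev_ops *ops)
{
	if (!ops->ndo_bpf || !ops->ndo_xsk_wakeup)
		return -EOPNOTSUPP;
	return 0;
}
```

In this model the driver-side VSI type check never comes into play for a macvlan netdev, because the request is refused before it could reach the lower device.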
On Fri, Sep 11, 2020 at 8:42 PM Alexander Duyck
<alexander.duyck@gmail.com> wrote:
>
> On Fri, Sep 11, 2020 at 11:05 AM Samudrala, Sridhar
> <sridhar.samudrala@intel.com> wrote:
> >
> >
> >
> > On 9/11/2020 6:10 AM, Maciej Fijalkowski wrote:
> > > On Fri, Sep 11, 2020 at 02:29:50PM +0200, Magnus Karlsson wrote:
> > >> On Fri, Sep 11, 2020 at 2:11 PM Maciej Fijalkowski
> > >> <maciej.fijalkowski@intel.com> wrote:
> > >>>
> > >>> On Fri, Sep 11, 2020 at 02:08:26PM +0200, Magnus Karlsson wrote:
> > >>>> From: Magnus Karlsson <magnus.karlsson@intel.com>
> > >>>>
> > >>>> Allow VMDQs to be used with AF_XDP sockets in zero-copy mode. For some
> > >>>> reason, we only allowed main VSIs to be used with zero-copy, but
> > >>>> there is now reason to not allow VMDQs also.
> > >>>
> > >>> You meant 'to allow' I suppose. And what reason? :)
> > >>
> > >> Yes, sorry. Should be "not to allow". I was too trigger happy ;-).
> > >>
> > >> I have gotten requests from users that they want to use VMDQs in
> > >> conjunction with containers. Basically small slices of the i40e
> > >> portioned out as netdevs. Do you see any problems with using a VMDQ
> > >> with zero-copy?
> >
> > Today VMDQ VSIs are used when a macvlan interface is created on top of an
> > i40e PF with l2-fwd-offload on. But I don't think we can create an
> > AF_XDP zerocopy socket on top of a macvlan netdev as it doesn't support
> > ndo_bpf or ndo_xdp_xxx apis or expose hw queues directly.
> >
> > We need to expose VMDQ VSI as a native netdev that can expose its own
> > queues and support ndo_ ops in order to enable AF_XDP zerocopy on a
> > VMDQ. We talked about this approach at the recent netdev conference to
> > expose VMDQ VSI as a subdevice with its own netdev.
> >
> > https://netdevconf.info/0x14/session.html?talk-hardware-acceleration-of-container-networking-interfaces
>
> I still hold the opinion that macvlan is still the best way to go
> about addressing most of these needs. The problem with doing isolation
> as separate netdevs is the fact that east/west traffic starts to
> essentially swamp the PCIe bus on the device as you have to deal with
> broadcast/multicast replication and east/west traffic. Leaving that
> replication and east/west traffic up to software to handle while
> allowing the unicast traffic to be directed is the best way to go in
> my opinion.
>
> The problem with just spawning netdevs is that each vendor can do it
> differently and what you get varies in functionality. If anything we
> would need to come up with a standardized interface to define what
> features can be used and exposed. That was one of the motivations
> behind using macvlan. So if anything it seems like it might make more
> sense to look at extending the macvlan interface to enable offloading
> additional features to the lower level device.

Agree with this completely. This patch was not intended to "solve" the
container interface problem. This solution does not scale, is
proprietary, etc, etc. It just uses something, VMDQs, that was put in
the i40e driver a long time ago. Do not know the history behind it, but
I am sure that Alex and Sridhar do. Anyway, what I believe you and
Jakub are saying is that this is just extending something that we all
know is a dead end, or in other words, putting lipstick on a pig ;-).
Please drop the patch.

> With that said I am not certain VMDq is even the right kind of
> interface to use for containers. I would be more interested in
> something like what we did in fm10k for macvlan offload where we used
> resource tags to identify traffic that belonged to a given interface
> and just dedicated that to it rather than queues and interrupts. The
> problem with dedicating queues and interrupts is that those are a
> limited resource so scaling will become an issue when you get to any
> decent count of containers.
>
> - Alex

diff --git a/drivers/net/ethernet/intel/i40e/i40e_xsk.c b/drivers/net/ethernet/intel/i40e/i40e_xsk.c
index 2a1153d..ebe15ca 100644
--- a/drivers/net/ethernet/intel/i40e/i40e_xsk.c
+++ b/drivers/net/ethernet/intel/i40e/i40e_xsk.c
@@ -45,7 +45,7 @@ static int i40e_xsk_pool_enable(struct i40e_vsi *vsi,
 	bool if_running;
 	int err;
 
-	if (vsi->type != I40E_VSI_MAIN)
+	if (!(vsi->type == I40E_VSI_MAIN || vsi->type == I40E_VSI_VMDQ2))
 		return -EINVAL;
 
 	if (qid >= vsi->num_queue_pairs)
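
The effect of the one-line change above can be sketched as a standalone predicate. This is a minimal illustration, not driver code: the enum is a hypothetical stand-in for the driver's VSI type (only the MAIN and VMDQ2 names come from the patch), VSI_OTHER is a placeholder for every other type, and returning -EINVAL for an out-of-range qid is an assumption, since the diff context ends before that branch's body.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-in for the driver's VSI type enum. */
enum vsi_type {
	VSI_MAIN,
	VSI_VMDQ2,
	VSI_OTHER,	/* placeholder for all remaining VSI types */
};

/* True for the VSI types the patched check would accept. */
static bool vsi_type_supports_zc(enum vsi_type type)
{
	return type == VSI_MAIN || type == VSI_VMDQ2;
}

/* Mirrors the shape of the patched checks: VSI type first, then qid. */
static int pool_enable_check(enum vsi_type type, unsigned int qid,
			     unsigned int num_queue_pairs)
{
	if (!vsi_type_supports_zc(type))
		return -EINVAL;

	/* Assumed error code; the diff does not show this branch's body. */
	if (qid >= num_queue_pairs)
		return -EINVAL;

	return 0;
}
```

By De Morgan's law the patched condition is equivalent to `type != MAIN && type != VMDQ2`, so before the patch only a main VSI passed the type check, and after it a VMDQ2 VSI passes as well.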