Message ID | 20210106215539.2103688-2-jesse.brandeburg@intel.com |
---|---|
State | New |
Headers | show |
Series | [net-next,v1,1/2] net: core: count drops from GRO | expand |
On 1/6/21 1:55 PM, Jesse Brandeburg wrote: > When drivers call the various receive upcalls to receive an skb > to the stack, sometimes that stack can drop the packet. The good > news is that the return code is given to all the drivers of > NET_RX_DROP or GRO_DROP. The bad news is that no drivers except > the one "ice" driver that I changed, check the stat and increment If the stack is dropping the packet, isn't it up to the stack to track that, perhaps with something that shows up in netstat -s? We don't really want to make the driver responsible for any drops that happen above its head, do we? sln > the dropped count. This is currently leading to packets that > arrive at the edge interface and are fully handled by the driver > and then mysteriously disappear. > > Rather than fix all drivers to increment the drop stat when > handling the return code, emulate the already existing statistic > update for NET_RX_DROP events for the two GRO_DROP locations, and > increment the dev->rx_dropped associated with the skb. > > Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com> > Cc: Eric Dumazet <edumazet@google.com> > Cc: Jamal Hadi Salim <jhs@mojatatu.com> > --- > net/core/dev.c | 2 ++ > 1 file changed, 2 insertions(+) > > diff --git a/net/core/dev.c b/net/core/dev.c > index 8fa739259041..ef34043a9550 100644 > --- a/net/core/dev.c > +++ b/net/core/dev.c > @@ -6071,6 +6071,7 @@ static gro_result_t napi_skb_finish(struct napi_struct *napi, > break; > > case GRO_DROP: > + atomic_long_inc(&skb->dev->rx_dropped); > kfree_skb(skb); > break; > > @@ -6159,6 +6160,7 @@ static gro_result_t napi_frags_finish(struct napi_struct *napi, > break; > > case GRO_DROP: > + atomic_long_inc(&skb->dev->rx_dropped); > napi_reuse_skb(napi, skb); > break; >
On Wed, Jan 6, 2021 at 10:56 PM Jesse Brandeburg <jesse.brandeburg@intel.com> wrote: > > When drivers call the various receive upcalls to receive an skb > to the stack, sometimes that stack can drop the packet. The good > news is that the return code is given to all the drivers of > NET_RX_DROP or GRO_DROP. The bad news is that no drivers except > the one "ice" driver that I changed, check the stat and increment > the dropped count. This is currently leading to packets that > arrive at the edge interface and are fully handled by the driver > and then mysteriously disappear. > > Rather than fix all drivers to increment the drop stat when > handling the return code, emulate the already existing statistic > update for NET_RX_DROP events for the two GRO_DROP locations, and > increment the dev->rx_dropped associated with the skb. > > Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com> > Cc: Eric Dumazet <edumazet@google.com> > Cc: Jamal Hadi Salim <jhs@mojatatu.com> > --- > net/core/dev.c | 2 ++ > 1 file changed, 2 insertions(+) > > diff --git a/net/core/dev.c b/net/core/dev.c > index 8fa739259041..ef34043a9550 100644 > --- a/net/core/dev.c > +++ b/net/core/dev.c > @@ -6071,6 +6071,7 @@ static gro_result_t napi_skb_finish(struct napi_struct *napi, > break; > > case GRO_DROP: > + atomic_long_inc(&skb->dev->rx_dropped); > kfree_skb(skb); > break; > > @@ -6159,6 +6160,7 @@ static gro_result_t napi_frags_finish(struct napi_struct *napi, > break; > > case GRO_DROP: > + atomic_long_inc(&skb->dev->rx_dropped); > napi_reuse_skb(napi, skb); > break; > This is not needed. I think we should clean up ice instead. Drivers are supposed to have allocated the skb (using napi_get_frags()) before calling napi_gro_frags() Only napi_gro_frags() would return GRO_DROP, but we supposedly could crash at that point, since a driver is clearly buggy. We probably can remove GRO_DROP completely, assuming lazy drivers are fixed. diff --git a/net/core/dev.c b/net/core/dev.c index 8fa739259041aaa03585b5a7b8ebce862f4b7d1d..c9460c9597f1de51957fdcfc7a64ca45bce5af7c 100644 --- a/net/core/dev.c +++ b/net/core/dev.c @@ -6223,9 +6223,6 @@ gro_result_t napi_gro_frags(struct napi_struct *napi) gro_result_t ret; struct sk_buff *skb = napi_frags_skb(napi); - if (!skb) - return GRO_DROP; - trace_napi_gro_frags_entry(skb); ret = napi_frags_finish(napi, skb, dev_gro_receive(napi, skb));
Shannon Nelson wrote: > On 1/6/21 1:55 PM, Jesse Brandeburg wrote: > > When drivers call the various receive upcalls to receive an skb > > to the stack, sometimes that stack can drop the packet. The good > > news is that the return code is given to all the drivers of > > NET_RX_DROP or GRO_DROP. The bad news is that no drivers except > > the one "ice" driver that I changed, check the stat and increment > > If the stack is dropping the packet, isn't it up to the stack to track > that, perhaps with something that shows up in netstat -s? We don't > really want to make the driver responsible for any drops that happen > above its head, do we? I totally agree! In patch 2/2 I revert the driver-specific changes I had made in an earlier patch, and this patch *was* my effort to make the stack show the drops. Maybe I wasn't clear. I'm seeing packets disappear during TCP workloads, and this GRO_DROP code was the source of the drops (I see it returning infrequently but regularly) The driver processes the packet but the stack never sees it, and there were no drop counters anywhere tracking it.
Eric Dumazet wrote: > > --- a/net/core/dev.c > > +++ b/net/core/dev.c > > @@ -6071,6 +6071,7 @@ static gro_result_t napi_skb_finish(struct napi_struct *napi, > > break; > > > > case GRO_DROP: > > + atomic_long_inc(&skb->dev->rx_dropped); > > kfree_skb(skb); > > break; > > > > @@ -6159,6 +6160,7 @@ static gro_result_t napi_frags_finish(struct napi_struct *napi, > > break; > > > > case GRO_DROP: > > + atomic_long_inc(&skb->dev->rx_dropped); > > napi_reuse_skb(napi, skb); > > break; > > > > > This is not needed. I think we should clean up ice instead. My patch 2 already did that. I was trying to address the fact that I'm *actually seeing* GRO_DROP return codes coming back from stack. I'll try to reproduce that issue again that I saw. Maybe modern kernels don't have the problem as frequently or at all. > Drivers are supposed to have allocated the skb (using > napi_get_frags()) before calling napi_gro_frags() ice doesn't use napi_get_frags/napi_gro_frags, so I'm not sure how this is relevant. > Only napi_gro_frags() would return GRO_DROP, but we supposedly could > crash at that point, since a driver is clearly buggy. seems unlikely since we don't call those functions. > We probably can remove GRO_DROP completely, assuming lazy drivers are fixed. This might be ok, but doesn't explain why I was seeing this return code (which was the whole reason I was trying to count them), however I may have been running on a distro kernel from redhat/centos 8 when I was seeing these events. I haven't fully completed spelunking all the different sources, but might be able to follow down the rabbit hole further. > diff --git a/net/core/dev.c b/net/core/dev.c > index 8fa739259041aaa03585b5a7b8ebce862f4b7d1d..c9460c9597f1de51957fdcfc7a64ca45bce5af7c > 100644 > --- a/net/core/dev.c > +++ b/net/core/dev.c > @@ -6223,9 +6223,6 @@ gro_result_t napi_gro_frags(struct napi_struct *napi) > gro_result_t ret; > struct sk_buff *skb = napi_frags_skb(napi); > > - if (!skb) > - return GRO_DROP; > - > trace_napi_gro_frags_entry(skb); > > ret = napi_frags_finish(napi, skb, dev_gro_receive(napi, skb)); This change (noted from your other patches is fine), and a likely improvement, thanks for sending those!
On Fri, Jan 8, 2021 at 7:35 PM Jesse Brandeburg <jesse.brandeburg@intel.com> wrote: > > Eric Dumazet wrote: > > > --- a/net/core/dev.c > > > +++ b/net/core/dev.c > > > @@ -6071,6 +6071,7 @@ static gro_result_t napi_skb_finish(struct napi_struct *napi, > > > break; > > > > > > case GRO_DROP: > > > + atomic_long_inc(&skb->dev->rx_dropped); > > > kfree_skb(skb); > > > break; > > > > > > @@ -6159,6 +6160,7 @@ static gro_result_t napi_frags_finish(struct napi_struct *napi, > > > break; > > > > > > case GRO_DROP: > > > + atomic_long_inc(&skb->dev->rx_dropped); > > > napi_reuse_skb(napi, skb); > > > break; > > > > > > > > > This is not needed. I think we should clean up ice instead. > > My patch 2 already did that. I was trying to address the fact that I'm > *actually seeing* GRO_DROP return codes coming back from stack. > > I'll try to reproduce that issue again that I saw. Maybe modern kernels > don't have the problem as frequently or at all. Jesse, you are sending a patch for current kernels. It is pretty clear that the issue you have can not happen with current kernels, by reading the code source, even without an actual ICE piece of hardware to test this :) > > > Drivers are supposed to have allocated the skb (using > > napi_get_frags()) before calling napi_gro_frags() > > ice doesn't use napi_get_frags/napi_gro_frags, so I'm not sure how this > is relevant. > > > Only napi_gro_frags() would return GRO_DROP, but we supposedly could > > crash at that point, since a driver is clearly buggy. > > seems unlikely since we don't call those functions. > > > We probably can remove GRO_DROP completely, assuming lazy drivers are fixed. > > This might be ok, but doesn't explain why I was seeing this return > code (which was the whole reason I was trying to count them), however I > may have been running on a distro kernel from redhat/centos 8 when I > was seeing these events. I haven't fully completed spelunking all the > different sources, but might be able to follow down the rabbit hole > further. Yes please :) > > > > diff --git a/net/core/dev.c b/net/core/dev.c > > index 8fa739259041aaa03585b5a7b8ebce862f4b7d1d..c9460c9597f1de51957fdcfc7a64ca45bce5af7c > > 100644 > > --- a/net/core/dev.c > > +++ b/net/core/dev.c > > @@ -6223,9 +6223,6 @@ gro_result_t napi_gro_frags(struct napi_struct *napi) > > gro_result_t ret; > > struct sk_buff *skb = napi_frags_skb(napi); > > > > - if (!skb) > > - return GRO_DROP; > > - > > trace_napi_gro_frags_entry(skb); > > > > ret = napi_frags_finish(napi, skb, dev_gro_receive(napi, skb)); > > This change (noted from your other patches is fine), and a likely > improvement, thanks for sending those! Sure !
On 1/8/21 10:26 AM, Jesse Brandeburg wrote: > Shannon Nelson wrote: > >> On 1/6/21 1:55 PM, Jesse Brandeburg wrote: >>> When drivers call the various receive upcalls to receive an skb >>> to the stack, sometimes that stack can drop the packet. The good >>> news is that the return code is given to all the drivers of >>> NET_RX_DROP or GRO_DROP. The bad news is that no drivers except >>> the one "ice" driver that I changed, check the stat and increment >> If the stack is dropping the packet, isn't it up to the stack to track >> that, perhaps with something that shows up in netstat -s? We don't >> really want to make the driver responsible for any drops that happen >> above its head, do we? > I totally agree! > > In patch 2/2 I revert the driver-specific changes I had made in an > earlier patch, and this patch *was* my effort to make the stack show the > drops. > > Maybe I wasn't clear. I'm seeing packets disappear during TCP > workloads, and this GRO_DROP code was the source of the drops (I see it > returning infrequently but regularly) > > The driver processes the packet but the stack never sees it, and there > were no drop counters anywhere tracking it. > My point is that the patch increments a netdev counter, which to my mind immediately implicates the driver and hardware, rather than the stack. As a driver maintainer, I don't want to be chasing driver packet drop reports that are a stack problem. I'd rather see a new counter in netstat -s that reflects the stack decision and can better imply what went wrong. I don't have a good suggestion for a counter name at the moment. I guess part of the issue is that this is right on the boundary of driver-stack. But if we follow Eric's suggestions, maybe the problem magically goes away :-) . sln
On Fri, 2021-01-08 at 11:21 -0800, Shannon Nelson wrote: > On 1/8/21 10:26 AM, Jesse Brandeburg wrote: > > Shannon Nelson wrote: > > > > > On 1/6/21 1:55 PM, Jesse Brandeburg wrote: > > > > When drivers call the various receive upcalls to receive an skb > > > > to the stack, sometimes that stack can drop the packet. The > > > > good > > > > news is that the return code is given to all the drivers of > > > > NET_RX_DROP or GRO_DROP. The bad news is that no drivers except > > > > the one "ice" driver that I changed, check the stat and > > > > increment > > > If the stack is dropping the packet, isn't it up to the stack to > > > track > > > that, perhaps with something that shows up in netstat -s? We > > > don't > > > really want to make the driver responsible for any drops that > > > happen > > > above its head, do we? > > I totally agree! > > > > In patch 2/2 I revert the driver-specific changes I had made in an > > earlier patch, and this patch *was* my effort to make the stack > > show the > > drops. > > > > Maybe I wasn't clear. I'm seeing packets disappear during TCP > > workloads, and this GRO_DROP code was the source of the drops (I > > see it > > returning infrequently but regularly) > > > > The driver processes the packet but the stack never sees it, and > > there > > were no drop counters anywhere tracking it. > > > > My point is that the patch increments a netdev counter, which to my > mind > immediately implicates the driver and hardware, rather than the > stack. > As a driver maintainer, I don't want to be chasing driver packet > drop > reports that are a stack problem. I'd rather see a new counter in > netstat -s that reflects the stack decision and can better imply > what > went wrong. I don't have a good suggestion for a counter name at > the > moment. > > I guess part of the issue is that this is right on the boundary of > driver-stack. But if we follow Eric's suggestions, maybe the > problem > magically goes away :-) . > > sln > I think there is still some merit in this patchset even with Eric's removal of GRO_DROP from gro_receive(). As Eric explained, it is still possible to silently drop for the same reason when drivers call napi_get_frags or even alloc_skb() apis, many drivers do not account for such packet drops, and maybe it is the right thing to do to inline the packet drop accounting into the skb alloc APIs ? the question is, is it the job of those APIs to update netdev->stats ?
On Fri, Jan 8, 2021 at 9:27 PM Saeed Mahameed <saeed@kernel.org> wrote: > > I think there is still some merit in this patchset even with Eric's > removal of GRO_DROP from gro_receive(). As Eric explained, it is still > possible to silently drop for the same reason when drivers > call napi_get_frags or even alloc_skb() apis, many drivers do not > account for such packet drops, and maybe it is the right thing to do to > inline the packet drop accounting into the skb alloc APIs ? the > question is, is it the job of those APIs to update netdev->stats ? > You absolutely do not want to have a generic increment of netdev->stats for multiqueue drivers. This would add terrible cache line false sharing under DDOS and memory stress. Each driver maintains (or should maintain) per rx queue counter for this case. It seems mlx4 does nothing special, I would suggest you fix it :)
On 2021-01-08 2:21 p.m., Shannon Nelson wrote: > On 1/8/21 10:26 AM, Jesse Brandeburg wrote: >> Shannon Nelson wrote: >> >>> On 1/6/21 1:55 PM, Jesse Brandeburg wrote: >>>> When drivers call the various receive upcalls to receive an skb >>>> to the stack, sometimes that stack can drop the packet. The good >>>> news is that the return code is given to all the drivers of >>>> NET_RX_DROP or GRO_DROP. The bad news is that no drivers except >>>> the one "ice" driver that I changed, check the stat and increment >>> If the stack is dropping the packet, isn't it up to the stack to track >>> that, perhaps with something that shows up in netstat -s? We don't >>> really want to make the driver responsible for any drops that happen >>> above its head, do we? >> I totally agree! >> >> In patch 2/2 I revert the driver-specific changes I had made in an >> earlier patch, and this patch *was* my effort to make the stack show the >> drops. >> >> Maybe I wasn't clear. I'm seeing packets disappear during TCP >> workloads, and this GRO_DROP code was the source of the drops (I see it >> returning infrequently but regularly) >> >> The driver processes the packet but the stack never sees it, and there >> were no drop counters anywhere tracking it. >> > > My point is that the patch increments a netdev counter, which to my mind > immediately implicates the driver and hardware, rather than the stack. > As a driver maintainer, I don't want to be chasing driver packet drop > reports that are a stack problem. I'd rather see a new counter in > netstat -s that reflects the stack decision and can better imply what > went wrong. I don't have a good suggestion for a counter name at the > moment. > > I guess part of the issue is that this is right on the boundary of > driver-stack. But if we follow Eric's suggestions, maybe the problem > magically goes away :-) . > So: How does one know that the stack-upcall dropped a packet because of GRO issues? Debugging with kprobe or traces doesnt count as an answer. cheers, jamal
diff --git a/net/core/dev.c b/net/core/dev.c index 8fa739259041..ef34043a9550 100644 --- a/net/core/dev.c +++ b/net/core/dev.c @@ -6071,6 +6071,7 @@ static gro_result_t napi_skb_finish(struct napi_struct *napi, break; case GRO_DROP: + atomic_long_inc(&skb->dev->rx_dropped); kfree_skb(skb); break; @@ -6159,6 +6160,7 @@ static gro_result_t napi_frags_finish(struct napi_struct *napi, break; case GRO_DROP: + atomic_long_inc(&skb->dev->rx_dropped); napi_reuse_skb(napi, skb); break;
When drivers call the various receive upcalls to receive an skb to the stack, sometimes that stack can drop the packet. The good news is that the return code is given to all the drivers of NET_RX_DROP or GRO_DROP. The bad news is that no drivers except the one "ice" driver that I changed, check the stat and increment the dropped count. This is currently leading to packets that arrive at the edge interface and are fully handled by the driver and then mysteriously disappear. Rather than fix all drivers to increment the drop stat when handling the return code, emulate the already existing statistic update for NET_RX_DROP events for the two GRO_DROP locations, and increment the dev->rx_dropped associated with the skb. Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com> Cc: Eric Dumazet <edumazet@google.com> Cc: Jamal Hadi Salim <jhs@mojatatu.com> --- net/core/dev.c | 2 ++ 1 file changed, 2 insertions(+)