Message ID | a351fbe1-0233-8515-2927-adc826a7fb94@linux.alibaba.com |
---|---|
State | New |
Headers | show |
Series | virtio_net: Remove BUG() to aviod machine dead | expand |
If you need to respin, there is a typo in the title s/aviod/avoid/ On Tue, May 18, 2021 at 05:46:56PM +0800, Xianting Tian wrote: >When met error, we output a print to avoid a BUG(). > >Signed-off-by: Xianting Tian <xianting.tian@linux.alibaba.com> >--- > drivers/net/virtio_net.c | 5 ++--- > 1 file changed, 2 insertions(+), 3 deletions(-) > >diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c >index c921ebf3ae82..a66174d13e81 100644 >--- a/drivers/net/virtio_net.c >+++ b/drivers/net/virtio_net.c >@@ -1647,9 +1647,8 @@ static int xmit_skb(struct send_queue *sq, >struct sk_buff *skb) > hdr = skb_vnet_hdr(skb); > > if (virtio_net_hdr_from_skb(skb, &hdr->hdr, >- virtio_is_little_endian(vi->vdev), false, >- 0)) >- BUG(); >+ virtio_is_little_endian(vi->vdev), false, 0)) ^ This change is not related. >+ return -EPROTO; > > if (vi->mergeable_rx_bufs) > hdr->num_buffers = 0; >-- >2.17.1 >
在 2021/5/19 下午10:18, Xianting Tian 写道: > thanks, I submit the patch as commented by Andrew > https://lkml.org/lkml/2021/5/18/256 > > Actually, if xmit_skb() returns error, below code will give a warning > with error code. > > /* Try to transmit */ > err = xmit_skb(sq, skb); > > /* This should not happen! */ > if (unlikely(err)) { > dev->stats.tx_fifo_errors++; > if (net_ratelimit()) > dev_warn(&dev->dev, > "Unexpected TXQ (%d) queue failure: %d\n", > qnum, err); > dev->stats.tx_dropped++; > dev_kfree_skb_any(skb); > return NETDEV_TX_OK; > } > > > > > > 在 2021/5/18 下午5:54, Michael S. Tsirkin 写道: >> typo in subject >> >> On Tue, May 18, 2021 at 05:46:56PM +0800, Xianting Tian wrote: >>> When met error, we output a print to avoid a BUG(). So you don't explain why you need to remove BUG(). I think it deserve a BUG(). >>> >>> Signed-off-by: Xianting Tian <xianting.tian@linux.alibaba.com> >>> --- >>> drivers/net/virtio_net.c | 5 ++--- >>> 1 file changed, 2 insertions(+), 3 deletions(-) >>> >>> diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c >>> index c921ebf3ae82..a66174d13e81 100644 >>> --- a/drivers/net/virtio_net.c >>> +++ b/drivers/net/virtio_net.c >>> @@ -1647,9 +1647,8 @@ static int xmit_skb(struct send_queue *sq, struct >>> sk_buff *skb) >>> hdr = skb_vnet_hdr(skb); >>> >>> if (virtio_net_hdr_from_skb(skb, &hdr->hdr, >>> - virtio_is_little_endian(vi->vdev), false, >>> - 0)) >>> - BUG(); >>> + virtio_is_little_endian(vi->vdev), false, 0)) >>> + return -EPROTO; >>> >> >> why EPROTO? can you add some comments to explain what is going on pls? >> >> is this related to a malicious hypervisor thing? I think not if I was not wrong. Each sources (either userspace or device), the skb should be built through virtio_net_hdr_to_skb() which means the validation has already been done. If we it fails here, it's a real bug. Thanks >> >> don't we want at least a WARN_ON? Or _ONCE? >> >>> if (vi->mergeable_rx_bufs) >>> hdr->num_buffers = 0; >>> -- >>> 2.17.1 >
On Tue, May 25, 2021 at 02:19:03PM +0800, Jason Wang wrote: > > 在 2021/5/19 下午10:18, Xianting Tian 写道: > > thanks, I submit the patch as commented by Andrew > > https://lkml.org/lkml/2021/5/18/256 > > > > Actually, if xmit_skb() returns error, below code will give a warning > > with error code. > > > > /* Try to transmit */ > > err = xmit_skb(sq, skb); > > > > /* This should not happen! */ > > if (unlikely(err)) { > > dev->stats.tx_fifo_errors++; > > if (net_ratelimit()) > > dev_warn(&dev->dev, > > "Unexpected TXQ (%d) queue failure: %d\n", > > qnum, err); > > dev->stats.tx_dropped++; > > dev_kfree_skb_any(skb); > > return NETDEV_TX_OK; > > } > > > > > > > > > > > > 在 2021/5/18 下午5:54, Michael S. Tsirkin 写道: > > > typo in subject > > > > > > On Tue, May 18, 2021 at 05:46:56PM +0800, Xianting Tian wrote: > > > > When met error, we output a print to avoid a BUG(). > > > So you don't explain why you need to remove BUG(). I think it deserve a > BUG(). BUG() will crash the machine and virtio_net is not kernel core functionality that must stop the machine to prevent anything truly harmful and basic. I would argue that code in drivers/* shouldn't call BUG() macros at all. If it is impossible, don't check for that or add WARN_ON() and recover, but don't crash whole system. Thanks
在 2021/6/2 下午1:59, Leon Romanovsky 写道: > On Tue, May 25, 2021 at 02:19:03PM +0800, Jason Wang wrote: >> 在 2021/5/19 下午10:18, Xianting Tian 写道: >>> thanks, I submit the patch as commented by Andrew >>> https://lkml.org/lkml/2021/5/18/256 >>> >>> Actually, if xmit_skb() returns error, below code will give a warning >>> with error code. >>> >>> /* Try to transmit */ >>> err = xmit_skb(sq, skb); >>> >>> /* This should not happen! */ >>> if (unlikely(err)) { >>> dev->stats.tx_fifo_errors++; >>> if (net_ratelimit()) >>> dev_warn(&dev->dev, >>> "Unexpected TXQ (%d) queue failure: %d\n", >>> qnum, err); >>> dev->stats.tx_dropped++; >>> dev_kfree_skb_any(skb); >>> return NETDEV_TX_OK; >>> } >>> >>> >>> >>> >>> >>> 在 2021/5/18 下午5:54, Michael S. Tsirkin 写道: >>>> typo in subject >>>> >>>> On Tue, May 18, 2021 at 05:46:56PM +0800, Xianting Tian wrote: >>>>> When met error, we output a print to avoid a BUG(). >> >> So you don't explain why you need to remove BUG(). I think it deserve a >> BUG(). > BUG() will crash the machine and virtio_net is not kernel core > functionality that must stop the machine to prevent anything truly > harmful and basic. Note that the BUG() here is not for virtio-net itself. It tells us that a bug was found by virtio-net. That is, the one that produces the skb has a bug, usually it's the network core. There could also be the issue of the packet from untrusted source (userspace like TAP or packet socket) but they should be validated there. Thanks > > I would argue that code in drivers/* shouldn't call BUG() macros at all. > > If it is impossible, don't check for that or add WARN_ON() and recover, > but don't crash whole system. > > Thanks >
On Wed, Jun 02, 2021 at 03:14:50PM +0800, Jason Wang wrote: > > 在 2021/6/2 下午1:59, Leon Romanovsky 写道: > > On Tue, May 25, 2021 at 02:19:03PM +0800, Jason Wang wrote: > > > 在 2021/5/19 下午10:18, Xianting Tian 写道: > > > > thanks, I submit the patch as commented by Andrew > > > > https://lkml.org/lkml/2021/5/18/256 > > > > > > > > Actually, if xmit_skb() returns error, below code will give a warning > > > > with error code. > > > > > > > > /* Try to transmit */ > > > > err = xmit_skb(sq, skb); > > > > > > > > /* This should not happen! */ > > > > if (unlikely(err)) { > > > > dev->stats.tx_fifo_errors++; > > > > if (net_ratelimit()) > > > > dev_warn(&dev->dev, > > > > "Unexpected TXQ (%d) queue failure: %d\n", > > > > qnum, err); > > > > dev->stats.tx_dropped++; > > > > dev_kfree_skb_any(skb); > > > > return NETDEV_TX_OK; > > > > } > > > > > > > > > > > > > > > > > > > > > > > > 在 2021/5/18 下午5:54, Michael S. Tsirkin 写道: > > > > > typo in subject > > > > > > > > > > On Tue, May 18, 2021 at 05:46:56PM +0800, Xianting Tian wrote: > > > > > > When met error, we output a print to avoid a BUG(). > > > > > > So you don't explain why you need to remove BUG(). I think it deserve a > > > BUG(). > > BUG() will crash the machine and virtio_net is not kernel core > > functionality that must stop the machine to prevent anything truly > > harmful and basic. > > > Note that the BUG() here is not for virtio-net itself. It tells us that a > bug was found by virtio-net. > > That is, the one that produces the skb has a bug, usually it's the network > core. > > There could also be the issue of the packet from untrusted source (userspace > like TAP or packet socket) but they should be validated there. So it is even worse than I thought. You are saying that in theory untrusted remote host can crash system. IMHO, It is definitely not the place to put BUG(). I remind you that in-kernel API is build on the promise that data passed between and calls are safe and already checked. You don't need to set a protection from the net/core. Thanks > > Thanks > > > > > > I would argue that code in drivers/* shouldn't call BUG() macros at all. > > > > If it is impossible, don't check for that or add WARN_ON() and recover, > > but don't crash whole system. > > > > Thanks > > >
diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c index c921ebf3ae82..a66174d13e81 100644 --- a/drivers/net/virtio_net.c +++ b/drivers/net/virtio_net.c @@ -1647,9 +1647,8 @@ static int xmit_skb(struct send_queue *sq, struct sk_buff *skb) hdr = skb_vnet_hdr(skb); if (virtio_net_hdr_from_skb(skb, &hdr->hdr, - virtio_is_little_endian(vi->vdev), false, - 0)) - BUG(); + virtio_is_little_endian(vi->vdev), false, 0)) + return -EPROTO; if (vi->mergeable_rx_bufs)
When met error, we output a print to avoid a BUG(). Signed-off-by: Xianting Tian <xianting.tian@linux.alibaba.com> --- drivers/net/virtio_net.c | 5 ++--- 1 file changed, 2 insertions(+), 3 deletions(-) hdr->num_buffers = 0;