diff mbox series

virtio_net: Remove BUG() to aviod machine dead

Message ID a351fbe1-0233-8515-2927-adc826a7fb94@linux.alibaba.com
State New
Headers show
Series virtio_net: Remove BUG() to aviod machine dead | expand

Commit Message

Xianting Tian May 18, 2021, 9:46 a.m. UTC
When met error, we output a print to avoid a BUG().

Signed-off-by: Xianting Tian <xianting.tian@linux.alibaba.com>
---
  drivers/net/virtio_net.c | 5 ++---
  1 file changed, 2 insertions(+), 3 deletions(-)

  		hdr->num_buffers = 0;

Comments

Stefano Garzarella May 20, 2021, 7:35 a.m. UTC | #1
If you need to respin, there is a typo in the title s/aviod/avoid/

On Tue, May 18, 2021 at 05:46:56PM +0800, Xianting Tian wrote:
>When met error, we output a print to avoid a BUG().

>

>Signed-off-by: Xianting Tian <xianting.tian@linux.alibaba.com>

>---

> drivers/net/virtio_net.c | 5 ++---

> 1 file changed, 2 insertions(+), 3 deletions(-)

>

>diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c

>index c921ebf3ae82..a66174d13e81 100644

>--- a/drivers/net/virtio_net.c

>+++ b/drivers/net/virtio_net.c

>@@ -1647,9 +1647,8 @@ static int xmit_skb(struct send_queue *sq, 

>struct sk_buff *skb)

> 		hdr = skb_vnet_hdr(skb);

>

> 	if (virtio_net_hdr_from_skb(skb, &hdr->hdr,

>-				    virtio_is_little_endian(vi->vdev), false,

>-				    0))

>-		BUG();

>+				virtio_is_little_endian(vi->vdev), false, 0))

                                 ^
                                 This change is not related.

>+		return -EPROTO;

>

> 	if (vi->mergeable_rx_bufs)

> 		hdr->num_buffers = 0;

>-- 

>2.17.1

>
Jason Wang May 25, 2021, 6:19 a.m. UTC | #2
在 2021/5/19 下午10:18, Xianting Tian 写道:
> thanks, I submit the patch as commented by Andrew 

> https://lkml.org/lkml/2021/5/18/256

>

> Actually, if xmit_skb() returns error, below code will give a warning 

> with error code.

>

>     /* Try to transmit */

>     err = xmit_skb(sq, skb);

>

>     /* This should not happen! */

>     if (unlikely(err)) {

>         dev->stats.tx_fifo_errors++;

>         if (net_ratelimit())

>             dev_warn(&dev->dev,

>                  "Unexpected TXQ (%d) queue failure: %d\n",

>                  qnum, err);

>         dev->stats.tx_dropped++;

>         dev_kfree_skb_any(skb);

>         return NETDEV_TX_OK;

>     }

>

>

>

>

>

> 在 2021/5/18 下午5:54, Michael S. Tsirkin 写道:

>> typo in subject

>>

>> On Tue, May 18, 2021 at 05:46:56PM +0800, Xianting Tian wrote:

>>> When met error, we output a print to avoid a BUG().



So you don't explain why you need to remove BUG(). I think it deserve a 
BUG().


>>>

>>> Signed-off-by: Xianting Tian <xianting.tian@linux.alibaba.com>

>>> ---

>>>   drivers/net/virtio_net.c | 5 ++---

>>>   1 file changed, 2 insertions(+), 3 deletions(-)

>>>

>>> diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c

>>> index c921ebf3ae82..a66174d13e81 100644

>>> --- a/drivers/net/virtio_net.c

>>> +++ b/drivers/net/virtio_net.c

>>> @@ -1647,9 +1647,8 @@ static int xmit_skb(struct send_queue *sq, struct

>>> sk_buff *skb)

>>>           hdr = skb_vnet_hdr(skb);

>>>

>>>       if (virtio_net_hdr_from_skb(skb, &hdr->hdr,

>>> -                    virtio_is_little_endian(vi->vdev), false,

>>> -                    0))

>>> -        BUG();

>>> +                virtio_is_little_endian(vi->vdev), false, 0))

>>> +        return -EPROTO;

>>>

>>

>> why EPROTO? can you add some comments to explain what is going on pls?

>>

>> is this related to a malicious hypervisor thing?



I think not if I was not wrong.

Each sources (either userspace or device), the skb should be built 
through virtio_net_hdr_to_skb() which means the validation has already 
been done.

If we it fails here, it's a real bug.

Thanks


>>

>> don't we want at least a WARN_ON? Or _ONCE?

>>

>>>       if (vi->mergeable_rx_bufs)

>>>           hdr->num_buffers = 0;

>>> -- 

>>> 2.17.1

>
Leon Romanovsky June 2, 2021, 5:59 a.m. UTC | #3
On Tue, May 25, 2021 at 02:19:03PM +0800, Jason Wang wrote:
> 

> 在 2021/5/19 下午10:18, Xianting Tian 写道:

> > thanks, I submit the patch as commented by Andrew

> > https://lkml.org/lkml/2021/5/18/256

> > 

> > Actually, if xmit_skb() returns error, below code will give a warning

> > with error code.

> > 

> >     /* Try to transmit */

> >     err = xmit_skb(sq, skb);

> > 

> >     /* This should not happen! */

> >     if (unlikely(err)) {

> >         dev->stats.tx_fifo_errors++;

> >         if (net_ratelimit())

> >             dev_warn(&dev->dev,

> >                  "Unexpected TXQ (%d) queue failure: %d\n",

> >                  qnum, err);

> >         dev->stats.tx_dropped++;

> >         dev_kfree_skb_any(skb);

> >         return NETDEV_TX_OK;

> >     }

> > 

> > 

> > 

> > 

> > 

> > 在 2021/5/18 下午5:54, Michael S. Tsirkin 写道:

> > > typo in subject

> > > 

> > > On Tue, May 18, 2021 at 05:46:56PM +0800, Xianting Tian wrote:

> > > > When met error, we output a print to avoid a BUG().

> 

> 

> So you don't explain why you need to remove BUG(). I think it deserve a

> BUG().


BUG() will crash the machine and virtio_net is not kernel core
functionality that must stop the machine to prevent anything truly
harmful and basic.

I would argue that code in drivers/* shouldn't call BUG() macros at all.

If it is impossible, don't check for that or add WARN_ON() and recover,
but don't crash whole system.

Thanks
Jason Wang June 2, 2021, 7:14 a.m. UTC | #4
在 2021/6/2 下午1:59, Leon Romanovsky 写道:
> On Tue, May 25, 2021 at 02:19:03PM +0800, Jason Wang wrote:

>> 在 2021/5/19 下午10:18, Xianting Tian 写道:

>>> thanks, I submit the patch as commented by Andrew

>>> https://lkml.org/lkml/2021/5/18/256

>>>

>>> Actually, if xmit_skb() returns error, below code will give a warning

>>> with error code.

>>>

>>>      /* Try to transmit */

>>>      err = xmit_skb(sq, skb);

>>>

>>>      /* This should not happen! */

>>>      if (unlikely(err)) {

>>>          dev->stats.tx_fifo_errors++;

>>>          if (net_ratelimit())

>>>              dev_warn(&dev->dev,

>>>                   "Unexpected TXQ (%d) queue failure: %d\n",

>>>                   qnum, err);

>>>          dev->stats.tx_dropped++;

>>>          dev_kfree_skb_any(skb);

>>>          return NETDEV_TX_OK;

>>>      }

>>>

>>>

>>>

>>>

>>>

>>> 在 2021/5/18 下午5:54, Michael S. Tsirkin 写道:

>>>> typo in subject

>>>>

>>>> On Tue, May 18, 2021 at 05:46:56PM +0800, Xianting Tian wrote:

>>>>> When met error, we output a print to avoid a BUG().

>>

>> So you don't explain why you need to remove BUG(). I think it deserve a

>> BUG().

> BUG() will crash the machine and virtio_net is not kernel core

> functionality that must stop the machine to prevent anything truly

> harmful and basic.



Note that the BUG() here is not for virtio-net itself. It tells us that 
a bug was found by virtio-net.

That is, the one that produces the skb has a bug, usually it's the 
network core.

There could also be the issue of the packet from untrusted source 
(userspace like TAP or packet socket) but they should be validated there.

Thanks


>

> I would argue that code in drivers/* shouldn't call BUG() macros at all.

>

> If it is impossible, don't check for that or add WARN_ON() and recover,

> but don't crash whole system.

>

> Thanks

>
Leon Romanovsky June 2, 2021, 12:54 p.m. UTC | #5
On Wed, Jun 02, 2021 at 03:14:50PM +0800, Jason Wang wrote:
> 

> 在 2021/6/2 下午1:59, Leon Romanovsky 写道:

> > On Tue, May 25, 2021 at 02:19:03PM +0800, Jason Wang wrote:

> > > 在 2021/5/19 下午10:18, Xianting Tian 写道:

> > > > thanks, I submit the patch as commented by Andrew

> > > > https://lkml.org/lkml/2021/5/18/256

> > > > 

> > > > Actually, if xmit_skb() returns error, below code will give a warning

> > > > with error code.

> > > > 

> > > >      /* Try to transmit */

> > > >      err = xmit_skb(sq, skb);

> > > > 

> > > >      /* This should not happen! */

> > > >      if (unlikely(err)) {

> > > >          dev->stats.tx_fifo_errors++;

> > > >          if (net_ratelimit())

> > > >              dev_warn(&dev->dev,

> > > >                   "Unexpected TXQ (%d) queue failure: %d\n",

> > > >                   qnum, err);

> > > >          dev->stats.tx_dropped++;

> > > >          dev_kfree_skb_any(skb);

> > > >          return NETDEV_TX_OK;

> > > >      }

> > > > 

> > > > 

> > > > 

> > > > 

> > > > 

> > > > 在 2021/5/18 下午5:54, Michael S. Tsirkin 写道:

> > > > > typo in subject

> > > > > 

> > > > > On Tue, May 18, 2021 at 05:46:56PM +0800, Xianting Tian wrote:

> > > > > > When met error, we output a print to avoid a BUG().

> > > 

> > > So you don't explain why you need to remove BUG(). I think it deserve a

> > > BUG().

> > BUG() will crash the machine and virtio_net is not kernel core

> > functionality that must stop the machine to prevent anything truly

> > harmful and basic.

> 

> 

> Note that the BUG() here is not for virtio-net itself. It tells us that a

> bug was found by virtio-net.

> 

> That is, the one that produces the skb has a bug, usually it's the network

> core.

> 

> There could also be the issue of the packet from untrusted source (userspace

> like TAP or packet socket) but they should be validated there.


So it is even worse than I thought. You are saying that in theory untrusted
remote host can crash system. IMHO, It is definitely not the place to put BUG().

I remind you that in-kernel API is build on the promise that data passed
between and calls are safe and already checked. You don't need to set a
protection from the net/core.

Thanks

> 

> Thanks

> 

> 

> > 

> > I would argue that code in drivers/* shouldn't call BUG() macros at all.

> > 

> > If it is impossible, don't check for that or add WARN_ON() and recover,

> > but don't crash whole system.

> > 

> > Thanks

> > 

>
diff mbox series

Patch

diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
index c921ebf3ae82..a66174d13e81 100644
--- a/drivers/net/virtio_net.c
+++ b/drivers/net/virtio_net.c
@@ -1647,9 +1647,8 @@  static int xmit_skb(struct send_queue *sq, struct 
sk_buff *skb)
  		hdr = skb_vnet_hdr(skb);

  	if (virtio_net_hdr_from_skb(skb, &hdr->hdr,
-				    virtio_is_little_endian(vi->vdev), false,
-				    0))
-		BUG();
+				virtio_is_little_endian(vi->vdev), false, 0))
+		return -EPROTO;

  	if (vi->mergeable_rx_bufs)