
[net,0/4] bnxt: Tx NAPI disabling resiliency improvements

Message ID 20210811193239.3155396-1-kuba@kernel.org

Message

Jakub Kicinski Aug. 11, 2021, 7:32 p.m. UTC
A lockdep warning was triggered by netpoll because napi poll
was taking the xmit lock. Fix that and a couple more issues
noticed while reading the code.
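
[Editorial note: a simplified sketch of the unsafe nesting described above.
netpoll may invoke the driver's napi poll on a CPU that already holds the
Tx queue lock; the chain below is illustrative, not a trace from the
actual lockdep report:

    CPU0: __netif_tx_lock(txq, cpu)           /* xmit path holds the lock */
    CPU0: netpoll polls napi directly
            -> bnxt_poll() -> bnxt_tx_int()
                 -> __netif_tx_lock(txq, cpu)  /* same lock, same CPU:
                                                  recursion; lockdep warns */
]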

Jakub Kicinski (4):
  bnxt: don't lock the tx queue from napi poll
  bnxt: disable napi before cancelling DIM
  bnxt: make sure xmit_more + errors does not miss doorbells
  bnxt: count Tx drops

 drivers/net/ethernet/broadcom/bnxt/bnxt.c | 59 ++++++++++++++---------
 drivers/net/ethernet/broadcom/bnxt/bnxt.h |  1 +
 2 files changed, 36 insertions(+), 24 deletions(-)
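
[Editorial note on patch 3's subject: with netdev_xmit_more() a driver
defers the Tx doorbell in the hope that a later skb will ring it; if that
later skb instead hits an error path and is dropped, the deferred doorbell
still has to be written or the already-queued packets are never sent. A
generic sketch of the pattern, with hypothetical sketch_* helpers, not the
actual bnxt code:

    static netdev_tx_t sketch_xmit(struct sk_buff *skb, struct net_device *dev)
    {
            struct sketch_ring *txr = sketch_tx_ring(dev, skb);

            if (sketch_map_and_queue(txr, skb)) {   /* e.g. DMA mapping error */
                    dev_kfree_skb_any(skb);
                    /* Earlier xmit_more skbs deferred their doorbell to
                     * this call; ring it before bailing out.
                     */
                    sketch_ring_doorbell(txr);
                    return NETDEV_TX_OK;
            }

            if (!netdev_xmit_more() || netif_xmit_stopped(txr->txq))
                    sketch_ring_doorbell(txr);      /* flush the batch */

            return NETDEV_TX_OK;
    }
]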

Comments

Edwin Peer Aug. 11, 2021, 8:13 p.m. UTC | #1
On Wed, Aug 11, 2021 at 12:32 PM Jakub Kicinski <kuba@kernel.org> wrote:
>
> We can't take the tx lock from the napi poll routine, because
> netpoll can poll napi at any moment, including with the tx lock
> already held.
>
> It seems that the tx lock is only protecting against the disable
> path, appropriate barriers are already in place to make sure
> cleanup can safely run concurrently with start_xmit. I don't see
> any other reason why 'stopped && avail > thresh' needs to be
> re-checked under the lock.
>
> Remove the tx lock and use synchronize_net() to make sure
> closing the device does not race with restarting the queues.
> Annotate accesses to dev_state against data races.
>
> Fixes: c0c050c58d84 ("bnxt_en: New Broadcom ethernet driver.")
> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
> ---
>  drivers/net/ethernet/broadcom/bnxt/bnxt.c | 21 ++++++++++-----------
>  1 file changed, 10 insertions(+), 11 deletions(-)
>
> diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt.c b/drivers/net/ethernet/broadcom/bnxt/bnxt.c
> index 865fcb8cf29f..07827d6b0fec 100644
> --- a/drivers/net/ethernet/broadcom/bnxt/bnxt.c
> +++ b/drivers/net/ethernet/broadcom/bnxt/bnxt.c
> @@ -730,15 +730,10 @@ static void bnxt_tx_int(struct bnxt *bp, struct bnxt_napi *bnapi, int nr_pkts)
>          */
>         smp_mb();
>
> -       if (unlikely(netif_tx_queue_stopped(txq)) &&
> -           (bnxt_tx_avail(bp, txr) > bp->tx_wake_thresh)) {
> -               __netif_tx_lock(txq, smp_processor_id());
> -               if (netif_tx_queue_stopped(txq) &&
> -                   bnxt_tx_avail(bp, txr) > bp->tx_wake_thresh &&
> -                   txr->dev_state != BNXT_DEV_STATE_CLOSING)
> -                       netif_tx_wake_queue(txq);
> -               __netif_tx_unlock(txq);
> -       }
> +       if (netif_tx_queue_stopped(txq) &&

Probably worth retaining the unlikely() that the original outer check had.

> +           bnxt_tx_avail(bp, txr) > bp->tx_wake_thresh &&
> +           READ_ONCE(txr->dev_state) != BNXT_DEV_STATE_CLOSING)
> +               netif_tx_wake_queue(txq);
>  }
>
>  static struct page *__bnxt_alloc_rx_page(struct bnxt *bp, dma_addr_t *mapping,
> @@ -9264,9 +9259,11 @@ void bnxt_tx_disable(struct bnxt *bp)
>         if (bp->tx_ring) {
>                 for (i = 0; i < bp->tx_nr_rings; i++) {
>                         txr = &bp->tx_ring[i];
> -                       txr->dev_state = BNXT_DEV_STATE_CLOSING;
> +                       WRITE_ONCE(txr->dev_state, BNXT_DEV_STATE_CLOSING);
>                 }
>         }
> +       /* Make sure napi polls see @dev_state change */
> +       synchronize_net();
>         /* Drop carrier first to prevent TX timeout */
>         netif_carrier_off(bp->dev);
>         /* Stop all TX queues */
> @@ -9280,8 +9277,10 @@ void bnxt_tx_enable(struct bnxt *bp)
>
>         for (i = 0; i < bp->tx_nr_rings; i++) {
>                 txr = &bp->tx_ring[i];
> -               txr->dev_state = 0;
> +               WRITE_ONCE(txr->dev_state, 0);
>         }
> +       /* Make sure napi polls see @dev_state change */
> +       synchronize_net();
>         netif_tx_wake_all_queues(bp->dev);
>         if (bp->link_info.link_up)
>                 netif_carrier_on(bp->dev);
> --
> 2.31.1

Regards,
Edwin Peer
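
[Editorial note: with Edwin's suggestion applied, the rewritten check
would read:

    if (unlikely(netif_tx_queue_stopped(txq)) &&
        bnxt_tx_avail(bp, txr) > bp->tx_wake_thresh &&
        READ_ONCE(txr->dev_state) != BNXT_DEV_STATE_CLOSING)
            netif_tx_wake_queue(txq);
]
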
Michael Chan Aug. 11, 2021, 8:42 p.m. UTC | #2
On Wed, Aug 11, 2021 at 12:32 PM Jakub Kicinski <kuba@kernel.org> wrote:

> diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt.c b/drivers/net/ethernet/broadcom/bnxt/bnxt.c
> index 865fcb8cf29f..07827d6b0fec 100644
> --- a/drivers/net/ethernet/broadcom/bnxt/bnxt.c
> +++ b/drivers/net/ethernet/broadcom/bnxt/bnxt.c
> @@ -730,15 +730,10 @@ static void bnxt_tx_int(struct bnxt *bp, struct bnxt_napi *bnapi, int nr_pkts)
>          */
>         smp_mb();
>
> -       if (unlikely(netif_tx_queue_stopped(txq)) &&
> -           (bnxt_tx_avail(bp, txr) > bp->tx_wake_thresh)) {
> -               __netif_tx_lock(txq, smp_processor_id());
> -               if (netif_tx_queue_stopped(txq) &&
> -                   bnxt_tx_avail(bp, txr) > bp->tx_wake_thresh &&
> -                   txr->dev_state != BNXT_DEV_STATE_CLOSING)
> -                       netif_tx_wake_queue(txq);
> -               __netif_tx_unlock(txq);
> -       }
> +       if (netif_tx_queue_stopped(txq) &&
> +           bnxt_tx_avail(bp, txr) > bp->tx_wake_thresh &&
> +           READ_ONCE(txr->dev_state) != BNXT_DEV_STATE_CLOSING)

This can race with bnxt_start_xmit().  bnxt_start_xmit() can also wake
up the queue when it sees that descriptors are available.  I think
this is the reason we added tx locking here.  The race may be ok
because in the worst case, we will wake up the TX queue when it's not
supposed to wake up.  If that happens, bnxt_start_xmit() will return
NETDEV_TX_BUSY and stop the queue again when there are not enough TX
descriptors.

> +               netif_tx_wake_queue(txq);
>  }
>
>  static struct page *__bnxt_alloc_rx_page(struct bnxt *bp, dma_addr_t *mapping,
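
[Editorial note: the wake-up in bnxt_start_xmit() that Michael refers to
follows the common stop/recheck pattern; a simplified sketch of the
driver's logic, trimmed for illustration:

    /* not enough room for another worst-case packet: stop the queue */
    netif_tx_stop_queue(txq);

    /* Pairs with the smp_mb() in bnxt_tx_int(): the stop must be
     * visible before re-reading the ring state, because the completion
     * path updates the consumer index before testing stopped.
     */
    smp_mb();

    /* Completions may have freed descriptors meanwhile; if not, the
     * queue stays stopped. A spurious wake from bnxt_tx_int() racing
     * here is benign: the next xmit attempt returns NETDEV_TX_BUSY
     * and stops the queue again, as Michael notes above.
     */
    if (bnxt_tx_avail(bp, txr) > bp->tx_wake_thresh)
            netif_tx_wake_queue(txq);
]
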
Jakub Kicinski Aug. 11, 2021, 9:09 p.m. UTC | #3
On Wed, 11 Aug 2021 13:42:40 -0700 Michael Chan wrote:
> > -       if (unlikely(netif_tx_queue_stopped(txq)) &&
> > -           (bnxt_tx_avail(bp, txr) > bp->tx_wake_thresh)) {
> > -               __netif_tx_lock(txq, smp_processor_id());
> > -               if (netif_tx_queue_stopped(txq) &&
> > -                   bnxt_tx_avail(bp, txr) > bp->tx_wake_thresh &&
> > -                   txr->dev_state != BNXT_DEV_STATE_CLOSING)
> > -                       netif_tx_wake_queue(txq);
> > -               __netif_tx_unlock(txq);
> > -       }
> > +       if (netif_tx_queue_stopped(txq) &&
> > +           bnxt_tx_avail(bp, txr) > bp->tx_wake_thresh &&
> > +           READ_ONCE(txr->dev_state) != BNXT_DEV_STATE_CLOSING)  
> 
> This can race with bnxt_start_xmit().  bnxt_start_xmit() can also wake
> up the queue when it sees that descriptors are available.  I think
> this is the reason we added tx locking here.  The race may be ok
> because in the worst case, we will wake up the TX queue when it's not
> supposed to wakeup.  If that happens, bnxt_start_xmit() will return
> NETDEV_TX_BUSY and stop the queue again when there are not enough TX
> descriptors.

Good point, let me remove the warning from patch 3, then.