From patchwork Tue Mar 17 07:06:40 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Kiyanovski, Arthur" X-Patchwork-Id: 222415 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-9.9 required=3.0 tests=DKIMWL_WL_HIGH, DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, HEADER_FROM_DIFFERENT_DOMAINS, INCLUDES_PATCH, MAILING_LIST_MULTI, SIGNED_OFF_BY, SPF_HELO_NONE, SPF_PASS, USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id E9610C10F29 for ; Tue, 17 Mar 2020 07:07:29 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.kernel.org (Postfix) with ESMTP id BE9CB2071C for ; Tue, 17 Mar 2020 07:07:29 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (1024-bit key) header.d=amazon.com header.i=@amazon.com header.b="CulhYpAv" Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1726082AbgCQHH3 (ORCPT ); Tue, 17 Mar 2020 03:07:29 -0400 Received: from smtp-fw-6001.amazon.com ([52.95.48.154]:55167 "EHLO smtp-fw-6001.amazon.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1725783AbgCQHH2 (ORCPT ); Tue, 17 Mar 2020 03:07:28 -0400 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=amazon.com; i=@amazon.com; q=dns/txt; s=amazon201209; t=1584428848; x=1615964848; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version; bh=YeuqrfyGKSqUm/b1+3cTKNV/WTze+IOk8lxi8+Y7xH8=; b=CulhYpAvzKr5Rt6yLLHUF9S32LI5CCmb4FH6MJCP6FHttg/LsnsOzuYc veAwigmgu5vN8ajOBC6zLjWTNeTmiO/vCVOCSb2Pe4n8uTdJ2nzfAjEUe 4T96dCO4qD5qWHZ4lAe6DtVnL0IJyl4oeEYjHKUlTSaypf4ma1go+SY6w 8=; IronPort-SDR: T5B8hAMG4cDLvZl5hJXODhnrSw1R6MjD3l4SYuV1C/0LTNP2Xp1I08PUyQX7a97Znj8tL4S6zh Cv1BT8G6iUCA== X-IronPort-AV: E=Sophos;i="5.70,563,1574121600"; d="scan'208";a="22788229" Received: from iad12-co-svc-p1-lb1-vlan3.amazon.com (HELO email-inbound-relay-1e-62350142.us-east-1.amazon.com) ([10.43.8.6]) by smtp-border-fw-out-6001.iad6.amazon.com with ESMTP; 17 Mar 2020 07:07:16 +0000 Received: from EX13MTAUWA001.ant.amazon.com (iad55-ws-svc-p15-lb9-vlan3.iad.amazon.com [10.40.159.166]) by email-inbound-relay-1e-62350142.us-east-1.amazon.com (Postfix) with ESMTPS id 11C1FA1F6C; Tue, 17 Mar 2020 07:07:15 +0000 (UTC) Received: from EX13D10UWA004.ant.amazon.com (10.43.160.64) by EX13MTAUWA001.ant.amazon.com (10.43.160.118) with Microsoft SMTP Server (TLS) id 15.0.1367.3; Tue, 17 Mar 2020 07:06:59 +0000 Received: from EX13MTAUWA001.ant.amazon.com (10.43.160.58) by EX13D10UWA004.ant.amazon.com (10.43.160.64) with Microsoft SMTP Server (TLS) id 15.0.1497.2; Tue, 17 Mar 2020 07:06:58 +0000 Received: from HFA15-G63729NC.amazon.com (10.95.76.85) by mail-relay.amazon.com (10.43.160.118) with Microsoft SMTP Server id 15.0.1367.3 via Frontend Transport; Tue, 17 Mar 2020 07:06:54 +0000 From: To: , CC: Arthur Kiyanovski , , , , , , , , , , , , , , Subject: [PATCH V2 net 2/4] net: ena: fix request of incorrect number of IRQ vectors Date: Tue, 17 Mar 2020 09:06:40 +0200 Message-ID: <1584428802-440-3-git-send-email-akiyano@amazon.com> X-Mailer: git-send-email 2.7.4 In-Reply-To: <1584428802-440-1-git-send-email-akiyano@amazon.com> References: <1584428802-440-1-git-send-email-akiyano@amazon.com> MIME-Version: 1.0 Sender: netdev-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: netdev@vger.kernel.org From: Arthur Kiyanovski Bug: In short the main issue is caused by the fact that the number of queues is changed using ethtool after ena_probe() has been called and before ena_up() was executed. Here is the full scenario in detail: * ena_probe() is called when the driver is loaded, the driver is not up yet at the end of ena_probe(). * The number of queues is changed -> io_queue_count is changed as well - ena_up() is not called since the "dev_was_up" boolean in ena_update_queue_count() is false. * ena_up() is called by the kernel (it's called asynchronously some time after ena_probe()). ena_setup_io_intr() is called by ena_up() and it uses io_queue_count to get the suitable irq lines for each msix vector. The function ena_request_io_irq() is called right after that and it uses msix_vecs - This value only changes during ena_probe() and ena_restore() - to request the irq vectors. This results in "Failed to request I/O IRQ" error for i > io_queue_count. Numeric example: * After ena_probe() io_queue_count = 8, msix_vecs = 9. * The number of queues changes to 4 -> io_queue_count = 4, msix_vecs = 9. * ena_up() is executed for the first time: ** ena_setup_io_intr() inits the vectors only up to io_queue_count. ** ena_request_io_irq() calls request_irq() and fails for i = 5. How to reproduce: simply run the following commands: sudo rmmod ena && sudo insmod ena.ko; sudo ethtool -L eth1 combined 3; Fix: Use ENA_MAX_MSIX_VEC(adapter->num_io_queues + adapter->xdp_num_queues) instead of adapter->msix_vecs. We need to take XDP queues into consideration as they need to have msix vectors assigned to them as well. Note that the XDP cannot be attached before the driver is up and running but in XDP mode the issue might occur when the number of queues changes right after a reset trigger. The ENA_MAX_MSIX_VEC simply adds one to the argument since the first msix vector is reserved for management queue. Fixes: 1738cd3ed342 ("net: ena: Add a driver for Amazon Elastic Network Adapters (ENA)") Signed-off-by: Sameeh Jubran Signed-off-by: Arthur Kiyanovski --- drivers/net/ethernet/amazon/ena/ena_netdev.c | 9 ++++++--- 1 file changed, 6 insertions(+), 3 deletions(-) diff --git a/drivers/net/ethernet/amazon/ena/ena_netdev.c b/drivers/net/ethernet/amazon/ena/ena_netdev.c index 39f290f4458f..836fda585391 100644 --- a/drivers/net/ethernet/amazon/ena/ena_netdev.c +++ b/drivers/net/ethernet/amazon/ena/ena_netdev.c @@ -2068,6 +2068,7 @@ static int ena_request_mgmnt_irq(struct ena_adapter *adapter) static int ena_request_io_irq(struct ena_adapter *adapter) { + u32 io_queue_count = adapter->num_io_queues + adapter->xdp_num_queues; unsigned long flags = 0; struct ena_irq *irq; int rc = 0, i, k; @@ -2078,7 +2079,7 @@ static int ena_request_io_irq(struct ena_adapter *adapter) return -EINVAL; } - for (i = ENA_IO_IRQ_FIRST_IDX; i < adapter->msix_vecs; i++) { + for (i = ENA_IO_IRQ_FIRST_IDX; i < ENA_MAX_MSIX_VEC(io_queue_count); i++) { irq = &adapter->irq_tbl[i]; rc = request_irq(irq->vector, irq->handler, flags, irq->name, irq->data); @@ -2119,6 +2120,7 @@ static void ena_free_mgmnt_irq(struct ena_adapter *adapter) static void ena_free_io_irq(struct ena_adapter *adapter) { + u32 io_queue_count = adapter->num_io_queues + adapter->xdp_num_queues; struct ena_irq *irq; int i; @@ -2129,7 +2131,7 @@ static void ena_free_io_irq(struct ena_adapter *adapter) } #endif /* CONFIG_RFS_ACCEL */ - for (i = ENA_IO_IRQ_FIRST_IDX; i < adapter->msix_vecs; i++) { + for (i = ENA_IO_IRQ_FIRST_IDX; i < ENA_MAX_MSIX_VEC(io_queue_count); i++) { irq = &adapter->irq_tbl[i]; irq_set_affinity_hint(irq->vector, NULL); free_irq(irq->vector, irq->data); @@ -2144,12 +2146,13 @@ static void ena_disable_msix(struct ena_adapter *adapter) static void ena_disable_io_intr_sync(struct ena_adapter *adapter) { + u32 io_queue_count = adapter->num_io_queues + adapter->xdp_num_queues; int i; if (!netif_running(adapter->netdev)) return; - for (i = ENA_IO_IRQ_FIRST_IDX; i < adapter->msix_vecs; i++) + for (i = ENA_IO_IRQ_FIRST_IDX; i < ENA_MAX_MSIX_VEC(io_queue_count); i++) synchronize_irq(adapter->irq_tbl[i].vector); } From patchwork Tue Mar 17 07:06:42 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Kiyanovski, Arthur" X-Patchwork-Id: 222414 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-9.9 required=3.0 tests=DKIMWL_WL_HIGH, DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, HEADER_FROM_DIFFERENT_DOMAINS, INCLUDES_PATCH, MAILING_LIST_MULTI, SIGNED_OFF_BY, SPF_HELO_NONE, SPF_PASS, USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 0056CC10F29 for ; Tue, 17 Mar 2020 07:07:37 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.kernel.org (Postfix) with ESMTP id CE45E2071C for ; Tue, 17 Mar 2020 07:07:36 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (1024-bit key) header.d=amazon.com header.i=@amazon.com header.b="XD3T0aRJ" Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1726157AbgCQHHe (ORCPT ); Tue, 17 Mar 2020 03:07:34 -0400 Received: from smtp-fw-9102.amazon.com ([207.171.184.29]:31158 "EHLO smtp-fw-9102.amazon.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1725783AbgCQHHe (ORCPT ); Tue, 17 Mar 2020 03:07:34 -0400 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=amazon.com; i=@amazon.com; q=dns/txt; s=amazon201209; t=1584428853; x=1615964853; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version; bh=fj7cT0ZSRv/x7oialbU0QAL0t519Tk7FW+jv5hR/uyo=; b=XD3T0aRJHYrxBdPz8YqVjJafVmKALqvOp+nUyvdJmnpeL8BOU76WhTNw OdDDsOH+Kkotx0QqadSTwhAG7noKZKFPZSnE+SLEpxaN7R+t+NJoXsC/w anwz06HRVvi4Ry8c74LzXLjzNAHKKpMYGVo2UoK5jdKs1AmJ6Aq7Iu/TR s=; IronPort-SDR: dV+sFm2hDcUC0vuNt659tr/RxCcpk6VRdFk7GaNcHP2XCxPm32o1cOw2env57nj8UfkOHlbFBU 3jjzuMYoDyPA== X-IronPort-AV: E=Sophos;i="5.70,563,1574121600"; d="scan'208";a="31591862" Received: from sea32-co-svc-lb4-vlan3.sea.corp.amazon.com (HELO email-inbound-relay-1e-57e1d233.us-east-1.amazon.com) ([10.47.23.38]) by smtp-border-fw-out-9102.sea19.amazon.com with ESMTP; 17 Mar 2020 07:07:30 +0000 Received: from EX13MTAUWA001.ant.amazon.com (iad55-ws-svc-p15-lb9-vlan3.iad.amazon.com [10.40.159.166]) by email-inbound-relay-1e-57e1d233.us-east-1.amazon.com (Postfix) with ESMTPS id D15A1140242; Tue, 17 Mar 2020 07:07:29 +0000 (UTC) Received: from EX13D10UWA004.ant.amazon.com (10.43.160.64) by EX13MTAUWA001.ant.amazon.com (10.43.160.118) with Microsoft SMTP Server (TLS) id 15.0.1367.3; Tue, 17 Mar 2020 07:07:09 +0000 Received: from EX13MTAUWA001.ant.amazon.com (10.43.160.58) by EX13D10UWA004.ant.amazon.com (10.43.160.64) with Microsoft SMTP Server (TLS) id 15.0.1497.2; Tue, 17 Mar 2020 07:07:08 +0000 Received: from HFA15-G63729NC.amazon.com (10.95.76.85) by mail-relay.amazon.com (10.43.160.118) with Microsoft SMTP Server id 15.0.1367.3 via Frontend Transport; Tue, 17 Mar 2020 07:07:04 +0000 From: To: , CC: Arthur Kiyanovski , , , , , , , , , , , , , , Subject: [PATCH V2 net 4/4] net: ena: fix continuous keep-alive resets Date: Tue, 17 Mar 2020 09:06:42 +0200 Message-ID: <1584428802-440-5-git-send-email-akiyano@amazon.com> X-Mailer: git-send-email 2.7.4 In-Reply-To: <1584428802-440-1-git-send-email-akiyano@amazon.com> References: <1584428802-440-1-git-send-email-akiyano@amazon.com> MIME-Version: 1.0 Sender: netdev-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: netdev@vger.kernel.org From: Arthur Kiyanovski last_keep_alive_jiffies is updated in probe and when a keep-alive event is received. In case the driver times-out on a keep-alive event, it has high chances of continuously timing-out on keep-alive events. This is because when the driver recovers from the keep-alive-timeout reset the value of last_keep_alive_jiffies is very old, and if a keep-alive event is not received before the next timer expires, the value of last_keep_alive_jiffies will cause another keep-alive-timeout reset and so forth in a loop. Solution: Update last_keep_alive_jiffies whenever the device is restored after reset. Fixes: 1738cd3ed342 ("net: ena: Add a driver for Amazon Elastic Network Adapters (ENA)") Signed-off-by: Noam Dagan Signed-off-by: Arthur Kiyanovski --- drivers/net/ethernet/amazon/ena/ena_netdev.c | 1 + 1 file changed, 1 insertion(+) diff --git a/drivers/net/ethernet/amazon/ena/ena_netdev.c b/drivers/net/ethernet/amazon/ena/ena_netdev.c index 51333a05c14d..4647d7656761 100644 --- a/drivers/net/ethernet/amazon/ena/ena_netdev.c +++ b/drivers/net/ethernet/amazon/ena/ena_netdev.c @@ -3486,6 +3486,7 @@ static int ena_restore_device(struct ena_adapter *adapter) netif_carrier_on(adapter->netdev); mod_timer(&adapter->timer_service, round_jiffies(jiffies + HZ)); + adapter->last_keep_alive_jiffies = jiffies; dev_err(&pdev->dev, "Device reset completed successfully, Driver info: %s\n", version);