From patchwork Wed Apr 12 17:36:31 2017 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Sumit Semwal X-Patchwork-Id: 97318 Delivered-To: patch@linaro.org Received: by 10.140.109.52 with SMTP id k49csp371894qgf; Wed, 12 Apr 2017 10:36:56 -0700 (PDT) X-Received: by 10.98.201.212 with SMTP id l81mr66954851pfk.13.1492018616803; Wed, 12 Apr 2017 10:36:56 -0700 (PDT) Return-Path: Received: from vger.kernel.org (vger.kernel.org. [209.132.180.67]) by mx.google.com with ESMTP id i63si21086788pgc.265.2017.04.12.10.36.56; Wed, 12 Apr 2017 10:36:56 -0700 (PDT) Received-SPF: pass (google.com: best guess record for domain of stable-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) client-ip=209.132.180.67; Authentication-Results: mx.google.com; dkim=pass header.i=@linaro.org; spf=pass (google.com: best guess record for domain of stable-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=stable-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=linaro.org Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1754249AbdDLRgz (ORCPT + 6 others); Wed, 12 Apr 2017 13:36:55 -0400 Received: from mail-pf0-f174.google.com ([209.85.192.174]:36499 "EHLO mail-pf0-f174.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1753876AbdDLRgy (ORCPT ); Wed, 12 Apr 2017 13:36:54 -0400 Received: by mail-pf0-f174.google.com with SMTP id o126so16900778pfb.3 for ; Wed, 12 Apr 2017 10:36:54 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=from:to:cc:subject:date:message-id:in-reply-to:references; bh=TLGdhXuIEo8FhIVFkHw19F0Hfo16ig1l3XzWIZZchGA=; b=AE5ZRl6tS102Bvs/VXgub9GwpQ8nCHysGAPvucri5c54Gw+1UyXDFZbVeIYoPhAiGN e1Ltw42jw4Z31ka60SH1goGGLqLtpjsbOYDMeKvnPixyQrEmDcS6N8uiD19lSJ9Umk+X HZryZk/ZcvjZ0YyBJO2H5hFpOktieU1AJPkUk= X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; 
d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references; bh=TLGdhXuIEo8FhIVFkHw19F0Hfo16ig1l3XzWIZZchGA=; b=nsICR+WWl99gtCiI0cFlmUTnumbxsSGPFjiTTRFeLHp99WLyaC5PjVNJm5wdlPSG4K qLmQIosZcDqa4zcYE9qvma/Qp34NLcjIdRU8tlzdXivqnbEKpXIlHb07zSmvsCyyzbaI ULcN0ePXcDL93wVefmusYa2PB3r1HqmgcWcmpuB1GZcjBySGcxOhNcudNpWsVlzkfNsy jCBl0+niBZJ10mKr0QONy1VJTqRWIiCGpHwp8EbmhjD85A/PR6lxnTybwoanzb1fb8YQ zhOryg7V92YzbImCYfPP9SQtH+rKe/AY3kGkq00ZmswS4fdGUkRfll0nDYIbQb5E1BBa kT9A== X-Gm-Message-State: AFeK/H0GOpMYg9AlhSGzhcsN5FaBPMWdjb/caVi1K91qIMEZ+pQVzfC+KSb+FJi9eWEXujBc X-Received: by 10.99.138.68 with SMTP id y65mr69527594pgd.73.1492018613077; Wed, 12 Apr 2017 10:36:53 -0700 (PDT) Received: from phantom.lan ([106.51.225.38]) by smtp.gmail.com with ESMTPSA id 133sm31562648pfy.106.2017.04.12.10.36.49 (version=TLS1_2 cipher=ECDHE-RSA-AES128-SHA bits=128/128); Wed, 12 Apr 2017 10:36:51 -0700 (PDT) From: Sumit Semwal To: stable@vger.kernel.org Cc: Gabriel Krisman Bertazi , Brian King , Douglas Miller , linux-block@vger.kernel.org, linux-scsi@vger.kernel.org, Jens Axboe , Sumit Semwal Subject: [PATCH for-4.9 1/5] blk-mq: Avoid memory reclaim when remapping queues Date: Wed, 12 Apr 2017 23:06:31 +0530 Message-Id: <1492018595-13167-2-git-send-email-sumit.semwal@linaro.org> X-Mailer: git-send-email 2.7.4 In-Reply-To: <1492018595-13167-1-git-send-email-sumit.semwal@linaro.org> References: <1492018595-13167-1-git-send-email-sumit.semwal@linaro.org> Sender: stable-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: stable@vger.kernel.org From: Gabriel Krisman Bertazi [ Upstream commit 36e1f3d107867b25c616c2fd294f5a1c9d4e5d09 ] While stressing memory and IO at the same time we changed SMT settings, we were able to consistently trigger deadlocks in the mm system, which froze the entire machine. 
I think that under memory stress conditions, the large allocations performed by blk_mq_init_rq_map may trigger a reclaim, which stalls waiting on the block layer remapping completion, thus deadlocking the system. The trace below was collected after the machine stalled, waiting for the hotplug event completion. The simplest fix for this is to make allocations in this path non-reclaimable, with GFP_NOIO. With this patch, we could no longer hit the issue. This should apply on top of Jens's for-next branch cleanly. Changes since v1: - Use GFP_NOIO instead of GFP_NOWAIT. Call Trace: [c000000f0160aaf0] [c000000f0160ab50] 0xc000000f0160ab50 (unreliable) [c000000f0160acc0] [c000000000016624] __switch_to+0x2e4/0x430 [c000000f0160ad20] [c000000000b1a880] __schedule+0x310/0x9b0 [c000000f0160ae00] [c000000000b1af68] schedule+0x48/0xc0 [c000000f0160ae30] [c000000000b1b4b0] schedule_preempt_disabled+0x20/0x30 [c000000f0160ae50] [c000000000b1d4fc] __mutex_lock_slowpath+0xec/0x1f0 [c000000f0160aed0] [c000000000b1d678] mutex_lock+0x78/0xa0 [c000000f0160af00] [d000000019413cac] xfs_reclaim_inodes_ag+0x33c/0x380 [xfs] [c000000f0160b0b0] [d000000019415164] xfs_reclaim_inodes_nr+0x54/0x70 [xfs] [c000000f0160b0f0] [d0000000194297f8] xfs_fs_free_cached_objects+0x38/0x60 [xfs] [c000000f0160b120] [c0000000003172c8] super_cache_scan+0x1f8/0x210 [c000000f0160b190] [c00000000026301c] shrink_slab.part.13+0x21c/0x4c0 [c000000f0160b2d0] [c000000000268088] shrink_zone+0x2d8/0x3c0 [c000000f0160b380] [c00000000026834c] do_try_to_free_pages+0x1dc/0x520 [c000000f0160b450] [c00000000026876c] try_to_free_pages+0xdc/0x250 [c000000f0160b4e0] [c000000000251978] __alloc_pages_nodemask+0x868/0x10d0 [c000000f0160b6f0] [c000000000567030] blk_mq_init_rq_map+0x160/0x380 [c000000f0160b7a0] [c00000000056758c] blk_mq_map_swqueue+0x33c/0x360 [c000000f0160b820] [c000000000567904] blk_mq_queue_reinit+0x64/0xb0 [c000000f0160b850] [c00000000056a16c] blk_mq_queue_reinit_notify+0x19c/0x250 [c000000f0160b8a0] 
[c0000000000f5d38] notifier_call_chain+0x98/0x100 [c000000f0160b8f0] [c0000000000c5fb0] __cpu_notify+0x70/0xe0 [c000000f0160b930] [c0000000000c63c4] notify_prepare+0x44/0xb0 [c000000f0160b9b0] [c0000000000c52f4] cpuhp_invoke_callback+0x84/0x250 [c000000f0160ba10] [c0000000000c570c] cpuhp_up_callbacks+0x5c/0x120 [c000000f0160ba60] [c0000000000c7cb8] _cpu_up+0xf8/0x1d0 [c000000f0160bac0] [c0000000000c7eb0] do_cpu_up+0x120/0x150 [c000000f0160bb40] [c0000000006fe024] cpu_subsys_online+0x64/0xe0 [c000000f0160bb90] [c0000000006f5124] device_online+0xb4/0x120 [c000000f0160bbd0] [c0000000006f5244] online_store+0xb4/0xc0 [c000000f0160bc20] [c0000000006f0a68] dev_attr_store+0x68/0xa0 [c000000f0160bc60] [c0000000003ccc30] sysfs_kf_write+0x80/0xb0 [c000000f0160bca0] [c0000000003cbabc] kernfs_fop_write+0x17c/0x250 [c000000f0160bcf0] [c00000000030fe6c] __vfs_write+0x6c/0x1e0 [c000000f0160bd90] [c000000000311490] vfs_write+0xd0/0x270 [c000000f0160bde0] [c0000000003131fc] SyS_write+0x6c/0x110 [c000000f0160be30] [c000000000009204] system_call+0x38/0xec Signed-off-by: Gabriel Krisman Bertazi Cc: Brian King Cc: Douglas Miller Cc: linux-block@vger.kernel.org Cc: linux-scsi@vger.kernel.org Signed-off-by: Jens Axboe Signed-off-by: Sumit Semwal --- block/blk-mq.c | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) -- 2.7.4 diff --git a/block/blk-mq.c b/block/blk-mq.c index ee54ad0..7b597ec 100644 --- a/block/blk-mq.c +++ b/block/blk-mq.c @@ -1474,7 +1474,7 @@ static struct blk_mq_tags *blk_mq_init_rq_map(struct blk_mq_tag_set *set, INIT_LIST_HEAD(&tags->page_list); tags->rqs = kzalloc_node(set->queue_depth * sizeof(struct request *), - GFP_KERNEL | __GFP_NOWARN | __GFP_NORETRY, + GFP_NOIO | __GFP_NOWARN | __GFP_NORETRY, set->numa_node); if (!tags->rqs) { blk_mq_free_tags(tags); @@ -1500,7 +1500,7 @@ static struct blk_mq_tags *blk_mq_init_rq_map(struct blk_mq_tag_set *set, do { page = alloc_pages_node(set->numa_node, - GFP_KERNEL | __GFP_NOWARN | __GFP_NORETRY | __GFP_ZERO, + 
GFP_NOIO | __GFP_NOWARN | __GFP_NORETRY | __GFP_ZERO, this_order); if (page) break; @@ -1521,7 +1521,7 @@ static struct blk_mq_tags *blk_mq_init_rq_map(struct blk_mq_tag_set *set, * Allow kmemleak to scan these pages as they contain pointers * to additional allocations like via ops->init_request(). */ - kmemleak_alloc(p, order_to_size(this_order), 1, GFP_KERNEL); + kmemleak_alloc(p, order_to_size(this_order), 1, GFP_NOIO); entries_per_page = order_to_size(this_order) / rq_size; to_do = min(entries_per_page, set->queue_depth - i); left -= to_do * rq_size; From patchwork Wed Apr 12 17:36:32 2017 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Sumit Semwal X-Patchwork-Id: 97319 Delivered-To: patch@linaro.org Received: by 10.140.109.52 with SMTP id k49csp371903qgf; Wed, 12 Apr 2017 10:36:58 -0700 (PDT) X-Received: by 10.99.113.81 with SMTP id b17mr67618555pgn.180.1492018618811; Wed, 12 Apr 2017 10:36:58 -0700 (PDT) Return-Path: Received: from vger.kernel.org (vger.kernel.org. 
[209.132.180.67]) by mx.google.com with ESMTP id i63si21086788pgc.265.2017.04.12.10.36.58; Wed, 12 Apr 2017 10:36:58 -0700 (PDT) Received-SPF: pass (google.com: best guess record for domain of stable-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) client-ip=209.132.180.67; Authentication-Results: mx.google.com; dkim=pass header.i=@linaro.org; spf=pass (google.com: best guess record for domain of stable-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=stable-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=linaro.org Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1754312AbdDLRg6 (ORCPT + 6 others); Wed, 12 Apr 2017 13:36:58 -0400 Received: from mail-pg0-f50.google.com ([74.125.83.50]:33834 "EHLO mail-pg0-f50.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1753882AbdDLRg5 (ORCPT ); Wed, 12 Apr 2017 13:36:57 -0400 Received: by mail-pg0-f50.google.com with SMTP id 21so18077944pgg.1 for ; Wed, 12 Apr 2017 10:36:56 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=from:to:cc:subject:date:message-id:in-reply-to:references; bh=jQGAjIU16sv/rQ5vQRUxd/WwGo1tNDMQNm29XIHqzW4=; b=SsClJJOnm9E3qXTYYEysX4ADohktc2aUuU9IoUWlHmGbohtllWl/1zb1/EPgSLWyTw Ict/FNEsKtnFXfKXtFT8E3mpbRngwy4i3+kWdzItPOJNQafV9D3jYDQAl5qP8jR6RA0k R4gXmJdkDiINsoxL9o/nCAc7um61ja+u1Yd74= X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references; bh=jQGAjIU16sv/rQ5vQRUxd/WwGo1tNDMQNm29XIHqzW4=; b=WR8a15svr68d/gSpDBOcrQAvh15FiKCg+C2HBc9jOYR1mUnwC++tJJVApk2Uo1g9aY g34i3azhRQ400hKTRXU6BMH+kfNZl90+pCqJIe5BDsIaq5eIA35Pn6mtaxW7tNcwjSRT 7mooEyzSvkHDyv4c/Wx6UpTnlxfFPQezBQamUBdi6PauFjahPvw8x+SQJgI04oGVpWwx kbn7QnM6HjuEkh4HgKlUGQvA9cWnd44aWTszciIQ4GTrEfnPEH0i3eOoI1iGAxvHLdbm KaiY4iGI76AxZOp06adhjFte5bKlhyL2xC8vsvIcBFOxFQs/1kpT91BgPlQi91IreW9n +YOw== 
X-Gm-Message-State: AFeK/H1Cyt11kImWSBdhaxQlcXxi5Wim3nJXgGLpMFq0Dp6eQ3X7qW17eMB56YvqnHY/p/L1 X-Received: by 10.98.144.204 with SMTP id q73mr46629908pfk.179.1492018616222; Wed, 12 Apr 2017 10:36:56 -0700 (PDT) Received: from phantom.lan ([106.51.225.38]) by smtp.gmail.com with ESMTPSA id 133sm31562648pfy.106.2017.04.12.10.36.53 (version=TLS1_2 cipher=ECDHE-RSA-AES128-SHA bits=128/128); Wed, 12 Apr 2017 10:36:55 -0700 (PDT) From: Sumit Semwal To: stable@vger.kernel.org Cc: Guenter Roeck , Douglas Anderson , Greg Kroah-Hartman , Sumit Semwal Subject: [PATCH for-4.9 2/5] usb: hub: Wait for connection to be reestablished after port reset Date: Wed, 12 Apr 2017 23:06:32 +0530 Message-Id: <1492018595-13167-3-git-send-email-sumit.semwal@linaro.org> X-Mailer: git-send-email 2.7.4 In-Reply-To: <1492018595-13167-1-git-send-email-sumit.semwal@linaro.org> References: <1492018595-13167-1-git-send-email-sumit.semwal@linaro.org> Sender: stable-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: stable@vger.kernel.org From: Guenter Roeck [ Upstream commit 22547c4cc4fe20698a6a85a55b8788859134b8e4 ] On a system with a defective USB device connected to a USB hub, an endless sequence of port connect events was observed. The sequence of events as observed is as follows: - Port reports connected event (port status=USB_PORT_STAT_CONNECTION). - Event handler debounces port and resets it by calling hub_port_reset(). - hub_port_reset() calls hub_port_wait_reset() to wait for the reset to complete. - The reset completes, but USB_PORT_STAT_CONNECTION is not immediately set in the port status register. - hub_port_wait_reset() returns -ENOTCONN. - Port initialization sequence is aborted. - A few milliseconds later, the port again reports a connected event, and the sequence repeats. This continues either forever or, randomly, stops if the connection is already re-established when the port status is read. It results in a high rate of udev events. 
This in turn destabilizes userspace since the above sequence holds the device mutex pretty much continuously and prevents userspace from actually reading the device status. To prevent the problem from happening, let's wait for the connection to be re-established after a port reset. If the device was actually disconnected, the code will still return an error, but it will do so only after the long reset timeout. Cc: Douglas Anderson Signed-off-by: Guenter Roeck Acked-by: Alan Stern Signed-off-by: Greg Kroah-Hartman Signed-off-by: Sumit Semwal --- drivers/usb/core/hub.c | 11 +++++++++-- 1 file changed, 9 insertions(+), 2 deletions(-) -- 2.7.4 diff --git a/drivers/usb/core/hub.c b/drivers/usb/core/hub.c index c28ccf1..35fb2bef 100644 --- a/drivers/usb/core/hub.c +++ b/drivers/usb/core/hub.c @@ -2650,8 +2650,15 @@ static int hub_port_wait_reset(struct usb_hub *hub, int port1, if (ret < 0) return ret; - /* The port state is unknown until the reset completes. */ - if (!(portstatus & USB_PORT_STAT_RESET)) + /* + * The port state is unknown until the reset completes. + * + * On top of that, some chips may require additional time + * to re-establish a connection after the reset is complete, + * so also wait for the connection to be re-established. + */ + if (!(portstatus & USB_PORT_STAT_RESET) && + (portstatus & USB_PORT_STAT_CONNECTION)) break; /* switch to the long delay after two short delay failures */ From patchwork Wed Apr 12 17:36:33 2017 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Sumit Semwal X-Patchwork-Id: 97320 Delivered-To: patch@linaro.org Received: by 10.140.109.52 with SMTP id k49csp371917qgf; Wed, 12 Apr 2017 10:37:01 -0700 (PDT) X-Received: by 10.84.199.170 with SMTP id r39mr83855563pld.144.1492018621643; Wed, 12 Apr 2017 10:37:01 -0700 (PDT) Return-Path: Received: from vger.kernel.org (vger.kernel.org. 
[209.132.180.67]) by mx.google.com with ESMTP id i63si21086788pgc.265.2017.04.12.10.37.01; Wed, 12 Apr 2017 10:37:01 -0700 (PDT) Received-SPF: pass (google.com: best guess record for domain of stable-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) client-ip=209.132.180.67; Authentication-Results: mx.google.com; dkim=pass header.i=@linaro.org; spf=pass (google.com: best guess record for domain of stable-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=stable-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=linaro.org Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1754496AbdDLRhA (ORCPT + 6 others); Wed, 12 Apr 2017 13:37:00 -0400 Received: from mail-pg0-f53.google.com ([74.125.83.53]:33848 "EHLO mail-pg0-f53.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1754490AbdDLRhA (ORCPT ); Wed, 12 Apr 2017 13:37:00 -0400 Received: by mail-pg0-f53.google.com with SMTP id 21so18078576pgg.1 for ; Wed, 12 Apr 2017 10:37:00 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=from:to:cc:subject:date:message-id:in-reply-to:references; bh=n17wPTxNRCpaz9PX+cUyQhrT5erB5Z7+nCioW5jpEUs=; b=hosfbAggVQwQYGVHBQBmFkz8usQJhHU6n7HLwXFGPQENWW1zSEVdbU2Vc5cOfBDiZj cdAhEpaAXdbld2u45eY2GHSUrTqWKqgwqexmaO+7j+QAZi7l28hZHawL5SGuvewbHNBE kbtE0H6nMycjV0MVO9sVgYs4loP0y7u5vdG+o= X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references; bh=n17wPTxNRCpaz9PX+cUyQhrT5erB5Z7+nCioW5jpEUs=; b=TAxGEx7ymroUkwhdq/A+PRKcR1av7P4u174LUxbNBktFQZnUKMbJRxjUkz3g3DkqVZ QVmgaHWUw2OJ9aGFr/1kgFSxAGD0gEM8NwLw1nUzYOt0xZIXNoQcX4dsxROUrc8/7b0d y/y4l7ijJh8raUYQiSe3lxXbeCjYgLqwNQqZq/sr/uifIWNxYzp6naDEjRuF/eebcM4Y jBcrT1s1LP+I7hoCZPYSIF4JQMRI1obUcQfkvuiqVE/TL+Jyt0OmcmyG7iT+/IeW128V 1vcP0kbc1sjpH9EHpf/NTLHnV+HshY/eXg3mW2vs0UKT47eSfrZ63oaDAP/G6GbyXRBT Higg== 
X-Gm-Message-State: AFeK/H38/AQ/YbaE4uo+PtD06Dt0Q2PqZKuvHlcKKODGBpyn92TfRp3tFq2ipXbGstIIXEMC X-Received: by 10.84.137.1 with SMTP id 1mr83785192plm.68.1492018619418; Wed, 12 Apr 2017 10:36:59 -0700 (PDT) Received: from phantom.lan ([106.51.225.38]) by smtp.gmail.com with ESMTPSA id 133sm31562648pfy.106.2017.04.12.10.36.56 (version=TLS1_2 cipher=ECDHE-RSA-AES128-SHA bits=128/128); Wed, 12 Apr 2017 10:36:58 -0700 (PDT) From: Sumit Semwal To: stable@vger.kernel.org Cc: Eugenia Emantayev , Tariq Toukan , "David S . Miller" , Sumit Semwal Subject: [PATCH for-4.9 3/5] net/mlx4_en: Fix bad WQE issue Date: Wed, 12 Apr 2017 23:06:33 +0530 Message-Id: <1492018595-13167-4-git-send-email-sumit.semwal@linaro.org> X-Mailer: git-send-email 2.7.4 In-Reply-To: <1492018595-13167-1-git-send-email-sumit.semwal@linaro.org> References: <1492018595-13167-1-git-send-email-sumit.semwal@linaro.org> Sender: stable-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: stable@vger.kernel.org From: Eugenia Emantayev [ Upstream commit 6496bbf0ec481966ef9ffe5b6660d8d1b55c60cc ] Single send WQE in RX buffer should be stamped with software ownership in order to prevent the flow of QP in error in FW once UPDATE_QP is called. Fixes: 9f519f68cfff ('mlx4_en: Not using Shared Receive Queues') Signed-off-by: Eugenia Emantayev Signed-off-by: Tariq Toukan Signed-off-by: David S. 
Miller Signed-off-by: Sumit Semwal --- drivers/net/ethernet/mellanox/mlx4/en_rx.c | 8 +++++++- 1 file changed, 7 insertions(+), 1 deletion(-) -- 2.7.4 diff --git a/drivers/net/ethernet/mellanox/mlx4/en_rx.c b/drivers/net/ethernet/mellanox/mlx4/en_rx.c index 4d3ddc2..5d48458 100644 --- a/drivers/net/ethernet/mellanox/mlx4/en_rx.c +++ b/drivers/net/ethernet/mellanox/mlx4/en_rx.c @@ -444,8 +444,14 @@ int mlx4_en_activate_rx_rings(struct mlx4_en_priv *priv) ring->cqn = priv->rx_cq[ring_ind]->mcq.cqn; ring->stride = stride; - if (ring->stride <= TXBB_SIZE) + if (ring->stride <= TXBB_SIZE) { + /* Stamp first unused send wqe */ + __be32 *ptr = (__be32 *)ring->buf; + __be32 stamp = cpu_to_be32(1 << STAMP_SHIFT); + *ptr = stamp; + /* Move pointer to start of rx section */ ring->buf += TXBB_SIZE; + } ring->log_stride = ffs(ring->stride) - 1; ring->buf_size = ring->size * ring->stride; From patchwork Wed Apr 12 17:36:34 2017 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Sumit Semwal X-Patchwork-Id: 97321 Delivered-To: patch@linaro.org Received: by 10.140.109.52 with SMTP id k49csp371933qgf; Wed, 12 Apr 2017 10:37:05 -0700 (PDT) X-Received: by 10.98.211.142 with SMTP id z14mr18282947pfk.46.1492018625227; Wed, 12 Apr 2017 10:37:05 -0700 (PDT) Return-Path: Received: from vger.kernel.org (vger.kernel.org. 
[209.132.180.67]) by mx.google.com with ESMTP id i63si21086788pgc.265.2017.04.12.10.37.05; Wed, 12 Apr 2017 10:37:05 -0700 (PDT) Received-SPF: pass (google.com: best guess record for domain of stable-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) client-ip=209.132.180.67; Authentication-Results: mx.google.com; dkim=pass header.i=@linaro.org; spf=pass (google.com: best guess record for domain of stable-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=stable-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=linaro.org Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1754511AbdDLRhE (ORCPT + 6 others); Wed, 12 Apr 2017 13:37:04 -0400 Received: from mail-pf0-f178.google.com ([209.85.192.178]:34821 "EHLO mail-pf0-f178.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1754490AbdDLRhD (ORCPT ); Wed, 12 Apr 2017 13:37:03 -0400 Received: by mail-pf0-f178.google.com with SMTP id i5so16894956pfc.2 for ; Wed, 12 Apr 2017 10:37:03 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=from:to:cc:subject:date:message-id:in-reply-to:references; bh=HN8Ce/W9bXxgkjlPC2s+oNwk9uRRDwg/kLPj8FoFmbE=; b=NLlph5mjTpTWKHrmSG+koA9kaixSUmMsRZhL+etzoEgiXyGkCH8S3KSXRM8R3PuVsz CYNNa+S34NgUxvL1jsD+DHVz7BUNKwL9chpBVdvol1ZqqbPB6rH6/flbVY/ZpTcWoFEN c6ocwDYTAjKj/PbWcKpB1uKxeuIENFJpJH3gk= X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references; bh=HN8Ce/W9bXxgkjlPC2s+oNwk9uRRDwg/kLPj8FoFmbE=; b=FDFNyKZ+A5avSgildhFXY2cQ1HdlZDdzAKYD1bBfWYRyXl2hbESdSlKLz5CMs61d9t 1gP2o6j4pFDCHyy0EO9KHGaF35D+ZYsT2ZVNyp8pEVXy65tscbXKHeevxfwZDG1W3hb4 jOO4K8uQLIKfyIb3GRkyvllmBHowZmDoqWj42g8vx5xLrgHguJnT8fxlMf1tsPhgSXK6 ycf187BWKzL4Xqx2YOLfPEvgP9D/aU7wevJA/L0iNva47yoxGr9gUPrsrwvBNJphBOU7 vRGkOLDtthrFnd43BS7Yzq7zvRpfzkv1uxNiF0xVP+zxS6RA5hHYpyxni1bup+24E26n 
QDBg== X-Gm-Message-State: AFeK/H2de3CuVi8nYYlQEn3DEnIxHi7Omg3M1JYShcn6YBb0BgA8fU2P/N46QGyQF8p6PrCy X-Received: by 10.84.215.23 with SMTP id k23mr82017029pli.58.1492018622488; Wed, 12 Apr 2017 10:37:02 -0700 (PDT) Received: from phantom.lan ([106.51.225.38]) by smtp.gmail.com with ESMTPSA id 133sm31562648pfy.106.2017.04.12.10.36.59 (version=TLS1_2 cipher=ECDHE-RSA-AES128-SHA bits=128/128); Wed, 12 Apr 2017 10:37:01 -0700 (PDT) From: Sumit Semwal To: stable@vger.kernel.org Cc: Jack Morgenstein , Matan Barak , Tariq Toukan , "David S . Miller" , Sumit Semwal Subject: [PATCH for-4.9 4/5] net/mlx4_core: Fix racy CQ (Completion Queue) free Date: Wed, 12 Apr 2017 23:06:34 +0530 Message-Id: <1492018595-13167-5-git-send-email-sumit.semwal@linaro.org> X-Mailer: git-send-email 2.7.4 In-Reply-To: <1492018595-13167-1-git-send-email-sumit.semwal@linaro.org> References: <1492018595-13167-1-git-send-email-sumit.semwal@linaro.org> Sender: stable-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: stable@vger.kernel.org From: Jack Morgenstein [ Upstream commit 291c566a28910614ce42d0ffe82196eddd6346f4 ] In functions mlx4_cq_completion() and mlx4_cq_event(), the radix_tree_lookup requires a rcu_read_lock. This is mandatory: if another core frees the CQ, it could run the radix_tree_node_rcu_free() call_rcu() callback while it is being used by the radix tree lookup function. Additionally, in function mlx4_cq_event(), since we are adding the rcu lock around the radix-tree lookup, we no longer need to take the spinlock. Also, the synchronize_irq() call for the async event eliminates the need for incrementing the cq reference count in mlx4_cq_event(). Other changes: 1. In function mlx4_cq_free(), replace spin_lock_irq with spin_lock: we no longer take this spinlock in the interrupt context. The spinlock here, therefore, simply protects against different threads simultaneously invoking mlx4_cq_free() for different cq's. 2. 
In function mlx4_cq_free(), we move the radix tree delete to before the synchronize_irq() calls. This guarantees that we will not access this cq during any subsequent interrupts, and therefore can safely free the CQ after the synchronize_irq calls. The rcu_read_lock in the interrupt handlers only needs to protect against corrupting the radix tree; the interrupt handlers may access the cq outside the rcu_read_lock due to the synchronize_irq calls which protect against premature freeing of the cq. 3. In function mlx4_cq_event(), we change the mlx4_warn message to mlx4_dbg. 4. We leave the cq reference count mechanism in place, because it is still needed for the cq completion tasklet mechanism. Fixes: 6d90aa5cf17b ("net/mlx4_core: Make sure there are no pending async events when freeing CQ") Fixes: 225c7b1feef1 ("IB/mlx4: Add a driver Mellanox ConnectX InfiniBand adapters") Signed-off-by: Jack Morgenstein Signed-off-by: Matan Barak Signed-off-by: Tariq Toukan Signed-off-by: David S. Miller Signed-off-by: Sumit Semwal --- drivers/net/ethernet/mellanox/mlx4/cq.c | 38 +++++++++++++++++---------------- 1 file changed, 20 insertions(+), 18 deletions(-) -- 2.7.4 diff --git a/drivers/net/ethernet/mellanox/mlx4/cq.c b/drivers/net/ethernet/mellanox/mlx4/cq.c index a849da9..6b86353 100644 --- a/drivers/net/ethernet/mellanox/mlx4/cq.c +++ b/drivers/net/ethernet/mellanox/mlx4/cq.c @@ -101,13 +101,19 @@ void mlx4_cq_completion(struct mlx4_dev *dev, u32 cqn) { struct mlx4_cq *cq; + rcu_read_lock(); cq = radix_tree_lookup(&mlx4_priv(dev)->cq_table.tree, cqn & (dev->caps.num_cqs - 1)); + rcu_read_unlock(); + if (!cq) { mlx4_dbg(dev, "Completion event for bogus CQ %08x\n", cqn); return; } + /* Accessing the CQ outside of rcu_read_lock is safe, because + * the CQ is freed only after interrupt handling is completed. 
+ */ ++cq->arm_sn; cq->comp(cq); @@ -118,23 +124,19 @@ void mlx4_cq_event(struct mlx4_dev *dev, u32 cqn, int event_type) struct mlx4_cq_table *cq_table = &mlx4_priv(dev)->cq_table; struct mlx4_cq *cq; - spin_lock(&cq_table->lock); - + rcu_read_lock(); cq = radix_tree_lookup(&cq_table->tree, cqn & (dev->caps.num_cqs - 1)); - if (cq) - atomic_inc(&cq->refcount); - - spin_unlock(&cq_table->lock); + rcu_read_unlock(); if (!cq) { - mlx4_warn(dev, "Async event for bogus CQ %08x\n", cqn); + mlx4_dbg(dev, "Async event for bogus CQ %08x\n", cqn); return; } + /* Accessing the CQ outside of rcu_read_lock is safe, because + * the CQ is freed only after interrupt handling is completed. + */ cq->event(cq, event_type); - - if (atomic_dec_and_test(&cq->refcount)) - complete(&cq->free); } static int mlx4_SW2HW_CQ(struct mlx4_dev *dev, struct mlx4_cmd_mailbox *mailbox, @@ -301,9 +303,9 @@ int mlx4_cq_alloc(struct mlx4_dev *dev, int nent, if (err) return err; - spin_lock_irq(&cq_table->lock); + spin_lock(&cq_table->lock); err = radix_tree_insert(&cq_table->tree, cq->cqn, cq); - spin_unlock_irq(&cq_table->lock); + spin_unlock(&cq_table->lock); if (err) goto err_icm; @@ -349,9 +351,9 @@ int mlx4_cq_alloc(struct mlx4_dev *dev, int nent, return 0; err_radix: - spin_lock_irq(&cq_table->lock); + spin_lock(&cq_table->lock); radix_tree_delete(&cq_table->tree, cq->cqn); - spin_unlock_irq(&cq_table->lock); + spin_unlock(&cq_table->lock); err_icm: mlx4_cq_free_icm(dev, cq->cqn); @@ -370,15 +372,15 @@ void mlx4_cq_free(struct mlx4_dev *dev, struct mlx4_cq *cq) if (err) mlx4_warn(dev, "HW2SW_CQ failed (%d) for CQN %06x\n", err, cq->cqn); + spin_lock(&cq_table->lock); + radix_tree_delete(&cq_table->tree, cq->cqn); + spin_unlock(&cq_table->lock); + synchronize_irq(priv->eq_table.eq[MLX4_CQ_TO_EQ_VECTOR(cq->vector)].irq); if (priv->eq_table.eq[MLX4_CQ_TO_EQ_VECTOR(cq->vector)].irq != priv->eq_table.eq[MLX4_EQ_ASYNC].irq) synchronize_irq(priv->eq_table.eq[MLX4_EQ_ASYNC].irq); - 
spin_lock_irq(&cq_table->lock); - radix_tree_delete(&cq_table->tree, cq->cqn); - spin_unlock_irq(&cq_table->lock); - if (atomic_dec_and_test(&cq->refcount)) complete(&cq->free); wait_for_completion(&cq->free); From patchwork Wed Apr 12 17:36:35 2017 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Sumit Semwal X-Patchwork-Id: 97322 Delivered-To: patch@linaro.org Received: by 10.140.109.52 with SMTP id k49csp371942qgf; Wed, 12 Apr 2017 10:37:07 -0700 (PDT) X-Received: by 10.99.160.73 with SMTP id u9mr2503033pgn.176.1492018627567; Wed, 12 Apr 2017 10:37:07 -0700 (PDT) Return-Path: Received: from vger.kernel.org (vger.kernel.org. [209.132.180.67]) by mx.google.com with ESMTP id i63si21086788pgc.265.2017.04.12.10.37.07; Wed, 12 Apr 2017 10:37:07 -0700 (PDT) Received-SPF: pass (google.com: best guess record for domain of stable-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) client-ip=209.132.180.67; Authentication-Results: mx.google.com; dkim=pass header.i=@linaro.org; spf=pass (google.com: best guess record for domain of stable-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=stable-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=linaro.org Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1754530AbdDLRhG (ORCPT + 6 others); Wed, 12 Apr 2017 13:37:06 -0400 Received: from mail-pf0-f171.google.com ([209.85.192.171]:34830 "EHLO mail-pf0-f171.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1754490AbdDLRhG (ORCPT ); Wed, 12 Apr 2017 13:37:06 -0400 Received: by mail-pf0-f171.google.com with SMTP id i5so16895373pfc.2 for ; Wed, 12 Apr 2017 10:37:06 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=from:to:cc:subject:date:message-id:in-reply-to:references; bh=I6bwftd2TfZy2I8f87bVxhyBSf2PG3P1ZI6/uUGfJJw=; 
b=RFO9AlnJ7lgNVXYWi+6EjmDbt5NgpC7jKKGwhE1OahK3OVIeilHIdzQeeTp1jrdeI9 utfrIbTzDR9opmxLustsZZD8HFerJGqxJYR74McQfyKa0hj+7pzv+v/oAtWp0o6oddIf zl6kIlFuldTB7guNFEC3YUSwOYZKVZs3Um7AQ= X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references; bh=I6bwftd2TfZy2I8f87bVxhyBSf2PG3P1ZI6/uUGfJJw=; b=KM0XvkLfNdRgrjSyZjF4PswJn1baHnEWdGFW8AVn6cmhiwQ1dgl4sj39n+qUoM/rV0 03g28FzJe2BWGnx6/OSN7livgDvmp+6mniTWYY7yiMzqQRvuI5IqoPhbpNkUtt3ZGc/O 2S4NV8byXgKejlxFFfvO2ib0wrz8L/lwbCJ0FOUtiT99bslPmCO8szqYSxugDQz8qWU7 s0W8flqFgl9bi/pg18m79s3iZmCf81GdkmrFFP6cBa1xvvdldiT3fKsuJOl6NbTShIhq P/hoK3D19EvvMARFQxwY5raOsRcG+SY0dBm9W452r9AZ9KG9jhqpgeqoYeWbgIzUfcQY fbdw== X-Gm-Message-State: AFeK/H2/Ix8vx3N8gAZef65fU0hAjo4ssjmcEEgoz5qczl8eCQl91D+DEJiY6U8Sdi4hzO3+ X-Received: by 10.98.156.23 with SMTP id f23mr67120351pfe.3.1492018625354; Wed, 12 Apr 2017 10:37:05 -0700 (PDT) Received: from phantom.lan ([106.51.225.38]) by smtp.gmail.com with ESMTPSA id 133sm31562648pfy.106.2017.04.12.10.37.02 (version=TLS1_2 cipher=ECDHE-RSA-AES128-SHA bits=128/128); Wed, 12 Apr 2017 10:37:04 -0700 (PDT) From: Sumit Semwal To: stable@vger.kernel.org Cc: Jack Morgenstein , Tariq Toukan , "David S . 
Miller" , Sumit Semwal Subject: [PATCH for-4.9 5/5] net/mlx4_core: Fix when to save some qp context flags for dynamic VST to VGT transitions Date: Wed, 12 Apr 2017 23:06:35 +0530 Message-Id: <1492018595-13167-6-git-send-email-sumit.semwal@linaro.org> X-Mailer: git-send-email 2.7.4 In-Reply-To: <1492018595-13167-1-git-send-email-sumit.semwal@linaro.org> References: <1492018595-13167-1-git-send-email-sumit.semwal@linaro.org> Sender: stable-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: stable@vger.kernel.org From: Jack Morgenstein [ Upstream commit 7c3945bc2073554bb2ecf983e073dee686679c53 ] Save the qp context flags byte containing the flag disabling vlan stripping in the RESET to INIT qp transition, rather than in the INIT to RTR transition. Per the firmware spec, the flags in this byte are active in the RESET to INIT transition. As a result of saving the flags in the incorrect qp transition, when switching dynamically from VGT to VST and back to VGT, the vlan remained stripped (as is required for VST) and did not return to not-stripped (as is required for VGT). Fixes: f0f829bf42cd ("net/mlx4_core: Add immediate activate for VGT->VST->VGT") Signed-off-by: Jack Morgenstein Signed-off-by: Tariq Toukan Signed-off-by: David S. 
Miller Signed-off-by: Sumit Semwal --- drivers/net/ethernet/mellanox/mlx4/resource_tracker.c | 5 +++-- 1 file changed, 3 insertions(+), 2 deletions(-) -- 2.7.4 diff --git a/drivers/net/ethernet/mellanox/mlx4/resource_tracker.c b/drivers/net/ethernet/mellanox/mlx4/resource_tracker.c index c548bea..32f76bf 100644 --- a/drivers/net/ethernet/mellanox/mlx4/resource_tracker.c +++ b/drivers/net/ethernet/mellanox/mlx4/resource_tracker.c @@ -2980,6 +2980,9 @@ int mlx4_RST2INIT_QP_wrapper(struct mlx4_dev *dev, int slave, put_res(dev, slave, srqn, RES_SRQ); qp->srq = srq; } + + /* Save param3 for dynamic changes from VST back to VGT */ + qp->param3 = qpc->param3; put_res(dev, slave, rcqn, RES_CQ); put_res(dev, slave, mtt_base, RES_MTT); res_end_move(dev, slave, RES_QP, qpn); @@ -3772,7 +3775,6 @@ int mlx4_INIT2RTR_QP_wrapper(struct mlx4_dev *dev, int slave, int qpn = vhcr->in_modifier & 0x7fffff; struct res_qp *qp; u8 orig_sched_queue; - __be32 orig_param3 = qpc->param3; u8 orig_vlan_control = qpc->pri_path.vlan_control; u8 orig_fvl_rx = qpc->pri_path.fvl_rx; u8 orig_pri_path_fl = qpc->pri_path.fl; @@ -3814,7 +3816,6 @@ int mlx4_INIT2RTR_QP_wrapper(struct mlx4_dev *dev, int slave, */ if (!err) { qp->sched_queue = orig_sched_queue; - qp->param3 = orig_param3; qp->vlan_control = orig_vlan_control; qp->fvl_rx = orig_fvl_rx; qp->pri_path_fl = orig_pri_path_fl;