From patchwork Thu Feb 28 14:50:58 2019
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: John Garry
X-Patchwork-Id: 159379
Delivered-To: patch@linaro.org
From: John Garry
To: ,
CC: , , , Xiang Chen , "John Garry"
Subject: [PATCH 2/6] scsi: hisi_sas: Fix a timeout race of driver internal and SMP IO
Date: Thu, 28 Feb 2019 22:50:58 +0800
Message-ID: <1551365462-128193-3-git-send-email-john.garry@huawei.com>
X-Mailer: git-send-email 2.8.1
In-Reply-To: <1551365462-128193-1-git-send-email-john.garry@huawei.com>
References: <1551365462-128193-1-git-send-email-john.garry@huawei.com>
X-Originating-IP: [10.67.212.75]
X-CFilter-Loop: Reflected
Sender: linux-kernel-owner@vger.kernel.org
Precedence: bulk
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org

From: Xiang Chen

Internal IO and SMP IO each have a time-out timer.
In the timer handler, the driver checks whether the IO is done by testing
SAS_TASK_STATE_DONE in task->task_state_flags (read under
task->task_state_lock).

This can leave the system hung: an internal IO or SMP IO is sent, but
because of a hardware exception (such as an injected 2-bit ECC error) the
IO neither completes nor has timed out yet. At that point a SAS controller
reset occurs to recover the system. The reset releases the IO's resources
and sets its state to SAS_TASK_STATE_DONE, so when the IO later times out,
the timer handler treats it as done, never signals the completion, and the
waiter waits forever.

[ 729.123632] Call trace:
[ 729.126791] [] __switch_to+0x94/0xa8
[ 729.133106] [] __schedule+0x1e8/0x7fc
[ 729.138975] [] schedule+0x34/0x8c
[ 729.144401] [] schedule_timeout+0x1d8/0x3cc
[ 729.150690] [] wait_for_common+0xdc/0x1a0
[ 729.157101] [] wait_for_completion+0x28/0x34
[ 729.165973] [] hisi_sas_internal_task_abort+0x2a0/0x424 [hisi_sas_test_main]
[ 729.176447] [] hisi_sas_abort_task+0x244/0x2d8 [hisi_sas_test_main]
[ 729.185258] [] sas_eh_handle_sas_errors+0x1c8/0x7b8
[ 729.192391] [] sas_scsi_recover_host+0x130/0x398
[ 729.199237] [] scsi_error_handler+0x148/0x5c0
[ 729.206009] [] kthread+0x10c/0x138
[ 729.211563] [] ret_from_fork+0x10/0x18

To solve the issue, the task_done callback of those IOs needs to be called
on SAS controller reset.

Signed-off-by: Xiang Chen
Signed-off-by: John Garry
---
 drivers/scsi/hisi_sas/hisi_sas_main.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

-- 
2.17.1

diff --git a/drivers/scsi/hisi_sas/hisi_sas_main.c b/drivers/scsi/hisi_sas/hisi_sas_main.c
index 923296653ed7..dd03dcbd3786 100644
--- a/drivers/scsi/hisi_sas/hisi_sas_main.c
+++ b/drivers/scsi/hisi_sas/hisi_sas_main.c
@@ -980,7 +980,8 @@ static void hisi_sas_do_release_task(struct hisi_hba *hisi_hba, struct sas_task
 		spin_lock_irqsave(&task->task_state_lock, flags);
 		task->task_state_flags &=
 			~(SAS_TASK_STATE_PENDING | SAS_TASK_AT_INITIATOR);
-		task->task_state_flags |= SAS_TASK_STATE_DONE;
+		if (!slot->is_internal && task->task_proto != SAS_PROTOCOL_SMP)
+			task->task_state_flags |= SAS_TASK_STATE_DONE;
 		spin_unlock_irqrestore(&task->task_state_lock, flags);
 	}