From patchwork Sun Aug 19 07:51:11 2018
X-Patchwork-Submitter: "Leizhen (ThunderTown)"
X-Patchwork-Id: 144525
From: Zhen Lei
To: Robin Murphy, Will Deacon, Joerg Roedel, linux-arm-kernel, iommu, linux-kernel
CC: Zhen Lei, LinuxArm, Hanjun Guo, Libin, John Garry
Subject: [PATCH v4 2/2] iommu/arm-smmu-v3: avoid redundant CMD_SYNCs if possible
Date: Sun, 19 Aug 2018 15:51:11 +0800
Message-ID: <1534665071-7976-3-git-send-email-thunder.leizhen@huawei.com>
In-Reply-To: <1534665071-7976-1-git-send-email-thunder.leizhen@huawei.com>
References: <1534665071-7976-1-git-send-email-thunder.leizhen@huawei.com>

Two or more CMD_SYNCs may be adjacent in the command queue, and the first
one already does what the others intend to do. Dropping the redundant
CMD_SYNCs can improve I/O performance, especially under heavy load.

In my test environment, the number of CMD_SYNCs was reduced by about 1/3.
See below:
	CMD_SYNCs reduced:	19542181
	CMD_SYNCs total:	58098548	(include reduced)
	CMDs total:		116197099	(TLBI:SYNC about 1:1)

Signed-off-by: Zhen Lei
---
 drivers/iommu/arm-smmu-v3.c | 22 +++++++++++++++++++---
 1 file changed, 19 insertions(+), 3 deletions(-)

-- 
1.8.3

diff --git a/drivers/iommu/arm-smmu-v3.c b/drivers/iommu/arm-smmu-v3.c
index ac6d6df..f3a56e1 100644
--- a/drivers/iommu/arm-smmu-v3.c
+++ b/drivers/iommu/arm-smmu-v3.c
@@ -567,6 +567,7 @@ struct arm_smmu_device {
 	int				gerr_irq;
 	int				combined_irq;
 	u32				sync_nr;
+	u8				prev_cmd_opcode;
 
 	unsigned long			ias; /* IPA */
 	unsigned long			oas; /* PA */
@@ -786,6 +787,11 @@ void arm_smmu_cmdq_build_sync_msi_cmd(u64 *cmd, struct arm_smmu_cmdq_ent *ent)
 	cmd[1] = ent->sync.msiaddr & CMDQ_SYNC_1_MSIADDR_MASK;
 }
 
+static inline u8 arm_smmu_cmd_opcode_get(u64 *cmd)
+{
+	return cmd[0] & CMDQ_0_OP;
+}
+
 /* High-level queue accessors */
 static int arm_smmu_cmdq_build_cmd(u64 *cmd, struct arm_smmu_cmdq_ent *ent)
 {
@@ -906,6 +912,8 @@ static void arm_smmu_cmdq_insert_cmd(struct arm_smmu_device *smmu, u64 *cmd)
 	struct arm_smmu_queue *q = &smmu->cmdq.q;
 	bool wfe = !!(smmu->features & ARM_SMMU_FEAT_SEV);
 
+	smmu->prev_cmd_opcode = arm_smmu_cmd_opcode_get(cmd);
+
 	while (queue_insert_raw(q, cmd) == -ENOSPC) {
 		if (queue_poll_cons(q, false, wfe))
 			dev_err_ratelimited(smmu->dev, "CMDQ timeout\n");
@@ -958,9 +966,17 @@ static int __arm_smmu_cmdq_issue_sync_msi(struct arm_smmu_device *smmu)
 	};
 
 	spin_lock_irqsave(&smmu->cmdq.lock, flags);
-	ent.sync.msidata = ++smmu->sync_nr;
-	arm_smmu_cmdq_build_sync_msi_cmd(cmd, &ent);
-	arm_smmu_cmdq_insert_cmd(smmu, cmd);
+	if (smmu->prev_cmd_opcode == CMDQ_OP_CMD_SYNC) {
+		/*
+		 * Previous command is CMD_SYNC also, there is no need to add
+		 * one more. Just poll it.
+		 */
+		ent.sync.msidata = smmu->sync_nr;
+	} else {
+		ent.sync.msidata = ++smmu->sync_nr;
+		arm_smmu_cmdq_build_sync_msi_cmd(cmd, &ent);
+		arm_smmu_cmdq_insert_cmd(smmu, cmd);
+	}
 	spin_unlock_irqrestore(&smmu->cmdq.lock, flags);
 
 	return __arm_smmu_sync_poll_msi(smmu, ent.sync.msidata);
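
To show the idea outside the driver, here is a minimal, self-contained sketch in
plain C. All names in it (fake_queue, OP_SYNC, issue_sync, ...) are made up for
illustration; this is not the SMMUv3 driver code. It only mirrors the
prev_cmd_opcode check added above: when the previously queued command was already
a CMD_SYNC, reuse its sequence number and poll for it instead of queueing another.

/*
 * Stand-alone model of coalescing adjacent sync commands. Hypothetical
 * names throughout; it mirrors __arm_smmu_cmdq_issue_sync_msi() in spirit
 * only.
 */
#include <stdio.h>
#include <stdint.h>

enum fake_opcode { OP_TLBI = 0x12, OP_SYNC = 0x46 };

struct fake_queue {
	uint8_t  prev_opcode;	/* opcode of the last command inserted */
	uint32_t sync_nr;	/* sequence number of the last sync queued */
	uint64_t cmds_issued;
	uint64_t syncs_skipped;
};

static void insert_cmd(struct fake_queue *q, enum fake_opcode op)
{
	q->prev_opcode = op;	/* remember opcode, as the patch does */
	q->cmds_issued++;
}

/* Returns the sync sequence number the caller should poll for. */
static uint32_t issue_sync(struct fake_queue *q)
{
	if (q->prev_opcode == OP_SYNC) {
		/* Previous command is already a sync: just poll it. */
		q->syncs_skipped++;
		return q->sync_nr;
	}
	q->sync_nr++;
	insert_cmd(q, OP_SYNC);
	return q->sync_nr;
}

int main(void)
{
	struct fake_queue q = { 0 };

	/* One invalidation followed by two sync requests in a row. */
	insert_cmd(&q, OP_TLBI);
	issue_sync(&q);
	issue_sync(&q);	/* coalesced: no second sync is queued */

	printf("commands queued: %llu, syncs skipped: %llu\n",
	       (unsigned long long)q.cmds_issued,
	       (unsigned long long)q.syncs_skipped);
	return 0;
}

Running the sketch queues two commands and skips one sync, which is the same
effect the statistics above report at scale for back-to-back unmaps.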