From patchwork Sun Aug 19 07:51:09 2018
X-Patchwork-Submitter: "Leizhen \(ThunderTown\)"
X-Patchwork-Id: 144526
From: Zhen Lei
To: Robin Murphy, Will Deacon, Joerg Roedel, linux-arm-kernel, iommu, linux-kernel
CC: Zhen Lei, LinuxArm, Hanjun Guo, Libin, John Garry
Subject: [PATCH v4 0/2] bugfix and optimization about CMD_SYNC
Date: Sun, 19 Aug 2018 15:51:09 +0800
Message-ID: <1534665071-7976-1-git-send-email-thunder.leizhen@huawei.com>
X-Mailing-List: linux-kernel@vger.kernel.org

v3->v4:
1. Create a new function, arm_smmu_cmdq_build_sync_msi_cmd, which is only
   used to build CMD_SYNC for CS=SIG_IRQ mode.
2. To observe the optimization effect, I ran 5 tests for each case.
   Although the results fluctuate, we can still tell which case is better
   and which is worse.

   Test command: fio -numjobs=8 -rw=randread -runtime=30 ... -bs=4k
   Test Result: IOPS
   Case 1: (without these patches)
     675480 672055 665275 648610 661146
   Case 2: (only apply the variant of patch 1, which moves
            arm_smmu_cmdq_build_cmd into the lock)
     688714 697355 632951 700540 678459
   Case 3: (only apply patch 1)
     721582 729226 689574 679710 727770
   Case 4: (apply both patch 1 and patch 2)
     734077 742868 738194 682544 740586

v2 -> v3:
I have no data showing how much performance is affected when
arm_smmu_cmdq_build_cmd is protected by the spinlock, but performance is
bound to drop: the function contains a memset operation and a complicated
switch..case.

v1 -> v2:
1. Move the call to arm_smmu_cmdq_build_cmd into the critical section, and
   keep the function itself unchanged.
2. Although patch 2 makes sure that no two CMD_SYNCs are adjacent, patch 1
   is still needed; see below:

   cpu0               cpu1                     cpu2
   msidata=0
                      msidata=1
                      insert cmd1
                                               insert a TLBI command
   insert cmd0
                      smmu execute cmd1
                                               smmu execute TLBI
   smmu execute cmd0
                      poll timeout, because msidata=1 is overwritten by
                      cmd0, which means VAL=0 while sync_idx=1.

Zhen Lei (2):
  iommu/arm-smmu-v3: fix unexpected CMD_SYNC timeout
  iommu/arm-smmu-v3: avoid redundant CMD_SYNCs if possible

 drivers/iommu/arm-smmu-v3.c | 44 ++++++++++++++++++++++++++++++++------------
 1 file changed, 32 insertions(+), 12 deletions(-)

-- 
1.8.3
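
The cpu0/cpu1/cpu2 interleaving described in the v1 -> v2 notes can be
replayed with a small model. This is only a sketch and not driver code: the
names replay and poll_ok are hypothetical, and it assumes (a) each completed
CMD_SYNC overwrites a single shared word with its msidata, (b) a waiter
succeeds when the signed 32-bit difference between that word and its own
sync_idx is non-negative, and (c) the waiter samples the word only after all
three commands have been consumed.

```python
def replay(queue_order):
    """Replay commands in the order the SMMU consumes them.

    queue_order holds the msidata of each CMD_SYNC; None stands for a
    non-sync command such as a TLBI. Returns the final value of the
    shared MSI target word (VAL in the cover letter).
    """
    val = 0
    for msidata in queue_order:
        if msidata is not None:
            val = msidata  # completion write overwrites the shared word
    return val


def poll_ok(val, sync_idx):
    """Assumed completion test: signed 32-bit (val - sync_idx) >= 0."""
    diff = (val - sync_idx) & 0xFFFFFFFF
    if diff >= 0x80000000:
        diff -= 1 << 32
    return diff >= 0


# Cover-letter scenario: cpu0 takes msidata=0 and cpu1 takes msidata=1,
# but cpu1 queues first, then cpu2's TLBI, then cpu0.
racy = replay([1, None, 0])

# Hypothetical ordering in which msidata follows queue-insertion order.
ordered = replay([0, None, 1])

if __name__ == "__main__":
    print("racy:    VAL=%d, cpu1 (sync_idx=1) ok=%s" % (racy, poll_ok(racy, 1)))
    print("ordered: VAL=%d, cpu1 (sync_idx=1) ok=%s" % (ordered, poll_ok(ordered, 1)))
```

In the racy order the word ends at 0, so cpu1's wait for sync_idx=1 can
never be satisfied under these assumptions; when msidata matches the queue
order the word ends at 1 and the same wait succeeds.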
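
Because the five runs per case fluctuate, comparing per-case means makes the
trend in the v3->v4 test results easier to see. The sketch below only
averages the IOPS figures quoted in this letter; the summarize helper and
the choice of case 1 as the baseline are assumptions made for illustration.

```python
from statistics import mean

# IOPS samples copied from the v3->v4 test results.
iops = {
    "case 1": [675480, 672055, 665275, 648610, 661146],
    "case 2": [688714, 697355, 632951, 700540, 678459],
    "case 3": [721582, 729226, 689574, 679710, 727770],
    "case 4": [734077, 742868, 738194, 682544, 740586],
}


def summarize(samples, baseline="case 1"):
    """Return {case: (mean IOPS, percent gain over the baseline mean)}."""
    base = mean(samples[baseline])
    return {
        name: (mean(vals), 100.0 * (mean(vals) - base) / base)
        for name, vals in samples.items()
    }


if __name__ == "__main__":
    for name, (avg, gain) in summarize(iops).items():
        print("%s: mean %.1f IOPS, %+.2f%% vs case 1" % (name, avg, gain))
```

With these figures the means are about 664513, 679604, 709572 and 727654
IOPS for cases 1 to 4, i.e. roughly +2.3%, +6.8% and +9.5% over case 1.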