From patchwork Wed Mar 22 06:27:40 2017
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: "Leizhen \(ThunderTown\)"
X-Patchwork-Id: 95693
Delivered-To: patch@linaro.org
Received: by 10.140.89.233 with SMTP id v96csp86205qgd;
 Tue, 21 Mar 2017 23:29:39 -0700 (PDT)
X-Received: by 10.99.42.78 with SMTP id q75mr41909636pgq.144.1490164179617;
 Tue, 21 Mar 2017 23:29:39 -0700 (PDT)
Return-Path:
Received: from vger.kernel.org (vger.kernel.org. [209.132.180.67])
 by mx.google.com with ESMTP id y18si554815pgf.390.2017.03.21.23.29.39;
 Tue, 21 Mar 2017 23:29:39 -0700 (PDT)
Received-SPF: pass (google.com: best guess record for domain of
 linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted
 sender) client-ip=209.132.180.67;
Authentication-Results: mx.google.com;
 spf=pass (google.com: best guess record for domain of
 linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted
 sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
 id S1758785AbdCVG3f (ORCPT + 16 others);
 Wed, 22 Mar 2017 02:29:35 -0400
Received: from szxga02-in.huawei.com ([45.249.212.188]:4347 "EHLO
 dggrg02-dlp.huawei.com" rhost-flags-OK-FAIL-OK-FAIL)
 by vger.kernel.org with ESMTP id S1757828AbdCVG32 (ORCPT );
 Wed, 22 Mar 2017 02:29:28 -0400
Received: from 172.30.72.54 (EHLO DGGEML404-HUB.china.huawei.com)
 ([172.30.72.54]) by dggrg02-dlp.huawei.com (MOS 4.4.6-GA FastPath queued)
 with ESMTP id AKH60766; Wed, 22 Mar 2017 14:29:21 +0800 (CST)
Received: from localhost (10.177.23.164) by DGGEML404-HUB.china.huawei.com
 (10.3.17.39) with Microsoft SMTP Server id 14.3.301.0;
 Wed, 22 Mar 2017 14:29:08 +0800
From: Zhen Lei
To: Joerg Roedel, iommu, Robin Murphy, David Woodhouse, Sudeep Dutt,
 Ashutosh Dixit, linux-kernel
CC: Zefan Li, Xinwei Hu, "Tianhong Ding", Hanjun Guo, Zhen Lei
Subject: [PATCH 0/7] iommu/iova: improve the allocation
 performance of dma64
Date: Wed, 22 Mar 2017 14:27:40 +0800
Message-ID: <1490164067-12552-1-git-send-email-thunder.leizhen@huawei.com>
X-Mailer: git-send-email 1.9.5.msysgit.0
MIME-Version: 1.0
X-Originating-IP: [10.177.23.164]
X-CFilter-Loop: Reflected
X-Mirapoint-Virus-RAPID-Raw: score=unknown(0),
 refid=str=0001.0A0B0206.58D219C3.08CC, ss=1, re=0.000, recu=0.000,
 reip=0.000, cl=1, cld=1, fgs=0, ip=0.0.0.0, so=2014-11-16 11:51:01,
 dmn=2013-03-21 17:37:32
X-Mirapoint-Loop-Id: 91c26b95fc4d57b9742e25e4682dd0ca
Sender: linux-kernel-owner@vger.kernel.org
Precedence: bulk
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org

64-bit devices are very common now, but currently only a cached32_node is
defined to optimize the allocation performance of dma32. I have seen some
dma64 drivers choose to allocate iova from the dma32 space first, maybe
because of the current dma64 performance problem or for some other reason.
For example (in drivers/iommu/amd_iommu.c):

static unsigned long dma_ops_alloc_iova(......
{
	......

	if (dma_mask > DMA_BIT_MASK(32))
		pfn = alloc_iova_fast(&dma_dom->iovad, pages,
				      IOVA_PFN(DMA_BIT_MASK(32)));

	if (!pfn)
		pfn = alloc_iova_fast(&dma_dom->iovad, pages, IOVA_PFN(dma_mask));

For details of why dma64 iova allocation performance is very bad, please
refer to the description of patch-5.

In this patch series, I add a cached64_node to manage the dma64 iova space
(iova >= 4G). It has the same effect for that space as cached32_node has for
the dma32 space (iova < 4G).
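
To illustrate why a cached node matters, here is a minimal userspace sketch.
It is a toy model and not the code from this series: struct toy_range,
struct toy_domain, toy_alloc(), toy_run() and TOY_LIMIT_PFN are hypothetical
names, a linked list stands in for the rbtree, and freeing is not modelled.
It assumes that an uncached top-down search restarts from the highest
allocated range on every call, while a cached search resumes just below the
previous allocation.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical limit: a pfn well above the 4G boundary (0xFFFFF with 4K pages). */
#define TOY_LIMIT_PFN 0x7FFFFFFFUL

/* Toy stand-in for an allocated iova range; ranges are linked from high to low pfn. */
struct toy_range {
	unsigned long pfn_lo, pfn_hi;
	struct toy_range *lower;	/* next range at lower pfns, or NULL */
};

struct toy_domain {
	struct toy_range *top;		/* highest allocated range */
	struct toy_range *cached;	/* most recent allocation, or NULL */
	int use_cache;			/* 0: always search from the top */
	unsigned long steps;		/* ranges skipped while searching */
};

/* Allocate `size` pfns at or below limit_pfn; returns pfn_lo, or 0 on failure. */
static unsigned long toy_alloc(struct toy_domain *d, unsigned long size,
			       unsigned long limit_pfn)
{
	struct toy_range *above = NULL, *curr = d->top, *new_range;

	assert(size > 0);
	if (d->use_cache && d->cached && d->cached->pfn_lo <= limit_pfn) {
		/* Resume just below the previous allocation. */
		above = d->cached;
		limit_pfn = above->pfn_lo - 1;
		curr = above->lower;
	}
	/* Walk down until the gap above `curr` can hold `size` pfns. */
	while (curr && curr->pfn_hi + size > limit_pfn) {
		d->steps++;
		if (curr->pfn_lo <= limit_pfn)
			limit_pfn = curr->pfn_lo - 1;
		above = curr;
		curr = curr->lower;
	}
	if (!curr && limit_pfn < size)
		return 0;		/* no room left; pfn 0 is never handed out */

	new_range = malloc(sizeof(*new_range));
	if (!new_range)
		return 0;
	new_range->pfn_hi = limit_pfn;
	new_range->pfn_lo = limit_pfn - size + 1;
	new_range->lower = curr;
	if (above)
		above->lower = new_range;
	else
		d->top = new_range;
	d->cached = new_range;
	return new_range->pfn_lo;
}

/* Allocate `count` single-pfn ranges and return how many ranges were skipped. */
static unsigned long toy_run(int use_cache, unsigned long count)
{
	struct toy_domain d = { NULL, NULL, use_cache, 0 };
	unsigned long i;

	for (i = 0; i < count; i++)
		toy_alloc(&d, 1, TOY_LIMIT_PFN);
	while (d.top) {			/* release the toy list */
		struct toy_range *next = d.top->lower;

		free(d.top);
		d.top = next;
	}
	return d.steps;
}
```

In this toy model, toy_run(0, 1000) skips 499500 ranges in total, because
allocation number k has to step over the k ranges already packed below the
limit, while toy_run(1, 1000) skips none. The real allocator also has to keep
the cached node valid when ranges are freed, which this sketch leaves out.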

Below is the performance data before and after my patch series:

(before)$ iperf -s
------------------------------------------------------------
Server listening on TCP port 5001
TCP window size: 85.3 KByte (default)
------------------------------------------------------------
[  4] local 192.168.1.106 port 5001 connected with 192.168.1.198 port 35898
[ ID] Interval       Transfer     Bandwidth
[  4]  0.0-10.2 sec  7.88 MBytes  6.48 Mbits/sec
[  5] local 192.168.1.106 port 5001 connected with 192.168.1.198 port 35900
[  5]  0.0-10.3 sec  7.88 MBytes  6.43 Mbits/sec
[  4] local 192.168.1.106 port 5001 connected with 192.168.1.198 port 35902
[  4]  0.0-10.3 sec  7.88 MBytes  6.43 Mbits/sec

(after)$ iperf -s
------------------------------------------------------------
Server listening on TCP port 5001
TCP window size: 85.3 KByte (default)
------------------------------------------------------------
[  4] local 192.168.1.106 port 5001 connected with 192.168.1.198 port 36330
[ ID] Interval       Transfer     Bandwidth
[  4]  0.0-10.0 sec  1.09 GBytes   933 Mbits/sec
[  5] local 192.168.1.106 port 5001 connected with 192.168.1.198 port 36332
[  5]  0.0-10.0 sec  1.10 GBytes   939 Mbits/sec
[  4] local 192.168.1.106 port 5001 connected with 192.168.1.198 port 36334
[  4]  0.0-10.0 sec  1.10 GBytes   938 Mbits/sec

Zhen Lei (7):
  iommu/iova: fix incorrect variable types
  iommu/iova: cut down judgement times
  iommu/iova: insert start_pfn boundary of dma32
  iommu/iova: adjust __cached_rbnode_insert_update
  iommu/iova: to optimize the allocation performance of dma64
  iommu/iova: move the caculation of pad mask out of loop
  iommu/iova: fix iovad->dma_32bit_pfn as the last pfn of dma32

 drivers/iommu/amd_iommu.c        |   7 +-
 drivers/iommu/dma-iommu.c        |  22 ++----
 drivers/iommu/intel-iommu.c      |  11 +--
 drivers/iommu/iova.c             | 143 +++++++++++++++++++++------------------
 drivers/misc/mic/scif/scif_rma.c |   3 +-
 include/linux/iova.h             |   7 +-
 6 files changed, 94 insertions(+), 99 deletions(-)

-- 
2.5.0