From: Zhen Lei
To: Joerg Roedel, iommu, Robin Murphy, David Woodhouse, Sudeep Dutt,
    Ashutosh Dixit, linux-kernel
CC: Zefan Li, Xinwei Hu, Tianhong Ding, Hanjun Guo, Zhen Lei
Subject: [PATCH v2 0/7] iommu/iova: improve the allocation performance of dma64
Date: Fri, 31 Mar 2017 11:24:18 +0800
Message-ID: <1490930665-9696-1-git-send-email-thunder.leizhen@huawei.com>

v1 -> v2:
Because of a problem with my email server, all patches sent to Joerg
Roedel failed to be delivered. I am reposting the whole series; there
are no changes.

v1:
64-bit devices are very common now, but currently we only define a
cached32_node to optimize the allocation performance of dma32. I have
also seen some dma64 drivers choose to allocate iova from the dma32
space first, maybe because of the current dma64 performance problem or
for some other reason. For example (in drivers/iommu/amd_iommu.c):

	static unsigned long dma_ops_alloc_iova(......
	{
		......
		if (dma_mask > DMA_BIT_MASK(32))
			pfn = alloc_iova_fast(&dma_dom->iovad, pages,
					      IOVA_PFN(DMA_BIT_MASK(32)));

		if (!pfn)
			pfn = alloc_iova_fast(&dma_dom->iovad, pages,
					      IOVA_PFN(dma_mask));

For details of why dma64 iova allocation performance is so bad, please
refer to the description of patch 5.

In this patch series, I add a cached64_node to manage the dma64 iova
space (iova >= 4G). It has the same effect for that range as
cached32_node has for the dma32 space (iova < 4G).
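
To illustrate the idea of keeping one cached position per range, here is
a toy user-space sketch. It is not kernel code and not the
implementation in this series (see patch 5 for that). Every name in it
(toy_iovad, toy_alloc, the TOY_* limits) is hypothetical, and it assumes
4 KiB pages, a 48-bit IOVA space, and that a dma64 request is served
only from the range at or above 4G. Each range remembers where its last
top-down allocation ended, so the next request resumes from there
instead of searching again from the top.

```c
#include <assert.h>
#include <stdint.h>

/* All names below are hypothetical; this is a user-space toy, not kernel code. */
#define TOY_DMA32_LAST_PFN 0xfffffULL          /* last 4 KiB pfn below 4G */
#define TOY_DMA64_LAST_PFN ((1ULL << 36) - 1)  /* assumed 48-bit IOVA space */

struct toy_iovad {
	uint64_t cached32_pfn;	/* dma32 searches resume below this pfn */
	uint64_t cached64_pfn;	/* dma64 searches resume below this pfn */
};

static void toy_init(struct toy_iovad *d)
{
	d->cached32_pfn = TOY_DMA32_LAST_PFN + 1;
	d->cached64_pfn = TOY_DMA64_LAST_PFN + 1;
}

/*
 * Allocate `size` pfns top-down, no higher than limit_pfn.
 * Returns the lowest pfn of the new range, or 0 on failure.
 */
static uint64_t toy_alloc(struct toy_iovad *d, uint64_t size, uint64_t limit_pfn)
{
	int is32 = limit_pfn <= TOY_DMA32_LAST_PFN;
	uint64_t *cursor = is32 ? &d->cached32_pfn : &d->cached64_pfn;
	uint64_t floor = is32 ? 1 : TOY_DMA32_LAST_PFN + 1;
	uint64_t top;

	assert(size > 0);
	if (limit_pfn > TOY_DMA64_LAST_PFN)
		limit_pfn = TOY_DMA64_LAST_PFN;

	/* Resume just below the previous allocation in this range. */
	top = *cursor < limit_pfn + 1 ? *cursor : limit_pfn + 1;
	if (top < floor || top - floor < size)
		return 0;

	*cursor = top - size;
	return *cursor;
}
```

In this sketch, a request with a dma32 limit only moves cached32_pfn and
a request with a larger limit only moves cached64_pfn, so neither range
pays for allocations made in the other.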

Below is the performance data before and after my patch series:

(before)$ iperf -s
------------------------------------------------------------
Server listening on TCP port 5001
TCP window size: 85.3 KByte (default)
------------------------------------------------------------
[  4] local 192.168.1.106 port 5001 connected with 192.168.1.198 port 35898
[ ID] Interval       Transfer     Bandwidth
[  4]  0.0-10.2 sec  7.88 MBytes  6.48 Mbits/sec
[  5] local 192.168.1.106 port 5001 connected with 192.168.1.198 port 35900
[  5]  0.0-10.3 sec  7.88 MBytes  6.43 Mbits/sec
[  4] local 192.168.1.106 port 5001 connected with 192.168.1.198 port 35902
[  4]  0.0-10.3 sec  7.88 MBytes  6.43 Mbits/sec

(after)$ iperf -s
------------------------------------------------------------
Server listening on TCP port 5001
TCP window size: 85.3 KByte (default)
------------------------------------------------------------
[  4] local 192.168.1.106 port 5001 connected with 192.168.1.198 port 36330
[ ID] Interval       Transfer     Bandwidth
[  4]  0.0-10.0 sec  1.09 GBytes   933 Mbits/sec
[  5] local 192.168.1.106 port 5001 connected with 192.168.1.198 port 36332
[  5]  0.0-10.0 sec  1.10 GBytes   939 Mbits/sec
[  4] local 192.168.1.106 port 5001 connected with 192.168.1.198 port 36334
[  4]  0.0-10.0 sec  1.10 GBytes   938 Mbits/sec

Zhen Lei (7):
  iommu/iova: fix incorrect variable types
  iommu/iova: cut down judgement times
  iommu/iova: insert start_pfn boundary of dma32
  iommu/iova: adjust __cached_rbnode_insert_update
  iommu/iova: to optimize the allocation performance of dma64
  iommu/iova: move the caculation of pad mask out of loop
  iommu/iova: fix iovad->dma_32bit_pfn as the last pfn of dma32

 drivers/iommu/amd_iommu.c        |   7 +-
 drivers/iommu/dma-iommu.c        |  22 ++----
 drivers/iommu/intel-iommu.c      |  11 +--
 drivers/iommu/iova.c             | 143 +++++++++++++++++++++------------------
 drivers/misc/mic/scif/scif_rma.c |   3 +-
 include/linux/iova.h             |   7 +-
 6 files changed, 94 insertions(+), 99 deletions(-)

-- 
2.5.0