From patchwork Mon Jun 18 07:50:59 2012
X-Patchwork-Submitter: Hiroshi Doyu
X-Patchwork-Id: 9381
Date: Mon, 18 Jun 2012 10:50:59 +0300
From: Hiroshi Doyu
To: Marek Szyprowski
Message-ID: <20120618105059.12c709d68240ad18c5f8c7a5@nvidia.com>
In-Reply-To: <1338988657-20770-1-git-send-email-m.szyprowski@samsung.com>
References: <1338988657-20770-1-git-send-email-m.szyprowski@samsung.com>
Cc: "linux-arch@vger.kernel.org", Abhinav Kochhar, Russell King - ARM Linux,
 Arnd Bergmann, Konrad Rzeszutek Wilk, Benjamin Herrenschmidt,
 "linux-kernel@vger.kernel.org", Subash Patel, "linaro-mm-sig@lists.linaro.org",
 "linux-mm@kvack.org", Kyungmin Park, "linux-arm-kernel@lists.infradead.org"
Subject: Re: [Linaro-mm-sig] [PATCH/RFC 0/2] ARM: DMA-mapping: new extensions
 for buffer sharing (part 2)
List-Id: "Unified memory management interest group."

Hi Marek,

On Wed, 6 Jun 2012 15:17:35 +0200
Marek Szyprowski wrote:

> Hello,
>
> This is a continuation of the dma-mapping extensions posted in the
> following thread:
> http://thread.gmane.org/gmane.linux.kernel.mm/78644
>
> We noticed that some advanced buffer sharing use cases usually require
> creating a dma mapping for the same memory buffer for more than one
> device. Usually such a buffer is also never touched by the CPU, so the
> data is processed entirely by the devices.
>
> From the DMA-mapping perspective this requires calling one of the
> dma_map_{page,single,sg} functions for the given memory buffer several
> times, once for each device. Each dma_map_* call performs CPU cache
> synchronization, which can be a time-consuming operation, especially
> when the buffers are large. We would like to avoid any useless and
> time-consuming operations, which was the main reason for introducing
> another attribute for the DMA-mapping subsystem: DMA_ATTR_SKIP_CPU_SYNC,
> which lets the dma-mapping core skip CPU cache synchronization in
> certain cases.

I had implemented a similar patch (*1) to optimize/skip the cache
maintenance, but we did it with "dir" rather than "attr", making use of
the existing DMA_NONE to skip cache operations. I'm just interested in why
you chose "attr" for this purpose; could you enlighten me on why "attr" is
used here? Anyway, this feature is necessary for us. Thank you for posting
them.

*1: FYI:

From 4656146d23d0a3bd02131f732b0c04e50475b8da Mon Sep 17 00:00:00 2001
From: Hiroshi DOYU
Date: Tue, 20 Mar 2012 15:09:30 +0200
Subject: [PATCH 1/1] ARM: dma-mapping: Allow DMA_NONE to skip cache_maint

Signed-off-by: Hiroshi DOYU
---
 arch/arm/mm/dma-mapping.c                |   16 ++++++++--------
 drivers/video/tegra/nvmap/nvmap.c        |    2 +-
 drivers/video/tegra/nvmap/nvmap_handle.c |    2 +-
 include/linux/dma-mapping.h              |   16 +++++++++++++---
 4 files changed, 23 insertions(+), 13 deletions(-)

diff --git a/arch/arm/mm/dma-mapping.c b/arch/arm/mm/dma-mapping.c
index 83f0ac6..c4b1587 100644
--- a/arch/arm/mm/dma-mapping.c
+++ b/arch/arm/mm/dma-mapping.c
@@ -1161,7 +1161,7 @@ static int __map_sg_chunk(struct device *dev, struct scatterlist *sg,
 		phys_addr_t phys = page_to_phys(sg_page(s));
 		unsigned int len = PAGE_ALIGN(s->offset + s->length);
 
-		if (!arch_is_coherent())
+		if (!arch_is_coherent() && (dir != DMA_NONE))
 			__dma_page_cpu_to_dev(sg_page(s), s->offset, s->length, dir);
 
 		ret = iommu_map(mapping->domain, iova, phys, len, 0);
@@ -1254,7 +1254,7 @@ void arm_iommu_unmap_sg(struct device *dev, struct scatterlist *sg, int nents,
 		if (sg_dma_len(s))
 			__iommu_remove_mapping(dev, sg_dma_address(s), sg_dma_len(s));
-		if (!arch_is_coherent())
+		if (!arch_is_coherent() && (dir != DMA_NONE))
 			__dma_page_dev_to_cpu(sg_page(s), s->offset, s->length, dir);
 	}
@@ -1274,7 +1274,7 @@ void arm_iommu_sync_sg_for_cpu(struct device *dev, struct scatterlist *sg,
 	int i;
 
 	for_each_sg(sg, s, nents, i)
-		if (!arch_is_coherent())
+		if (!arch_is_coherent() && (dir != DMA_NONE))
 			__dma_page_dev_to_cpu(sg_page(s), s->offset, s->length, dir);
 }
@@ -1293,7 +1293,7 @@ void arm_iommu_sync_sg_for_device(struct device *dev, struct scatterlist *sg,
 	int i;
 
 	for_each_sg(sg, s, nents, i)
-		if (!arch_is_coherent())
+		if (!arch_is_coherent() && (dir != DMA_NONE))
 			__dma_page_cpu_to_dev(sg_page(s), s->offset, s->length, dir);
 }
@@ -1305,7 +1305,7 @@ static dma_addr_t __arm_iommu_map_page_at(struct device *dev, struct page *page,
 	dma_addr_t dma_addr;
 	int ret, len = PAGE_ALIGN(size + offset);
 
-	if (!arch_is_coherent())
+	if (!arch_is_coherent() && (dir != DMA_NONE))
 		__dma_page_cpu_to_dev(page, offset, size, dir);
 
 	dma_addr = __alloc_iova_at(mapping, req, len);
@@ -1349,7 +1349,7 @@ dma_addr_t arm_iommu_map_page_at(struct device *dev, struct page *page,
 	unsigned int phys;
 	int ret;
 
-	if (!arch_is_coherent())
+	if (!arch_is_coherent() && (dir != DMA_NONE))
 		__dma_page_cpu_to_dev(page, offset, size, dir);
 
 	/* Check if iova area is reserved in advance.
 */
@@ -1386,7 +1386,7 @@ static void __arm_iommu_unmap_page_at(struct device *dev, dma_addr_t handle,
 	if (!iova)
 		return;
 
-	if (!arch_is_coherent())
+	if (!arch_is_coherent() && (dir != DMA_NONE))
 		__dma_page_dev_to_cpu(page, offset, size, dir);
 
 	iommu_unmap(mapping->domain, iova, len);
@@ -1430,7 +1430,7 @@ static void arm_iommu_sync_single_for_cpu(struct device *dev,
 	if (!iova)
 		return;
 
-	if (!arch_is_coherent())
+	if (!arch_is_coherent() && (dir != DMA_NONE))
 		__dma_page_dev_to_cpu(page, offset, size, dir);
 }
diff --git a/drivers/video/tegra/nvmap/nvmap.c b/drivers/video/tegra/nvmap/nvmap.c
index 1032224..e98dd11 100644
--- a/drivers/video/tegra/nvmap/nvmap.c
+++ b/drivers/video/tegra/nvmap/nvmap.c
@@ -56,7 +56,7 @@ static void map_iovmm_area(struct nvmap_handle *h)
 		BUG_ON(!pfn_valid(page_to_pfn(h->pgalloc.pages[i])));
 		iova = dma_map_page_at(to_iovmm_dev(h), h->pgalloc.pages[i],
-				       va, 0, PAGE_SIZE, DMA_BIDIRECTIONAL);
+				       va, 0, PAGE_SIZE, DMA_NONE);
 		BUG_ON(iova != va);
 	}
 	h->pgalloc.dirty = false;
diff --git a/drivers/video/tegra/nvmap/nvmap_handle.c b/drivers/video/tegra/nvmap/nvmap_handle.c
index 853f87e..b2bbeb1 100644
--- a/drivers/video/tegra/nvmap/nvmap_handle.c
+++ b/drivers/video/tegra/nvmap/nvmap_handle.c
@@ -504,7 +504,7 @@ void nvmap_free_vm(struct device *dev, struct tegra_iovmm_area *area)
 		dma_addr_t iova;
 
 		iova = area->iovm_start + i * PAGE_SIZE;
-		dma_unmap_page(dev, iova, PAGE_SIZE, DMA_BIDIRECTIONAL);
+		dma_unmap_page(dev, iova, PAGE_SIZE, DMA_NONE);
 	}
 	kfree(area);
 }
diff --git a/include/linux/dma-mapping.h b/include/linux/dma-mapping.h
index 36dfe06..cbd8d47 100644
--- a/include/linux/dma-mapping.h
+++ b/include/linux/dma-mapping.h
@@ -55,9 +55,19 @@ struct dma_map_ops {
 
 static inline int valid_dma_direction(int dma_direction)
 {
-	return ((dma_direction == DMA_BIDIRECTIONAL) ||
-		(dma_direction == DMA_TO_DEVICE) ||
-		(dma_direction == DMA_FROM_DEVICE));
+	int ret = 1;
+
+	switch (dma_direction) {
+	case DMA_BIDIRECTIONAL:
+	case DMA_TO_DEVICE:
+	case DMA_FROM_DEVICE:
+	case DMA_NONE:
+		break;
+	default:
+		ret = 0;
+		break;
+	}
+	return ret;
 }
 
 static inline int is_device_dma_capable(struct device *dev)
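
For comparison, a minimal sketch of how a driver might map one buffer for two
devices under the DMA_ATTR_SKIP_CPU_SYNC approach discussed above. It assumes
the dma_map_sg_attrs()/dma_set_attr() interface built on struct dma_attrs from
that era, plus the attribute proposed in Marek's series; the function and
variable names are hypothetical illustrations, not code from either patch set.

#include <linux/device.h>
#include <linux/dma-attrs.h>
#include <linux/dma-mapping.h>
#include <linux/errno.h>
#include <linux/scatterlist.h>

/*
 * Hypothetical illustration: map the same scatterlist for two devices while
 * paying the CPU cache maintenance cost only once.  The second call skips
 * the redundant cache synchronization via the proposed attribute.
 */
static int map_buffer_for_two_devices(struct device *dev1, struct device *dev2,
				      struct sg_table *sgt)
{
	DEFINE_DMA_ATTRS(attrs);

	/* First mapping: a normal call that performs cache synchronization. */
	if (!dma_map_sg(dev1, sgt->sgl, sgt->orig_nents, DMA_BIDIRECTIONAL))
		return -ENOMEM;

	/* Second mapping of the same, CPU-untouched buffer: skip cache work. */
	dma_set_attr(DMA_ATTR_SKIP_CPU_SYNC, &attrs);
	if (!dma_map_sg_attrs(dev2, sgt->sgl, sgt->orig_nents,
			      DMA_BIDIRECTIONAL, &attrs)) {
		dma_unmap_sg(dev1, sgt->sgl, sgt->orig_nents, DMA_BIDIRECTIONAL);
		return -ENOMEM;
	}

	return 0;
}

Under the DMA_NONE approach in the patch above, the second mapping would
instead pass DMA_NONE as the direction, which the modified arm_iommu_* paths
treat as "skip cache maintenance".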