From patchwork Tue Sep 19 16:31:53 2017
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Robin Murphy
X-Patchwork-Id: 113045
From: Robin Murphy
To: joro@8bytes.org
Cc: iommu@lists.linux-foundation.org, thunder.leizhen@huawei.com,
 nwatters@codeaurora.org, tomasz.nowicki@caviumnetworks.com,
 linux-kernel@vger.kernel.org
Subject: [PATCH v4 2/6] iommu/iova: Optimise the padding calculation
Date: Tue, 19 Sep 2017 17:31:53 +0100
Message-Id: <728494c4a85091828347685fece707968f522cd1.1505829018.git.robin.murphy@arm.com>
X-Mailer: git-send-email 2.13.4.dirty
In-Reply-To:
References:
Sender: linux-kernel-owner@vger.kernel.org
Precedence: bulk
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org

From: Zhen Lei

The mask for calculating the padding size doesn't change, so there's no
need to recalculate it every loop iteration.
Furthermore, once we've done that, it becomes clear that we don't
actually need to calculate a padding size at all - by flipping the
arithmetic around, we can just combine the upper limit, size, and mask
directly to check against the lower limit.

For an arm64 build, this alone knocks 20% off the object code size of
the entire alloc_iova() function!

Signed-off-by: Zhen Lei
Tested-by: Ard Biesheuvel
Tested-by: Zhen Lei
Tested-by: Nate Watterson
[rm: simplified more of the arithmetic, rewrote commit message]
Signed-off-by: Robin Murphy
---

v4:
 - Round align_mask up instead of down (oops!)
 - Remove redundant !curr check
 - Introduce new_pfn variable here to reduce churn in later patches

 drivers/iommu/iova.c | 42 +++++++++++++++---------------------------
 1 file changed, 15 insertions(+), 27 deletions(-)

-- 
2.13.4.dirty

diff --git a/drivers/iommu/iova.c b/drivers/iommu/iova.c
index f129ff4f5c89..20be9a8b3188 100644
--- a/drivers/iommu/iova.c
+++ b/drivers/iommu/iova.c
@@ -182,24 +182,17 @@ iova_insert_rbtree(struct rb_root *root, struct iova *iova,
 	rb_insert_color(&iova->node, root);
 }
 
-/*
- * Computes the padding size required, to make the start address
- * naturally aligned on the power-of-two order of its size
- */
-static unsigned int
-iova_get_pad_size(unsigned int size, unsigned int limit_pfn)
-{
-	return (limit_pfn - size) & (__roundup_pow_of_two(size) - 1);
-}
-
 static int __alloc_and_insert_iova_range(struct iova_domain *iovad,
 		unsigned long size, unsigned long limit_pfn,
 		struct iova *new, bool size_aligned)
 {
 	struct rb_node *prev, *curr = NULL;
 	unsigned long flags;
-	unsigned long saved_pfn;
-	unsigned int pad_size = 0;
+	unsigned long saved_pfn, new_pfn;
+	unsigned long align_mask = ~0UL;
+
+	if (size_aligned)
+		align_mask <<= fls_long(size - 1);
 
 	/* Walk the tree backwards */
 	spin_lock_irqsave(&iovad->iova_rbtree_lock, flags);
@@ -209,31 +202,26 @@ static int __alloc_and_insert_iova_range(struct iova_domain *iovad,
 	while (curr) {
 		struct iova *curr_iova = rb_entry(curr, struct iova, node);
 
-		if (limit_pfn <= curr_iova->pfn_lo) {
+		if (limit_pfn <= curr_iova->pfn_lo)
 			goto move_left;
-		} else if (limit_pfn > curr_iova->pfn_hi) {
-			if (size_aligned)
-				pad_size = iova_get_pad_size(size, limit_pfn);
-			if ((curr_iova->pfn_hi + size + pad_size) < limit_pfn)
-				break;	/* found a free slot */
-		}
+
+		if (((limit_pfn - size) & align_mask) > curr_iova->pfn_hi)
+			break;	/* found a free slot */
+
 		limit_pfn = curr_iova->pfn_lo;
 move_left:
 		prev = curr;
 		curr = rb_prev(curr);
 	}
 
-	if (!curr) {
-		if (size_aligned)
-			pad_size = iova_get_pad_size(size, limit_pfn);
-		if ((iovad->start_pfn + size + pad_size) > limit_pfn) {
-			spin_unlock_irqrestore(&iovad->iova_rbtree_lock, flags);
-			return -ENOMEM;
-		}
+	new_pfn = (limit_pfn - size) & align_mask;
+	if (limit_pfn < size || new_pfn < iovad->start_pfn) {
+		spin_unlock_irqrestore(&iovad->iova_rbtree_lock, flags);
+		return -ENOMEM;
 	}
 
 	/* pfn_lo will point to size aligned address if size_aligned is set */
-	new->pfn_lo = limit_pfn - (size + pad_size);
+	new->pfn_lo = new_pfn;
 	new->pfn_hi = new->pfn_lo + size - 1;
 
 	/* If we have 'prev', it's a valid place to start the insertion. */
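
For anyone wanting to convince themselves that the mask form matches the
old pad_size form, here is a rough userspace sketch, not kernel code. It
assumes size_aligned is set and limit_pfn >= size (the patch rejects the
underflow case separately with -ENOMEM), and it uses unsigned long
throughout. fls_long_sketch(), roundup_pow2_sketch() and
count_mismatches() are hypothetical stand-ins written for this
illustration; only the two arithmetic expressions are taken from the
diff. As a worked case, size = 3 and limit_pfn = 10 give pfn_lo = 4
under both schemes.

```c
#include <assert.h>
#include <stdbool.h>

/* Stand-in for the kernel's fls_long(): 1-based index of the highest
 * set bit, or 0 when x == 0. */
static unsigned int fls_long_sketch(unsigned long x)
{
	unsigned int bits = 0;

	while (x) {
		bits++;
		x >>= 1;
	}
	return bits;
}

/* Smallest power of two >= n, deliberately computed without fls so the
 * comparison below is not circular. */
static unsigned long roundup_pow2_sketch(unsigned long n)
{
	unsigned long p = 1;

	while (p < n)
		p <<= 1;
	return p;
}

/* Mirrors the removed iova_get_pad_size(), widened to unsigned long. */
static unsigned long old_pad_size(unsigned long size, unsigned long limit_pfn)
{
	return (limit_pfn - size) & (roundup_pow2_sketch(size) - 1);
}

/* Old scheme: subtract size plus padding from the upper limit. */
static unsigned long old_pfn_lo(unsigned long size, unsigned long limit_pfn)
{
	assert(size >= 1);
	return limit_pfn - (size + old_pad_size(size, limit_pfn));
}

/* New scheme: mask (limit_pfn - size) down to the aligned boundary.
 * Only meant for small sizes; a shift count of 64 would be undefined. */
static unsigned long new_pfn_lo(unsigned long size, unsigned long limit_pfn)
{
	unsigned long align_mask = ~0UL << fls_long_sketch(size - 1);

	assert(size >= 1);
	return (limit_pfn - size) & align_mask;
}

/* Old free-slot test from the loop body, including its enclosing branch. */
static bool old_fits(unsigned long pfn_hi, unsigned long size,
		     unsigned long limit_pfn)
{
	return limit_pfn > pfn_hi &&
	       (pfn_hi + size + old_pad_size(size, limit_pfn)) < limit_pfn;
}

/* New free-slot test: compare the candidate start against pfn_hi. */
static bool new_fits(unsigned long pfn_hi, unsigned long size,
		     unsigned long limit_pfn)
{
	return new_pfn_lo(size, limit_pfn) > pfn_hi;
}

/* Exhaustively compare both schemes over a small range; returns the
 * number of (size, limit_pfn, pfn_hi) combinations that disagree. */
unsigned long count_mismatches(unsigned long max_size, unsigned long max_limit)
{
	unsigned long size, limit_pfn, pfn_hi, bad = 0;

	for (size = 1; size <= max_size; size++) {
		for (limit_pfn = size; limit_pfn <= max_limit; limit_pfn++) {
			if (old_pfn_lo(size, limit_pfn) !=
			    new_pfn_lo(size, limit_pfn))
				bad++;
			for (pfn_hi = 0; pfn_hi <= max_limit; pfn_hi++)
				if (old_fits(pfn_hi, size, limit_pfn) !=
				    new_fits(pfn_hi, size, limit_pfn))
					bad++;
		}
	}
	return bad;
}
```

Calling count_mismatches(16, 256) from a small test harness should
return 0. The limit_pfn < size case is left out on purpose: there the
subtraction wraps, which is why the patch adds the explicit
"limit_pfn < size" check before returning -ENOMEM.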