From patchwork Fri Aug 24 15:52:35 2018
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Will Deacon
X-Patchwork-Id: 145082
From: Will Deacon
To: linux-kernel@vger.kernel.org
Cc: peterz@infradead.org, benh@au1.ibm.com, torvalds@linux-foundation.org,
 npiggin@gmail.com, catalin.marinas@arm.com,
 linux-arm-kernel@lists.infradead.org, Will Deacon
Subject: [RFC PATCH 00/11] Avoid synchronous TLB invalidation for
 intermediate page-table entries on arm64
Date: Fri, 24 Aug 2018 16:52:35 +0100
Message-Id: <1535125966-7666-1-git-send-email-will.deacon@arm.com>
X-Mailer: git-send-email 2.1.4
Sender: linux-kernel-owner@vger.kernel.org
Precedence: bulk
X-Mailing-List: linux-kernel@vger.kernel.org

Hi all,

I hacked up this RFC on
the back of the recent changes to the mmu_gather stuff in mainline. It's
had a bit of testing and it looks pretty good so far.

The main changes in the series are:

  - Avoid emitting a DSB barrier after clearing each page-table entry.
    Instead, we can have a single DSB prior to the actual TLB invalidation.

  - Batch last-level TLB invalidation until the end of the VMA, and use
    last-level-only invalidation instructions

  - Batch intermediate TLB invalidation until the end of the gather, and
    use all-level invalidation instructions

  - Adjust the stride of TLB invalidation based upon the smallest unflushed
    granule in the gather

As a really stupid benchmark, unmapping a populated mapping of
0x4_3333_3000 bytes using munmap() takes around 20% of the time it took
before.

The core changes now track the levels of page-table that have been visited
by the mmu_gather since the last flush. It may be possible to use the
page_size field instead if we decide to resurrect that from its current
"debug" status, but I think I'd prefer to track the levels explicitly.

Anyway, I wanted to post this before disappearing for the long weekend
(Monday is a holiday in the UK) in the hope that it helps some of the
ongoing discussions.
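
To make the level tracking and stride selection a bit more concrete, here is a rough user-space sketch of the idea. It is not code from the series: struct toy_gather, the toy_*() helpers and the shift constants (a 4K granule with 2M and 1G block entries) are all made up for illustration, and the "last-level only unless a table was freed" rule is my reading of the bullets above rather than the actual arm64 implementation.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed layout: 4K pages, 2M and 1G block entries (illustrative only). */
enum { TOY_PTE_SHIFT = 12, TOY_PMD_SHIFT = 21, TOY_PUD_SHIFT = 30 };

/* Hypothetical stand-in for the per-gather state; not struct mmu_gather. */
struct toy_gather {
	bool cleared_ptes;	/* a last-level entry was cleared */
	bool cleared_pmds;	/* a PMD-level entry was cleared */
	bool cleared_puds;	/* a PUD-level entry was cleared */
	bool freed_tables;	/* an intermediate table page was freed */
};

/* Forget everything recorded so far, as would happen after a flush. */
static inline void toy_gather_reset(struct toy_gather *g)
{
	*g = (struct toy_gather){ 0 };
}

static inline void toy_clear_pte(struct toy_gather *g) { g->cleared_ptes = true; }
static inline void toy_clear_pmd(struct toy_gather *g) { g->cleared_pmds = true; }
static inline void toy_clear_pud(struct toy_gather *g) { g->cleared_puds = true; }
static inline void toy_free_table(struct toy_gather *g) { g->freed_tables = true; }

/* Stride is the smallest granule with unflushed entries in the gather. */
static inline unsigned int toy_unmap_shift(const struct toy_gather *g)
{
	if (g->cleared_ptes)
		return TOY_PTE_SHIFT;
	if (g->cleared_pmds)
		return TOY_PMD_SHIFT;
	if (g->cleared_puds)
		return TOY_PUD_SHIFT;
	return TOY_PTE_SHIFT;	/* nothing recorded: assume the smallest */
}

/* Last-level-only invalidation suffices while no table has been freed. */
static inline bool toy_last_level_only(const struct toy_gather *g)
{
	return !g->freed_tables;
}

/* Number of per-address invalidations needed to cover [start, end). */
static inline uint64_t toy_nr_invalidations(const struct toy_gather *g,
					    uint64_t start, uint64_t end)
{
	uint64_t stride = UINT64_C(1) << toy_unmap_shift(g);

	assert(start <= end);
	return (end - start + stride - 1) / stride;
}
```

With only 2M entries recorded, covering a 4M range takes two invalidations in this model. As soon as a single 4K entry has been cleared in the same gather, the stride drops to 4K and the same range takes 1024.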

Cheers,

Will

--->8

Peter Zijlstra (1):
  asm-generic/tlb: Track freeing of page-table directories in struct
    mmu_gather

Will Deacon (10):
  arm64: tlb: Use last-level invalidation in flush_tlb_kernel_range()
  arm64: tlb: Add DSB ISHST prior to TLBI in
    __flush_tlb_[kernel_]pgtable()
  arm64: pgtable: Implement p[mu]d_valid() and check in set_p[mu]d()
  arm64: tlb: Justify non-leaf invalidation in flush_tlb_range()
  arm64: tlbflush: Allow stride to be specified for __flush_tlb_range()
  arm64: tlb: Remove redundant !CONFIG_HAVE_RCU_TABLE_FREE code
  asm-generic/tlb: Guard with #ifdef CONFIG_MMU
  asm-generic/tlb: Track which levels of the page tables have been
    cleared
  arm64: tlb: Adjust stride and type of TLBI according to mmu_gather
  arm64: tlb: Avoid synchronous TLBIs when freeing page tables

 arch/arm64/Kconfig                |  1 +
 arch/arm64/include/asm/pgtable.h  | 10 ++++-
 arch/arm64/include/asm/tlb.h      | 34 +++++++----------
 arch/arm64/include/asm/tlbflush.h | 28 +++++++-------
 include/asm-generic/tlb.h         | 79 +++++++++++++++++++++++++++++++++------
 mm/memory.c                       |  4 +-
 6 files changed, 105 insertions(+), 51 deletions(-)

--
2.1.4