From patchwork Fri Jan 23 10:02:40 2015
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Christoffer Dall
X-Patchwork-Id: 43582
From: Christoffer Dall
To: Paolo Bonzini, kvmarm@lists.cs.columbia.edu,
 linux-arm-kernel@lists.infradead.org
Cc: Marc Zyngier, kvm@vger.kernel.org, Mario Smarduch
Subject: [GIT PULL 11/36] KVM: arm: page logging 2nd stage fault handling
Date: Fri, 23 Jan 2015 11:02:40 +0100
Message-Id: <1422007385-14730-12-git-send-email-christoffer.dall@linaro.org>
In-Reply-To: <1422007385-14730-1-git-send-email-christoffer.dall@linaro.org>
References: <1422007385-14730-1-git-send-email-christoffer.dall@linaro.org>

From: Mario Smarduch

This patch adds support for 2nd stage page fault handling while dirty page
logging is active. On huge page faults, huge pages are dissolved to normal
pages, and rebuilding of 2nd stage huge pages is blocked. If migration is
canceled, this restriction is removed and huge pages may be rebuilt again.

Signed-off-by: Mario Smarduch
Reviewed-by: Christoffer Dall
---
 arch/arm/kvm/mmu.c | 97 +++++++++++++++++++++++++++++++++++++++++++++++++-----
 1 file changed, 88 insertions(+), 9 deletions(-)

diff --git a/arch/arm/kvm/mmu.c b/arch/arm/kvm/mmu.c
index 565e903..6685c68 100644
--- a/arch/arm/kvm/mmu.c
+++ b/arch/arm/kvm/mmu.c
@@ -47,6 +47,18 @@ static phys_addr_t hyp_idmap_vector;
 #define kvm_pmd_huge(_x)	(pmd_huge(_x) || pmd_trans_huge(_x))
 #define kvm_pud_huge(_x)	pud_huge(_x)
 
+#define KVM_S2PTE_FLAG_IS_IOMAP		(1UL << 0)
+#define KVM_S2_FLAG_LOGGING_ACTIVE	(1UL << 1)
+
+static bool memslot_is_logging(struct kvm_memory_slot *memslot)
+{
+#ifdef CONFIG_ARM
+	return memslot->dirty_bitmap && !(memslot->flags & KVM_MEM_READONLY);
+#else
+	return false;
+#endif
+}
+
 static void kvm_tlb_flush_vmid_ipa(struct kvm *kvm, phys_addr_t ipa)
 {
 	/*
@@ -59,6 +71,25 @@ static void kvm_tlb_flush_vmid_ipa(struct kvm *kvm, phys_addr_t ipa)
 	kvm_call_hyp(__kvm_tlb_flush_vmid_ipa, kvm, ipa);
 }
 
+/**
+ * stage2_dissolve_pmd() - clear and flush huge PMD entry
+ * @kvm:	pointer to kvm structure.
+ * @addr:	IPA
+ * @pmd:	pmd pointer for IPA
+ *
+ * Function clears a PMD entry, flushes addr 1st and 2nd stage TLBs. Marks all
+ * pages in the range dirty.
+ */
+static void stage2_dissolve_pmd(struct kvm *kvm, phys_addr_t addr, pmd_t *pmd)
+{
+	if (!kvm_pmd_huge(*pmd))
+		return;
+
+	pmd_clear(pmd);
+	kvm_tlb_flush_vmid_ipa(kvm, addr);
+	put_page(virt_to_page(pmd));
+}
+
 static int mmu_topup_memory_cache(struct kvm_mmu_memory_cache *cache,
 				  int min, int max)
 {
@@ -768,10 +799,15 @@ static int stage2_set_pmd_huge(struct kvm *kvm, struct kvm_mmu_memory_cache
 }
 
 static int stage2_set_pte(struct kvm *kvm, struct kvm_mmu_memory_cache *cache,
-			  phys_addr_t addr, const pte_t *new_pte, bool iomap)
+			  phys_addr_t addr, const pte_t *new_pte,
+			  unsigned long flags)
 {
 	pmd_t *pmd;
 	pte_t *pte, old_pte;
+	bool iomap = flags & KVM_S2PTE_FLAG_IS_IOMAP;
+	bool logging_active = flags & KVM_S2_FLAG_LOGGING_ACTIVE;
+
+	VM_BUG_ON(logging_active && !cache);
 
 	/* Create stage-2 page table mapping - Levels 0 and 1 */
 	pmd = stage2_get_pmd(kvm, cache, addr);
@@ -783,6 +819,13 @@ static int stage2_set_pte(struct kvm *kvm, struct kvm_mmu_memory_cache *cache,
 		return 0;
 	}
 
+	/*
+	 * While dirty page logging - dissolve huge PMD, then continue on to
+	 * allocate page.
+	 */
+	if (logging_active)
+		stage2_dissolve_pmd(kvm, addr, pmd);
+
 	/* Create stage-2 page mappings - Level 2 */
 	if (pmd_none(*pmd)) {
 		if (!cache)
@@ -839,7 +882,8 @@ int kvm_phys_addr_ioremap(struct kvm *kvm, phys_addr_t guest_ipa,
 		if (ret)
 			goto out;
 		spin_lock(&kvm->mmu_lock);
-		ret = stage2_set_pte(kvm, &cache, addr, &pte, true);
+		ret = stage2_set_pte(kvm, &cache, addr, &pte,
+						KVM_S2PTE_FLAG_IS_IOMAP);
 		spin_unlock(&kvm->mmu_lock);
 		if (ret)
 			goto out;
@@ -1067,6 +1111,8 @@ static int user_mem_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa,
 	pfn_t pfn;
 	pgprot_t mem_type = PAGE_S2;
 	bool fault_ipa_uncached;
+	bool logging_active = memslot_is_logging(memslot);
+	unsigned long flags = 0;
 
 	write_fault = kvm_is_write_fault(vcpu);
 	if (fault_status == FSC_PERM && !write_fault) {
@@ -1083,7 +1129,7 @@ static int user_mem_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa,
 		return -EFAULT;
 	}
 
-	if (is_vm_hugetlb_page(vma)) {
+	if (is_vm_hugetlb_page(vma) && !logging_active) {
 		hugetlb = true;
 		gfn = (fault_ipa & PMD_MASK) >> PAGE_SHIFT;
 	} else {
@@ -1124,12 +1170,30 @@ static int user_mem_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa,
 	if (is_error_pfn(pfn))
 		return -EFAULT;
 
-	if (kvm_is_device_pfn(pfn))
+	if (kvm_is_device_pfn(pfn)) {
 		mem_type = PAGE_S2_DEVICE;
+		flags |= KVM_S2PTE_FLAG_IS_IOMAP;
+	} else if (logging_active) {
+		/*
+		 * Faults on pages in a memslot with logging enabled
+		 * should not be mapped with huge pages (it introduces churn
+		 * and performance degradation), so force a pte mapping.
+		 */
+		force_pte = true;
+		flags |= KVM_S2_FLAG_LOGGING_ACTIVE;
+
+		/*
+		 * Only actually map the page as writable if this was a write
+		 * fault.
+		 */
+		if (!write_fault)
+			writable = false;
+	}
 
 	spin_lock(&kvm->mmu_lock);
 	if (mmu_notifier_retry(kvm, mmu_seq))
 		goto out_unlock;
+
 	if (!hugetlb && !force_pte)
 		hugetlb = transparent_hugepage_adjust(&pfn, &fault_ipa);
 
@@ -1147,17 +1211,17 @@ static int user_mem_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa,
 		ret = stage2_set_pmd_huge(kvm, memcache, fault_ipa, &new_pmd);
 	} else {
 		pte_t new_pte = pfn_pte(pfn, mem_type);
+
 		if (writable) {
 			kvm_set_s2pte_writable(&new_pte);
 			kvm_set_pfn_dirty(pfn);
+			mark_page_dirty(kvm, gfn);
 		}
 		coherent_cache_guest_page(vcpu, hva, PAGE_SIZE,
 					  fault_ipa_uncached);
-		ret = stage2_set_pte(kvm, memcache, fault_ipa, &new_pte,
-			pgprot_val(mem_type) == pgprot_val(PAGE_S2_DEVICE));
+		ret = stage2_set_pte(kvm, memcache, fault_ipa, &new_pte, flags);
 	}
 
-
 out_unlock:
 	spin_unlock(&kvm->mmu_lock);
 	kvm_release_pfn_clean(pfn);
@@ -1307,7 +1371,14 @@ static void kvm_set_spte_handler(struct kvm *kvm, gpa_t gpa, void *data)
 {
 	pte_t *pte = (pte_t *)data;
 
-	stage2_set_pte(kvm, NULL, gpa, pte, false);
+	/*
+	 * We can always call stage2_set_pte with KVM_S2_FLAG_LOGGING_ACTIVE
+	 * flag clear because MMU notifiers will have unmapped a huge PMD before
+	 * calling ->change_pte() (which in turn calls kvm_set_spte_hva()) and
+	 * therefore stage2_set_pte() never needs to clear out a huge PMD
+	 * through this calling path.
+	 */
+	stage2_set_pte(kvm, NULL, gpa, pte, 0);
 }
 
 
@@ -1461,7 +1532,8 @@ int kvm_arch_prepare_memory_region(struct kvm *kvm,
 	bool writable = !(mem->flags & KVM_MEM_READONLY);
 	int ret = 0;
 
-	if (change != KVM_MR_CREATE && change != KVM_MR_MOVE)
+	if (change != KVM_MR_CREATE && change != KVM_MR_MOVE &&
+			change != KVM_MR_FLAGS_ONLY)
 		return 0;
 
 	/*
@@ -1512,6 +1584,10 @@ int kvm_arch_prepare_memory_region(struct kvm *kvm,
 			phys_addr_t pa = (vma->vm_pgoff << PAGE_SHIFT) +
 						 vm_start - vma->vm_start;
 
+			/* IO region dirty page logging not allowed */
+			if (memslot->flags & KVM_MEM_LOG_DIRTY_PAGES)
+				return -EINVAL;
+
 			ret = kvm_phys_addr_ioremap(kvm, gpa, pa,
 						    vm_end - vm_start,
 						    writable);
@@ -1521,6 +1597,9 @@ int kvm_arch_prepare_memory_region(struct kvm *kvm,
 		hva = vm_end;
 	} while (hva < reg_end);
 
+	if (change == KVM_MR_FLAGS_ONLY)
+		return ret;
+
 	spin_lock(&kvm->mmu_lock);
 	if (ret)
 		unmap_stage2_range(kvm, mem->guest_phys_addr, mem->memory_size);
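
For readers following the series, the logging path through stage2_set_pte()
can be modelled in plain userspace C. This is only an illustrative sketch,
not kernel code: struct toy_pmd and the toy_* functions are hypothetical
stand-ins, and only the two flag values are taken from the patch. It shows
that a huge entry is dissolved (cleared and flushed) only when the logging
flag is set, after which a page-level table can be installed in its place.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Flag values copied from the patch. */
#define KVM_S2PTE_FLAG_IS_IOMAP		(1UL << 0)
#define KVM_S2_FLAG_LOGGING_ACTIVE	(1UL << 1)

/* Hypothetical stand-in for a stage-2 PMD entry; not a kernel type. */
enum toy_pmd_state { TOY_PMD_NONE, TOY_PMD_TABLE, TOY_PMD_HUGE };

struct toy_pmd {
	enum toy_pmd_state state;
	unsigned int tlb_flushes;	/* counts simulated TLB invalidations */
};

/* Models stage2_dissolve_pmd(): only a huge entry is cleared and flushed. */
static void toy_dissolve_pmd(struct toy_pmd *pmd)
{
	if (pmd->state != TOY_PMD_HUGE)
		return;

	pmd->state = TOY_PMD_NONE;	/* pmd_clear() in the patch */
	pmd->tlb_flushes++;		/* kvm_tlb_flush_vmid_ipa() in the patch */
}

/*
 * Models the flag handling at the top of stage2_set_pte().
 * have_cache stands for a non-NULL memory cache. Returns true when a
 * page-level (pte) mapping can be installed under this entry.
 */
static bool toy_set_pte(struct toy_pmd *pmd, bool have_cache,
			unsigned long flags)
{
	bool logging_active = flags & KVM_S2_FLAG_LOGGING_ACTIVE;

	/* The patch checks this invariant with VM_BUG_ON(). */
	assert(!(logging_active && !have_cache));

	if (logging_active)
		toy_dissolve_pmd(pmd);

	if (pmd->state == TOY_PMD_NONE) {
		if (!have_cache)
			return false;	/* no cache to allocate a table from */
		pmd->state = TOY_PMD_TABLE;
	}

	/* In this model a huge entry left in place takes no pte below it. */
	return pmd->state == TOY_PMD_TABLE;
}
```

Calling toy_set_pte() on a huge entry with KVM_S2_FLAG_LOGGING_ACTIVE set
replaces it with a table and records one flush. With flags == 0, as in the
kvm_set_spte_handler() path, the entry is left untouched.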