From patchwork Tue Apr 11 14:25:07 2023
X-Patchwork-Submitter: David Hildenbrand
X-Patchwork-Id: 672551
From: David Hildenbrand
To: linux-kernel@vger.kernel.org
Cc: linux-mm@kvack.org, linux-kselftest@vger.kernel.org,
 sparclinux@vger.kernel.org, David Hildenbrand, Andrew Morton,
 "David S. Miller", Peter Xu, Hugh Dickins, Shuah Khan, Sam Ravnborg,
 Yu Zhao, Anshuman Khandual
Subject: [PATCH v1 RESEND 1/6] selftests/mm: reuse read_pmd_pagesize() in
 COW selftest
Date: Tue, 11 Apr 2023 16:25:07 +0200
Message-Id: <20230411142512.438404-2-david@redhat.com>
In-Reply-To: <20230411142512.438404-1-david@redhat.com>
References: <20230411142512.438404-1-david@redhat.com>
X-Mailing-List: linux-kselftest@vger.kernel.org

The COW selftest can deal with THP not being configured. So move error
handling of read_pmd_pagesize() into the callers such that we can reuse
it in the COW selftest.
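For illustration (a sketch, not part of the diff below), the caller-side
pattern after this change: read_pmd_pagesize() now returns 0 instead of
exiting, so optional users proceed without THP, while tests that strictly
require THP fail explicitly:

	size_t thpsize = read_pmd_pagesize();	/* 0 if THP is not configured */

	if (!thpsize)
		ksft_exit_fail_msg("Reading PMD pagesize failed\n");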
Signed-off-by: David Hildenbrand
---
 tools/testing/selftests/mm/cow.c             | 33 +++----------
 tools/testing/selftests/mm/khugepaged.c      |  4 +++
 tools/testing/selftests/mm/soft-dirty.c      |  3 ++
 .../selftests/mm/split_huge_page_test.c      |  4 +++
 tools/testing/selftests/mm/vm_util.c         |  4 +--
 5 files changed, 17 insertions(+), 31 deletions(-)

diff --git a/tools/testing/selftests/mm/cow.c b/tools/testing/selftests/mm/cow.c
index 0eb2e8180aa5..dc9d6fe86028 100644
--- a/tools/testing/selftests/mm/cow.c
+++ b/tools/testing/selftests/mm/cow.c
@@ -45,34 +45,6 @@ static size_t hugetlbsizes[10];
 static int gup_fd;
 static bool has_huge_zeropage;
 
-static void detect_thpsize(void)
-{
-	int fd = open("/sys/kernel/mm/transparent_hugepage/hpage_pmd_size",
-		      O_RDONLY);
-	size_t size = 0;
-	char buf[15];
-	int ret;
-
-	if (fd < 0)
-		return;
-
-	ret = pread(fd, buf, sizeof(buf), 0);
-	if (ret > 0 && ret < sizeof(buf)) {
-		buf[ret] = 0;
-
-		size = strtoul(buf, NULL, 10);
-		if (size < pagesize)
-			size = 0;
-		if (size > 0) {
-			thpsize = size;
-			ksft_print_msg("[INFO] detected THP size: %zu KiB\n",
-				       thpsize / 1024);
-		}
-	}
-
-	close(fd);
-}
-
 static void detect_huge_zeropage(void)
 {
 	int fd = open("/sys/kernel/mm/transparent_hugepage/use_zero_page",
@@ -1741,7 +1713,10 @@ int main(int argc, char **argv)
 	int err;
 
 	pagesize = getpagesize();
-	detect_thpsize();
+	thpsize = read_pmd_pagesize();
+	if (thpsize)
+		ksft_print_msg("[INFO] detected THP size: %zu KiB\n",
+			       thpsize / 1024);
 	detect_hugetlbsizes();
 	detect_huge_zeropage();
 
diff --git a/tools/testing/selftests/mm/khugepaged.c b/tools/testing/selftests/mm/khugepaged.c
index 64126c8cd561..97adc0f34f9c 100644
--- a/tools/testing/selftests/mm/khugepaged.c
+++ b/tools/testing/selftests/mm/khugepaged.c
@@ -1476,6 +1476,10 @@ int main(int argc, const char **argv)
 
 	page_size = getpagesize();
 	hpage_pmd_size = read_pmd_pagesize();
+	if (!hpage_pmd_size) {
+		printf("Reading PMD pagesize failed");
+		exit(EXIT_FAILURE);
+	}
 	hpage_pmd_nr = hpage_pmd_size / page_size;
 
 	default_settings.khugepaged.max_ptes_none = hpage_pmd_nr - 1;
diff --git a/tools/testing/selftests/mm/soft-dirty.c b/tools/testing/selftests/mm/soft-dirty.c
index 21d8830c5f24..cc5f144430d4 100644
--- a/tools/testing/selftests/mm/soft-dirty.c
+++ b/tools/testing/selftests/mm/soft-dirty.c
@@ -80,6 +80,9 @@ static void test_hugepage(int pagemap_fd, int pagesize)
 	int i, ret;
 	size_t hpage_len = read_pmd_pagesize();
 
+	if (!hpage_len)
+		ksft_exit_fail_msg("Reading PMD pagesize failed");
+
 	map = memalign(hpage_len, hpage_len);
 	if (!map)
 		ksft_exit_fail_msg("memalign failed\n");
diff --git a/tools/testing/selftests/mm/split_huge_page_test.c b/tools/testing/selftests/mm/split_huge_page_test.c
index 76e1c36dd9e5..1dc5804b8b2b 100644
--- a/tools/testing/selftests/mm/split_huge_page_test.c
+++ b/tools/testing/selftests/mm/split_huge_page_test.c
@@ -300,6 +300,10 @@ int main(int argc, char **argv)
 	pagesize = getpagesize();
 	pageshift = ffs(pagesize) - 1;
 	pmd_pagesize = read_pmd_pagesize();
+	if (!pmd_pagesize) {
+		printf("Reading PMD pagesize failed\n");
+		exit(EXIT_FAILURE);
+	}
 
 	split_pmd_thp();
 	split_pte_mapped_thp();
diff --git a/tools/testing/selftests/mm/vm_util.c b/tools/testing/selftests/mm/vm_util.c
index 40e795624ff3..8dc74dd022c2 100644
--- a/tools/testing/selftests/mm/vm_util.c
+++ b/tools/testing/selftests/mm/vm_util.c
@@ -84,12 +84,12 @@ uint64_t read_pmd_pagesize(void)
 
 	fd = open(PMD_SIZE_FILE_PATH, O_RDONLY);
 	if (fd == -1)
-		ksft_exit_fail_msg("Open hpage_pmd_size failed\n");
+		return 0;
 
 	num_read = read(fd, buf, 19);
 	if (num_read < 1) {
 		close(fd);
-		ksft_exit_fail_msg("Read hpage_pmd_size failed\n");
+		return 0;
 	}
 	buf[num_read] = '\0';
 	close(fd);

From patchwork Tue Apr 11 14:25:08 2023
X-Patchwork-Submitter: David Hildenbrand
X-Patchwork-Id: 672550
From: David Hildenbrand
To: linux-kernel@vger.kernel.org
Cc: linux-mm@kvack.org, linux-kselftest@vger.kernel.org,
 sparclinux@vger.kernel.org, David Hildenbrand, Andrew Morton,
 "David S. Miller", Peter Xu, Hugh Dickins, Shuah Khan, Sam Ravnborg,
 Yu Zhao, Anshuman Khandual
Subject: [PATCH v1 RESEND 2/6] selftests/mm: mkdirty: test behavior of
 (pte|pmd)_mkdirty on VMAs without write permissions
Date: Tue, 11 Apr 2023 16:25:08 +0200
Message-Id: <20230411142512.438404-3-david@redhat.com>
In-Reply-To: <20230411142512.438404-1-david@redhat.com>
References: <20230411142512.438404-1-david@redhat.com>
X-Mailing-List: linux-kselftest@vger.kernel.org

Let's add some tests that trigger (pte|pmd)_mkdirty on VMAs without
write permissions. If an architecture implementation is wrong, we might
accidentally set the PTE/PMD writable and allow for write access in a
VMA without write permissions.
The tests include reproducers for the two issues recently discovered
and worked around in core-MM for now:

(1) commit 624a2c94f5b7 ("Partly revert "mm/thp: carry over dirty
    bit when thp splits on pmd"")
(2) commit 96a9c287e25d ("mm/migrate: fix wrongly apply write bit
    after mkdirty on sparc64")

In addition, there are some other tests that reveal further issues.

All tests pass under x86_64:

	./mkdirty
	# [INFO] detected THP size: 2048 KiB
	TAP version 13
	1..6
	# [INFO] PTRACE write access
	ok 1 SIGSEGV generated, page not modified
	# [INFO] PTRACE write access to THP
	ok 2 SIGSEGV generated, page not modified
	# [INFO] Page migration
	ok 3 SIGSEGV generated, page not modified
	# [INFO] Page migration of THP
	ok 4 SIGSEGV generated, page not modified
	# [INFO] PTE-mapping a THP
	ok 5 SIGSEGV generated, page not modified
	# [INFO] UFFDIO_COPY
	ok 6 SIGSEGV generated, page not modified
	# Totals: pass:6 fail:0 xfail:0 xpass:0 skip:0 error:0

But some fail on sparc64:

	./mkdirty
	# [INFO] detected THP size: 8192 KiB
	TAP version 13
	1..6
	# [INFO] PTRACE write access
	not ok 1 SIGSEGV generated, page not modified
	# [INFO] PTRACE write access to THP
	not ok 2 SIGSEGV generated, page not modified
	# [INFO] Page migration
	ok 3 SIGSEGV generated, page not modified
	# [INFO] Page migration of THP
	ok 4 SIGSEGV generated, page not modified
	# [INFO] PTE-mapping a THP
	ok 5 SIGSEGV generated, page not modified
	# [INFO] UFFDIO_COPY
	not ok 6 SIGSEGV generated, page not modified
	Bail out! 3 out of 6 tests failed
	# Totals: pass:3 fail:3 xfail:0 xpass:0 skip:0 error:0

Reverting both above commits makes all tests fail on sparc64:

	./mkdirty
	# [INFO] detected THP size: 8192 KiB
	TAP version 13
	1..6
	# [INFO] PTRACE write access
	not ok 1 SIGSEGV generated, page not modified
	# [INFO] PTRACE write access to THP
	not ok 2 SIGSEGV generated, page not modified
	# [INFO] Page migration
	not ok 3 SIGSEGV generated, page not modified
	# [INFO] Page migration of THP
	not ok 4 SIGSEGV generated, page not modified
	# [INFO] PTE-mapping a THP
	not ok 5 SIGSEGV generated, page not modified
	# [INFO] UFFDIO_COPY
	not ok 6 SIGSEGV generated, page not modified
	Bail out! 6 out of 6 tests failed
	# Totals: pass:0 fail:6 xfail:0 xpass:0 skip:0 error:0

The tests are useful to detect other problematic archs, to verify new
arch fixes, and to stop such issues from reappearing in the future.

For now, we don't add any hugetlb tests.
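For illustration (a sketch, not part of the patch; it reuses the patch's
own names mem, orig, and env), every test funnels into the same check,
implemented by do_test_write_sigsegv() below: after the kernel had a
chance to mark the PTE/PMD dirty in a read-only VMA, a direct write must
still raise SIGSEGV and must not modify the page:

	char orig = *mem;

	if (sigsetjmp(env, 1) == 0)
		*mem = orig + 1;	/* must fault in a read-only VMA */
	/* The SIGSEGV handler siglongjmp()s back here; verify no write. */
	ksft_test_result(*mem == orig,
			 "SIGSEGV generated, page not modified\n");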
Signed-off-by: David Hildenbrand
---
 tools/testing/selftests/mm/Makefile  |   2 +
 tools/testing/selftests/mm/mkdirty.c | 379 +++++++++++++++++++++++++++
 2 files changed, 381 insertions(+)
 create mode 100644 tools/testing/selftests/mm/mkdirty.c

diff --git a/tools/testing/selftests/mm/Makefile b/tools/testing/selftests/mm/Makefile
index c31d952cff68..8235dddbbbc6 100644
--- a/tools/testing/selftests/mm/Makefile
+++ b/tools/testing/selftests/mm/Makefile
@@ -47,6 +47,7 @@ TEST_GEN_FILES += map_hugetlb
 TEST_GEN_FILES += map_populate
 TEST_GEN_FILES += memfd_secret
 TEST_GEN_FILES += migration
+TEST_GEN_PROGS += mkdirty
 TEST_GEN_FILES += mlock-random-test
 TEST_GEN_FILES += mlock2-tests
 TEST_GEN_FILES += mrelease_test
@@ -108,6 +109,7 @@ $(OUTPUT)/cow: vm_util.c
 $(OUTPUT)/khugepaged: vm_util.c
 $(OUTPUT)/ksm_functional_tests: vm_util.c
 $(OUTPUT)/madv_populate: vm_util.c
+$(OUTPUT)/mkdirty: vm_util.c
 $(OUTPUT)/soft-dirty: vm_util.c
 $(OUTPUT)/split_huge_page_test: vm_util.c
 $(OUTPUT)/userfaultfd: vm_util.c
diff --git a/tools/testing/selftests/mm/mkdirty.c b/tools/testing/selftests/mm/mkdirty.c
new file mode 100644
index 000000000000..6d71d972997b
--- /dev/null
+++ b/tools/testing/selftests/mm/mkdirty.c
@@ -0,0 +1,379 @@
+// SPDX-License-Identifier: GPL-2.0-only
+/*
+ * Test handling of code that might set PTE/PMD dirty in read-only VMAs.
+ * Setting a PTE/PMD dirty must not accidentally set the PTE/PMD writable.
+ *
+ * Copyright 2023, Red Hat, Inc.
+ *
+ * Author(s): David Hildenbrand
+ */
+#include <fcntl.h>
+#include <signal.h>
+#include <unistd.h>
+#include <string.h>
+#include <errno.h>
+#include <stdlib.h>
+#include <stdbool.h>
+#include <stdint.h>
+#include <sys/mman.h>
+#include <setjmp.h>
+#include <sys/syscall.h>
+#include <sys/ioctl.h>
+#include <linux/userfaultfd.h>
+#include <linux/mempolicy.h>
+
+#include "../kselftest.h"
+#include "vm_util.h"
+
+static size_t pagesize;
+static size_t thpsize;
+static int mem_fd;
+static int pagemap_fd;
+static sigjmp_buf env;
+
+static void signal_handler(int sig)
+{
+	if (sig == SIGSEGV)
+		siglongjmp(env, 1);
+	siglongjmp(env, 2);
+}
+
+static void do_test_write_sigsegv(char *mem)
+{
+	char orig = *mem;
+	int ret;
+
+	if (signal(SIGSEGV, signal_handler) == SIG_ERR) {
+		ksft_test_result_fail("signal() failed\n");
+		return;
+	}
+
+	ret = sigsetjmp(env, 1);
+	if (!ret)
+		*mem = orig + 1;
+
+	if (signal(SIGSEGV, SIG_DFL) == SIG_ERR)
+		ksft_test_result_fail("signal() failed\n");
+
+	ksft_test_result(ret == 1 && *mem == orig,
+			 "SIGSEGV generated, page not modified\n");
+}
+
+static char *mmap_thp_range(int prot, char **_mmap_mem, size_t *_mmap_size)
+{
+	const size_t mmap_size = 2 * thpsize;
+	char *mem, *mmap_mem;
+
+	mmap_mem = mmap(NULL, mmap_size, prot, MAP_PRIVATE|MAP_ANON,
+			-1, 0);
+	if (mmap_mem == MAP_FAILED) {
+		ksft_test_result_fail("mmap() failed\n");
+		return MAP_FAILED;
+	}
+	mem = (char *)(((uintptr_t)mmap_mem + thpsize) & ~(thpsize - 1));
+
+	if (madvise(mem, thpsize, MADV_HUGEPAGE)) {
+		ksft_test_result_skip("MADV_HUGEPAGE failed\n");
+		munmap(mmap_mem, mmap_size);
+		return MAP_FAILED;
+	}
+
+	*_mmap_mem = mmap_mem;
+	*_mmap_size = mmap_size;
+	return mem;
+}
+
+static void test_ptrace_write(void)
+{
+	char data = 1;
+	char *mem;
+	int ret;
+
+	ksft_print_msg("[INFO] PTRACE write access\n");
+
+	mem = mmap(NULL, pagesize, PROT_READ, MAP_PRIVATE|MAP_ANON, -1, 0);
+	if (mem == MAP_FAILED) {
+		ksft_test_result_fail("mmap() failed\n");
+		return;
+	}
+
+	/* Fault in the shared zeropage. */
+	if (*mem != 0) {
+		ksft_test_result_fail("Memory not zero\n");
+		goto munmap;
+	}
+
+	/*
+	 * Unshare the page (populating a fresh anon page that might be set
+	 * dirty in the PTE) in the read-only VMA using ptrace (FOLL_FORCE).
+	 */
+	lseek(mem_fd, (uintptr_t) mem, SEEK_SET);
+	ret = write(mem_fd, &data, 1);
+	if (ret != 1 || *mem != data) {
+		ksft_test_result_fail("write() failed\n");
+		goto munmap;
+	}
+
+	do_test_write_sigsegv(mem);
+munmap:
+	munmap(mem, pagesize);
+}
+
+static void test_ptrace_write_thp(void)
+{
+	char *mem, *mmap_mem;
+	size_t mmap_size;
+	char data = 1;
+	int ret;
+
+	ksft_print_msg("[INFO] PTRACE write access to THP\n");
+
+	mem = mmap_thp_range(PROT_READ, &mmap_mem, &mmap_size);
+	if (mem == MAP_FAILED)
+		return;
+
+	/*
+	 * Write to the first subpage in the read-only VMA using
+	 * ptrace(FOLL_FORCE), eventually placing a fresh THP that is marked
+	 * dirty in the PMD.
+	 */
+	lseek(mem_fd, (uintptr_t) mem, SEEK_SET);
+	ret = write(mem_fd, &data, 1);
+	if (ret != 1 || *mem != data) {
+		ksft_test_result_fail("write() failed\n");
+		goto munmap;
+	}
+
+	/* MM populated a THP if we got the last subpage populated as well. */
+	if (!pagemap_is_populated(pagemap_fd, mem + thpsize - pagesize)) {
+		ksft_test_result_skip("Did not get a THP populated\n");
+		goto munmap;
+	}
+
+	do_test_write_sigsegv(mem);
+munmap:
+	munmap(mmap_mem, mmap_size);
+}
+
+static void test_page_migration(void)
+{
+	char *mem;
+
+	ksft_print_msg("[INFO] Page migration\n");
+
+	mem = mmap(NULL, pagesize, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANON,
+		   -1, 0);
+	if (mem == MAP_FAILED) {
+		ksft_test_result_fail("mmap() failed\n");
+		return;
+	}
+
+	/* Populate a fresh page and dirty it. */
+	memset(mem, 1, pagesize);
+	if (mprotect(mem, pagesize, PROT_READ)) {
+		ksft_test_result_fail("mprotect() failed\n");
+		goto munmap;
+	}
+
+	/* Trigger page migration. Might not be available or fail. */
+	if (syscall(__NR_mbind, mem, pagesize, MPOL_LOCAL, NULL, 0x7fful,
+		    MPOL_MF_MOVE)) {
+		ksft_test_result_skip("mbind() failed\n");
+		goto munmap;
+	}
+
+	do_test_write_sigsegv(mem);
+munmap:
+	munmap(mem, pagesize);
+}
+
+static void test_page_migration_thp(void)
+{
+	char *mem, *mmap_mem;
+	size_t mmap_size;
+
+	ksft_print_msg("[INFO] Page migration of THP\n");
+
+	mem = mmap_thp_range(PROT_READ|PROT_WRITE, &mmap_mem, &mmap_size);
+	if (mem == MAP_FAILED)
+		return;
+
+	/*
+	 * Write to the first page, which might populate a fresh anon THP
+	 * and dirty it.
+	 */
+	memset(mem, 1, pagesize);
+	if (mprotect(mem, thpsize, PROT_READ)) {
+		ksft_test_result_fail("mprotect() failed\n");
+		goto munmap;
+	}
+
+	/* MM populated a THP if we got the last subpage populated as well. */
+	if (!pagemap_is_populated(pagemap_fd, mem + thpsize - pagesize)) {
+		ksft_test_result_skip("Did not get a THP populated\n");
+		goto munmap;
+	}
+
+	/* Trigger page migration. Might not be available or fail. */
+	if (syscall(__NR_mbind, mem, thpsize, MPOL_LOCAL, NULL, 0x7fful,
+		    MPOL_MF_MOVE)) {
+		ksft_test_result_skip("mbind() failed\n");
+		goto munmap;
+	}
+
+	do_test_write_sigsegv(mem);
+munmap:
+	munmap(mmap_mem, mmap_size);
+}
+
+static void test_pte_mapped_thp(void)
+{
+	char *mem, *mmap_mem;
+	size_t mmap_size;
+
+	ksft_print_msg("[INFO] PTE-mapping a THP\n");
+
+	mem = mmap_thp_range(PROT_READ|PROT_WRITE, &mmap_mem, &mmap_size);
+	if (mem == MAP_FAILED)
+		return;
+
+	/*
+	 * Write to the first page, which might populate a fresh anon THP
+	 * and dirty it.
+	 */
+	memset(mem, 1, pagesize);
+	if (mprotect(mem, thpsize, PROT_READ)) {
+		ksft_test_result_fail("mprotect() failed\n");
+		goto munmap;
+	}
+
+	/* MM populated a THP if we got the last subpage populated as well. */
+	if (!pagemap_is_populated(pagemap_fd, mem + thpsize - pagesize)) {
+		ksft_test_result_skip("Did not get a THP populated\n");
+		goto munmap;
+	}
+
+	/* Trigger PTE-mapping the THP by mprotect'ing the last subpage. */
+	if (mprotect(mem + thpsize - pagesize, pagesize,
+		     PROT_READ|PROT_WRITE)) {
+		ksft_test_result_fail("mprotect() failed\n");
+		goto munmap;
+	}
+
+	do_test_write_sigsegv(mem);
+munmap:
+	munmap(mmap_mem, mmap_size);
+}
+
+#ifdef __NR_userfaultfd
+static void test_uffdio_copy(void)
+{
+	struct uffdio_register uffdio_register;
+	struct uffdio_copy uffdio_copy;
+	struct uffdio_api uffdio_api;
+	char *dst, *src;
+	int uffd;
+
+	ksft_print_msg("[INFO] UFFDIO_COPY\n");
+
+	src = malloc(pagesize);
+	memset(src, 1, pagesize);
+	dst = mmap(NULL, pagesize, PROT_READ, MAP_PRIVATE|MAP_ANON, -1, 0);
+	if (dst == MAP_FAILED) {
+		ksft_test_result_fail("mmap() failed\n");
+		return;
+	}
+
+	uffd = syscall(__NR_userfaultfd, O_CLOEXEC | O_NONBLOCK);
+	if (uffd < 0) {
+		ksft_test_result_skip("__NR_userfaultfd failed\n");
+		goto munmap;
+	}
+
+	uffdio_api.api = UFFD_API;
+	uffdio_api.features = 0;
+	if (ioctl(uffd, UFFDIO_API, &uffdio_api) < 0) {
+		ksft_test_result_fail("UFFDIO_API failed\n");
+		goto close_uffd;
+	}
+
+	uffdio_register.range.start = (unsigned long) dst;
+	uffdio_register.range.len = pagesize;
+	uffdio_register.mode = UFFDIO_REGISTER_MODE_MISSING;
+	if (ioctl(uffd, UFFDIO_REGISTER, &uffdio_register)) {
+		ksft_test_result_fail("UFFDIO_REGISTER failed\n");
+		goto close_uffd;
+	}
+
+	/* Place a page in a read-only VMA, which might set the PTE dirty. */
+	uffdio_copy.dst = (unsigned long) dst;
+	uffdio_copy.src = (unsigned long) src;
+	uffdio_copy.len = pagesize;
+	uffdio_copy.mode = 0;
+	if (ioctl(uffd, UFFDIO_COPY, &uffdio_copy)) {
+		ksft_test_result_fail("UFFDIO_COPY failed\n");
+		goto close_uffd;
+	}
+
+	do_test_write_sigsegv(dst);
+close_uffd:
+	close(uffd);
+munmap:
+	munmap(dst, pagesize);
+	free(src);
+}
+#endif /* __NR_userfaultfd */
+
+int main(void)
+{
+	int err, tests = 2;
+
+	pagesize = getpagesize();
+	thpsize = read_pmd_pagesize();
+	if (thpsize) {
+		ksft_print_msg("[INFO] detected THP size: %zu KiB\n",
+			       thpsize / 1024);
+		tests += 3;
+	}
+#ifdef __NR_userfaultfd
+	tests += 1;
+#endif /* __NR_userfaultfd */
+
+	ksft_print_header();
+	ksft_set_plan(tests);
+
+	mem_fd = open("/proc/self/mem", O_RDWR);
+	if (mem_fd < 0)
+		ksft_exit_fail_msg("opening /proc/self/mem failed\n");
+	pagemap_fd = open("/proc/self/pagemap", O_RDONLY);
+	if (pagemap_fd < 0)
+		ksft_exit_fail_msg("opening /proc/self/pagemap failed\n");
+
+	/*
+	 * On some ptrace(FOLL_FORCE) write access via /proc/self/mem in
+	 * read-only VMAs, the kernel may set the PTE/PMD dirty.
+	 */
+	test_ptrace_write();
+	if (thpsize)
+		test_ptrace_write_thp();
+	/*
+	 * On page migration, the kernel may set the PTE/PMD dirty when
+	 * remapping the page.
+	 */
+	test_page_migration();
+	if (thpsize)
+		test_page_migration_thp();
+	/* PTE-mapping a THP might propagate the dirty PMD bit to the PTEs. */
+	if (thpsize)
+		test_pte_mapped_thp();
+	/* Placing a fresh page via userfaultfd may set the PTE dirty. */
+#ifdef __NR_userfaultfd
+	test_uffdio_copy();
+#endif /* __NR_userfaultfd */
+
+	err = ksft_get_fail_cnt();
+	if (err)
+		ksft_exit_fail_msg("%d out of %d tests failed\n",
+				   err, ksft_test_num());
+	return ksft_exit_pass();
+}

From patchwork Tue Apr 11 14:25:12 2023
X-Patchwork-Submitter: David Hildenbrand
X-Patchwork-Id: 672549
From: David Hildenbrand
To: linux-kernel@vger.kernel.org
Cc: linux-mm@kvack.org, linux-kselftest@vger.kernel.org,
 sparclinux@vger.kernel.org, David Hildenbrand, Andrew Morton,
 "David S. Miller", Peter Xu, Hugh Dickins, Shuah Khan, Sam Ravnborg,
 Yu Zhao, Anshuman Khandual
Subject: [PATCH v1 RESEND 6/6] mm/huge_memory: conditionally call
 maybe_mkwrite() and drop pte_wrprotect() in __split_huge_pmd_locked()
Date: Tue, 11 Apr 2023 16:25:12 +0200
Message-Id: <20230411142512.438404-7-david@redhat.com>
In-Reply-To: <20230411142512.438404-1-david@redhat.com>
References: <20230411142512.438404-1-david@redhat.com>
X-Mailing-List: linux-kselftest@vger.kernel.org

No need to call maybe_mkwrite() to then wrprotect if the source PMD was
not writable. It's worth noting that this now allows for PTEs to be
writable even if the source PMD was not writable, provided
vma->vm_page_prot includes write permissions.
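For reference, maybe_mkwrite() only ever sets the write bit when the VMA
permits writes; its long-standing definition in include/linux/mm.h is
essentially the following (sketch, quoted from memory):

	static inline pte_t maybe_mkwrite(pte_t pte, struct vm_area_struct *vma)
	{
		/* Only mark the PTE writable if the VMA allows writes at all. */
		if (likely(vma->vm_flags & VM_WRITE))
			pte = pte_mkwrite(pte);
		return pte;
	}

So calling it only when the source PMD was writable, as done below, can
never make a PTE writable in a VMA without write permissions.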
As documented in commit 931298e103c2 ("mm/userfaultfd: rely on
vma->vm_page_prot in uffd_wp_range()"), any mechanism that intends to
have pages wrprotected (COW, writenotify, mprotect, uffd-wp, softdirty,
...) has to properly adjust vma->vm_page_prot upfront, to not include
write permissions. If vma->vm_page_prot includes write permissions, the
PTE/PMD can be writable by default.

This now mimics the handling in mm/migrate.c:remove_migration_pte() and
in mm/huge_memory.c:remove_migration_pmd(), which has been in place for
a long time (except that 96a9c287e25d ("mm/migrate: fix wrongly apply
write bit after mkdirty on sparc64") temporarily changed it).

Signed-off-by: David Hildenbrand
---
 mm/huge_memory.c | 5 ++---
 1 file changed, 2 insertions(+), 3 deletions(-)

diff --git a/mm/huge_memory.c b/mm/huge_memory.c
index 6f3af65435c8..8332e16ac97b 100644
--- a/mm/huge_memory.c
+++ b/mm/huge_memory.c
@@ -2235,11 +2235,10 @@ static void __split_huge_pmd_locked(struct vm_area_struct *vma, pmd_t *pmd,
 				entry = pte_swp_mkuffd_wp(entry);
 		} else {
 			entry = mk_pte(page + i, READ_ONCE(vma->vm_page_prot));
-			entry = maybe_mkwrite(entry, vma);
+			if (write)
+				entry = maybe_mkwrite(entry, vma);
 			if (anon_exclusive)
 				SetPageAnonExclusive(page + i);
-			if (!write)
-				entry = pte_wrprotect(entry);
 			if (!young)
 				entry = pte_mkold(entry);
 			/* NOTE: this may set soft-dirty too on some archs */