From patchwork Fri Jan 10 14:55:44 2025
X-Patchwork-Submitter: Adhemerval Zanella Netto
X-Patchwork-Id: 856255
From: Adhemerval Zanella <adhemerval.zanella@linaro.org>
To: libc-alpha@sourceware.org
Cc: Lorenzo Stoakes, Cristian Rodríguez
Subject: [PATCH] nptl: Add support for setup guard pages with MADV_GUARD_INSTALL
Date: Fri, 10 Jan 2025 11:55:44 -0300
Message-ID: <20250110145556.520522-1-adhemerval.zanella@linaro.org>
X-Mailer: git-send-email 2.43.0

Linux 6.13 (commit 662df3e5c3766) added a lightweight way to define guard
pages through the madvise syscall.  Instead of carving out PROT_NONE mmap
regions with mprotect, userland can madvise the region and the kernel
ensures that any access to it triggers a SIGSEGV (as with a PROT_NONE
mapping).  This has the advantage of lower kernel memory consumption for
the process page table (one less VMA per guard area) and slightly less
kernel contention (also because fewer VMAs need to be tracked).

pthread_create allocates a new thread stack in two ways: if a guard area
is requested (the default), it allocates the required memory range with
PROT_NONE and then mprotects the usable stack area with the proper flags.
Otherwise, if no guard page is requested, it allocates the region directly
with the required flags.

With MADV_GUARD_INSTALL support, the stack region is allocated with the
required flags and the guard region is then installed with madvise.  If
the kernel does not support it, the usual PROT_NONE scheme is used instead
(and MADV_GUARD_INSTALL is disabled for future stack creations).

The stack allocation strategy is recorded in the pthread struct so it can
be consulted if the guard region needs to be resized.  To avoid an extra
field, the 'user_stack' member is repurposed and renamed to 'stack_mode'.

This patch also adds a proper test for the pthread guard area.

Checked on x86_64, aarch64, and powerpc64le with kernel 6.13.0-rc4.  I
also checked the new test on hppa with Linux 5.16.0 (although I could not
boot a 6.13 kernel to verify MADV_GUARD_INSTALL).
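For readers unfamiliar with the new madvise flag, the fallback strategy
described above can be illustrated with a small standalone program.  This
is only a sketch of the idea, not the glibc implementation itself; the
setup_guard helper and the main driver are hypothetical names, and the
MADV_GUARD_INSTALL fallback define assumes the value from the Linux 6.13
uapi headers:

    /* Sketch: install a guard region with MADV_GUARD_INSTALL when the
       kernel supports it, otherwise fall back to a PROT_NONE guard.  */
    #include <stdlib.h>
    #include <sys/mman.h>
    #include <unistd.h>

    #ifndef MADV_GUARD_INSTALL
    # define MADV_GUARD_INSTALL 102   /* Assumed from Linux 6.13 uapi.  */
    #endif

    /* Guard the first GUARDSIZE bytes of the RW mapping MEM.  On older
       kernels madvise fails (EINVAL) and the PROT_NONE path is used.  */
    static int
    setup_guard (char *mem, size_t guardsize)
    {
      if (madvise (mem, guardsize, MADV_GUARD_INSTALL) == 0)
        return 0;                     /* Lightweight guard, no extra VMA.  */
      return mprotect (mem, guardsize, PROT_NONE);
    }

    int
    main (void)
    {
      size_t pagesz = sysconf (_SC_PAGESIZE);
      size_t size = 64 * pagesz;
      char *mem = mmap (NULL, size, PROT_READ | PROT_WRITE,
                        MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
      if (mem == MAP_FAILED || setup_guard (mem, pagesz) != 0)
        return EXIT_FAILURE;
      /* Touching mem[0] would now raise SIGSEGV; the rest stays usable.  */
      mem[pagesz] = 1;
      munmap (mem, size);
      return 0;
    }

Either way the guarded page faults on access; the madvise variant simply
avoids splitting the stack mapping into a separate PROT_NONE VMA.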
--- nptl/Makefile | 1 + nptl/TODO-testing | 4 - nptl/allocatestack.c | 229 +++++++++----- nptl/descr.h | 8 +- nptl/nptl-stack.c | 2 +- nptl/pthread_create.c | 2 +- nptl/tst-guard1.c | 366 ++++++++++++++++++++++ sysdeps/nptl/dl-tls_init_tp.c | 2 +- sysdeps/nptl/fork.h | 2 +- sysdeps/unix/sysv/linux/bits/mman-linux.h | 2 + 10 files changed, 523 insertions(+), 95 deletions(-) create mode 100644 nptl/tst-guard1.c diff --git a/nptl/Makefile b/nptl/Makefile index b7c63999a3..b04e25cd0d 100644 --- a/nptl/Makefile +++ b/nptl/Makefile @@ -289,6 +289,7 @@ tests = \ tst-dlsym1 \ tst-exec4 \ tst-exec5 \ + tst-guard1 \ tst-initializers1 \ tst-initializers1-c11 \ tst-initializers1-c89 \ diff --git a/nptl/TODO-testing b/nptl/TODO-testing index f50d2ceb51..46ebf3bc5c 100644 --- a/nptl/TODO-testing +++ b/nptl/TODO-testing @@ -1,7 +1,3 @@ -pthread_attr_setguardsize - - test effectiveness - pthread_attr_[sg]etschedparam what to test? diff --git a/nptl/allocatestack.c b/nptl/allocatestack.c index 9c1a72bcf0..acae232765 100644 --- a/nptl/allocatestack.c +++ b/nptl/allocatestack.c @@ -146,10 +146,37 @@ get_cached_stack (size_t *sizep, void **memp) return result; } +/* Assume support for MADV_ADVISE_GUARD, setup_stack_prot will disable it + and fallback to ALLOCATE_GUARD_PROT_NONE if the madvise call fails. */ +static int allocate_stack_mode = ALLOCATE_GUARD_MADV_GUARD; + +static inline int stack_prot (void) +{ + return (PROT_READ | PROT_WRITE + | ((GL(dl_stack_flags) & PF_X) ? PROT_EXEC : 0)); +} + +static void * +allocate_thread_stack (size_t size, size_t guardsize) +{ + /* MADV_ADVISE_GUARD does not require an additional PROT_NONE mapping. */ + int prot = stack_prot (); + + if (atomic_load_relaxed (&allocate_stack_mode) == ALLOCATE_GUARD_PROT_NONE) + /* If a guard page is required, avoid committing memory by first allocate + with PROT_NONE and then reserve with required permission excluding the + guard page. */ + prot = guardsize == 0 ? prot : PROT_NONE; + + return __mmap (NULL, size, prot, MAP_PRIVATE | MAP_ANONYMOUS | MAP_STACK, -1, + 0); +} + + /* Return the guard page position on allocated stack. */ static inline char * __attribute ((always_inline)) -guard_position (void *mem, size_t size, size_t guardsize, struct pthread *pd, +guard_position (void *mem, size_t size, size_t guardsize, const struct pthread *pd, size_t pagesize_m1) { #if _STACK_GROWS_DOWN @@ -159,27 +186,97 @@ guard_position (void *mem, size_t size, size_t guardsize, struct pthread *pd, #endif } -/* Based on stack allocated with PROT_NONE, setup the required portions with - 'prot' flags based on the guard page position. */ -static inline int -setup_stack_prot (char *mem, size_t size, char *guard, size_t guardsize, - const int prot) +/* Setup the MEM thread stack of SIZE bytes with the required protection flags + along with a guard area of GUARDSIZE size. It first tries with + MADV_GUARD_INSTALL, and then fallback to setup the guard area using the + extra PROT_NONE mapping. Update PD with the type of guard area setup. 
*/ +static inline bool +setup_stack_prot (char *mem, size_t size, struct pthread *pd, + size_t guardsize, size_t pagesize_m1) { - char *guardend = guard + guardsize; + if (__glibc_unlikely (guardsize == 0)) + return true; + + char *guard = guard_position (mem, size, guardsize, pd, pagesize_m1); + if (atomic_load_relaxed (&allocate_stack_mode) == ALLOCATE_GUARD_MADV_GUARD) + { + if (__madvise (guard, guardsize, MADV_GUARD_INSTALL) == 0) + { + pd->stack_mode = ALLOCATE_GUARD_MADV_GUARD; + return true; + } + + /* If madvise fails it means the kernel does not support the guard + advise (we assume that guard is page-aligned and length is non + negative). The stack has already the expected flags, so it just need + to PROT_NONE the guard area. */ + atomic_store_relaxed (&allocate_stack_mode, ALLOCATE_GUARD_PROT_NONE); + if (__mprotect (guard, guardsize, PROT_NONE) != 0) + return false; + } + else + { + const int prot = stack_prot (); + char *guardend = guard + guardsize; #if _STACK_GROWS_DOWN - /* As defined at guard_position, for architectures with downward stack - the guard page is always at start of the allocated area. */ - if (__mprotect (guardend, size - guardsize, prot) != 0) - return errno; + /* As defined at guard_position, for architectures with downward stack + the guard page is always at start of the allocated area. */ + if (__mprotect (guardend, size - guardsize, prot) != 0) + return false; #else - size_t mprots1 = (uintptr_t) guard - (uintptr_t) mem; - if (__mprotect (mem, mprots1, prot) != 0) - return errno; - size_t mprots2 = ((uintptr_t) mem + size) - (uintptr_t) guardend; - if (__mprotect (guardend, mprots2, prot) != 0) - return errno; + size_t mprots1 = (uintptr_t) guard - (uintptr_t) mem; + if (__mprotect (mem, mprots1, prot) != 0) + return false; + size_t mprots2 = ((uintptr_t) mem + size) - (uintptr_t) guardend; + if (__mprotect (guardend, mprots2, prot) != 0) + return false; #endif - return 0; + } + + pd->stack_mode = ALLOCATE_GUARD_PROT_NONE; + return true; +} + +/* Update the guard area of the thread stack MEM of size SIZE with the + new GUARDISZE. It uses the method defined by PD stack_mode. */ +static inline bool +adjust_stack_prot (char *mem, size_t size, const struct pthread *pd, + size_t guardsize, size_t pagesize_m1) +{ + char *guard = guard_position (mem, size, guardsize, pd, pagesize_m1); + /* The required guardsize is larger than the current one. */ + if (guardsize > pd->guardsize) + { + if (pd->stack_mode == ALLOCATE_GUARD_MADV_GUARD) + return __madvise (guard, guardsize, MADV_GUARD_INSTALL) == 0; + else if (pd->stack_mode == ALLOCATE_GUARD_PROT_NONE) + return __mprotect (guard, guardsize, PROT_NONE) == 0; + } + /* The old guard are is too large. */ + else if (pd->guardsize > guardsize) + { + const int prot = stack_prot (); + size_t slacksize = pd->guardsize - guardsize; + if (pd->stack_mode == ALLOCATE_GUARD_MADV_GUARD) + return __madvise (guard + guardsize, slacksize, MADV_GUARD_REMOVE) == 0; + else if (pd->stack_mode == ALLOCATE_GUARD_PROT_NONE) +#if _STACK_GROWS_DOWN + return __mprotect (mem + guardsize, slacksize, prot) == 0; +#else + { + char *new_guard = (char *)(((uintptr_t) pd - guardsize) + & ~pagesize_m1); + char *old_guard = (char *)(((uintptr_t) pd - pd->guardsize) + & ~pagesize_m1); + /* The guard size difference might be > 0, but once rounded + to the nearest page the size difference might be zero. 
*/ + if (new_guard > old_guard + && __mprotect (old_guard, new_guard - old_guard, prot) != 0) + return false; + } +#endif + } + return true; } /* Mark the memory of the stack as usable to the kernel. It frees everything @@ -291,7 +388,7 @@ allocate_stack (const struct pthread_attr *attr, struct pthread **pdp, /* This is a user-provided stack. It will not be queued in the stack cache nor will the memory (except the TLS memory) be freed. */ - pd->user_stack = true; + pd->stack_mode = ALLOCATE_GUARD_USER; /* This is at least the second thread. */ pd->header.multiple_threads = 1; @@ -325,10 +422,7 @@ allocate_stack (const struct pthread_attr *attr, struct pthread **pdp, /* Allocate some anonymous memory. If possible use the cache. */ size_t guardsize; size_t reported_guardsize; - size_t reqsize; void *mem; - const int prot = (PROT_READ | PROT_WRITE - | ((GL(dl_stack_flags) & PF_X) ? PROT_EXEC : 0)); /* Adjust the stack size for alignment. */ size &= ~tls_static_align_m1; @@ -358,16 +452,10 @@ allocate_stack (const struct pthread_attr *attr, struct pthread **pdp, return EINVAL; /* Try to get a stack from the cache. */ - reqsize = size; pd = get_cached_stack (&size, &mem); if (pd == NULL) { - /* If a guard page is required, avoid committing memory by first - allocate with PROT_NONE and then reserve with required permission - excluding the guard page. */ - mem = __mmap (NULL, size, (guardsize == 0) ? prot : PROT_NONE, - MAP_PRIVATE | MAP_ANONYMOUS | MAP_STACK, -1, 0); - + mem = allocate_thread_stack (size, guardsize); if (__glibc_unlikely (mem == MAP_FAILED)) return errno; @@ -394,15 +482,10 @@ allocate_stack (const struct pthread_attr *attr, struct pthread **pdp, #endif /* Now mprotect the required region excluding the guard area. */ - if (__glibc_likely (guardsize > 0)) + if (!setup_stack_prot (mem, size, pd, guardsize, pagesize_m1)) { - char *guard = guard_position (mem, size, guardsize, pd, - pagesize_m1); - if (setup_stack_prot (mem, size, guard, guardsize, prot) != 0) - { - __munmap (mem, size); - return errno; - } + __munmap (mem, size); + return errno; } /* Remember the stack-related values. */ @@ -456,59 +539,31 @@ allocate_stack (const struct pthread_attr *attr, struct pthread **pdp, which will be read next. */ } - /* Create or resize the guard area if necessary. */ - if (__glibc_unlikely (guardsize > pd->guardsize)) + /* Create or resize the guard area if necessary on an already + allocated stack. */ + if (!adjust_stack_prot (mem, size, pd, guardsize, pagesize_m1)) { - char *guard = guard_position (mem, size, guardsize, pd, - pagesize_m1); - if (__mprotect (guard, guardsize, PROT_NONE) != 0) - { - mprot_error: - lll_lock (GL (dl_stack_cache_lock), LLL_PRIVATE); + lll_lock (GL (dl_stack_cache_lock), LLL_PRIVATE); - /* Remove the thread from the list. */ - __nptl_stack_list_del (&pd->list); + /* Remove the thread from the list. */ + __nptl_stack_list_del (&pd->list); - lll_unlock (GL (dl_stack_cache_lock), LLL_PRIVATE); + lll_unlock (GL (dl_stack_cache_lock), LLL_PRIVATE); - /* Get rid of the TLS block we allocated. */ - _dl_deallocate_tls (TLS_TPADJ (pd), false); + /* Get rid of the TLS block we allocated. */ + _dl_deallocate_tls (TLS_TPADJ (pd), false); - /* Free the stack memory regardless of whether the size - of the cache is over the limit or not. If this piece - of memory caused problems we better do not use it - anymore. Uh, and we ignore possible errors. There - is nothing we could do. 
*/ - (void) __munmap (mem, size); + /* Free the stack memory regardless of whether the size + of the cache is over the limit or not. If this piece + of memory caused problems we better do not use it + anymore. Uh, and we ignore possible errors. There + is nothing we could do. */ + (void) __munmap (mem, size); - return errno; - } - - pd->guardsize = guardsize; + return errno; } - else if (__builtin_expect (pd->guardsize - guardsize > size - reqsize, - 0)) - { - /* The old guard area is too large. */ -#if _STACK_GROWS_DOWN - if (__mprotect ((char *) mem + guardsize, pd->guardsize - guardsize, - prot) != 0) - goto mprot_error; -#elif _STACK_GROWS_UP - char *new_guard = (char *)(((uintptr_t) pd - guardsize) - & ~pagesize_m1); - char *old_guard = (char *)(((uintptr_t) pd - pd->guardsize) - & ~pagesize_m1); - /* The guard size difference might be > 0, but once rounded - to the nearest page the size difference might be zero. */ - if (new_guard > old_guard - && __mprotect (old_guard, new_guard - old_guard, prot) != 0) - goto mprot_error; -#endif - - pd->guardsize = guardsize; - } + pd->guardsize = guardsize; /* The pthread_getattr_np() calls need to get passed the size requested in the attribute, regardless of how large the actually used guardsize is. */ @@ -568,19 +623,21 @@ allocate_stack (const struct pthread_attr *attr, struct pthread **pdp, static void name_stack_maps (struct pthread *pd, bool set) { + size_t adjust = pd->stack_mode == ALLOCATE_GUARD_PROT_NONE ? + pd->guardsize : 0; #if _STACK_GROWS_DOWN - void *stack = pd->stackblock + pd->guardsize; + void *stack = pd->stackblock + adjust; #else void *stack = pd->stackblock; #endif - size_t stacksize = pd->stackblock_size - pd->guardsize; + size_t stacksize = pd->stackblock_size - adjust; if (!set) - __set_vma_name (stack, stacksize, NULL); + __set_vma_name (stack, stacksize, " glibc: unused stack"); else { unsigned int tid = pd->tid; - if (pd->user_stack) + if (pd->stack_mode == ALLOCATE_GUARD_USER) SET_STACK_NAME (" glibc: pthread user stack: ", stack, stacksize, tid); else SET_STACK_NAME (" glibc: pthread stack: ", stack, stacksize, tid); diff --git a/nptl/descr.h b/nptl/descr.h index d0d30929e2..9c1ed54c56 100644 --- a/nptl/descr.h +++ b/nptl/descr.h @@ -125,6 +125,12 @@ struct priority_protection_data unsigned int priomap[]; }; +enum allocate_stack_mode_t +{ + ALLOCATE_GUARD_MADV_GUARD = 0, + ALLOCATE_GUARD_PROT_NONE = 1, + ALLOCATE_GUARD_USER = 2, +}; /* Thread descriptor data structure. */ struct pthread @@ -324,7 +330,7 @@ struct pthread bool report_events; /* True if the user provided the stack. */ - bool user_stack; + enum allocate_stack_mode_t stack_mode; /* True if thread must stop at startup time. */ bool stopped_start; diff --git a/nptl/nptl-stack.c b/nptl/nptl-stack.c index 503357f25d..c049c5133c 100644 --- a/nptl/nptl-stack.c +++ b/nptl/nptl-stack.c @@ -120,7 +120,7 @@ __nptl_deallocate_stack (struct pthread *pd) not reset the 'used' flag in the 'tid' field. This is done by the kernel. If no thread has been created yet this field is still zero. */ - if (__glibc_likely (! pd->user_stack)) + if (__glibc_likely (pd->stack_mode != ALLOCATE_GUARD_USER)) (void) queue_stack (pd); else /* Free the memory associated with the ELF TLS. */ diff --git a/nptl/pthread_create.c b/nptl/pthread_create.c index 01e8a86980..0808f2e628 100644 --- a/nptl/pthread_create.c +++ b/nptl/pthread_create.c @@ -554,7 +554,7 @@ start_thread (void *arg) to avoid creating a new free-state block during thread release. 
*/ __getrandom_vdso_release (pd); - if (!pd->user_stack) + if (pd->stack_mode != ALLOCATE_GUARD_USER) advise_stack_range (pd->stackblock, pd->stackblock_size, (uintptr_t) pd, pd->guardsize); diff --git a/nptl/tst-guard1.c b/nptl/tst-guard1.c new file mode 100644 index 0000000000..ec04eeed02 --- /dev/null +++ b/nptl/tst-guard1.c @@ -0,0 +1,366 @@ +/* Basic tests for pthread guard area. + Copyright (C) 2025 Free Software Foundation, Inc. + This file is part of the GNU C Library. + + The GNU C Library is free software; you can redistribute it and/or + modify it under the terms of the GNU Lesser General Public + License as published by the Free Software Foundation; either + version 2.1 of the License, or (at your option) any later version. + + The GNU C Library is distributed in the hope that it will be useful, + but WITHOUT ANY WARRANTY; without even the implied warranty of + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + Lesser General Public License for more details. + + You should have received a copy of the GNU Lesser General Public + License along with the GNU C Library; if not, see + . */ + +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include + +static long int pagesz; + +/* To check if the guard region is inaccessible, the thread tries read/writes + on it and checks if a SIGSEGV is generated. */ + +static volatile sig_atomic_t signal_jump_set; +static sigjmp_buf signal_jmp_buf; + +static void +sigsegv_handler (int sig) +{ + if (signal_jump_set == 0) + return; + + siglongjmp (signal_jmp_buf, sig); +} + +static bool +try_access_buf (char *ptr, bool write) +{ + signal_jump_set = true; + + bool failed = sigsetjmp (signal_jmp_buf, 0) != 0; + if (!failed) + { + if (write) + *(volatile char *)(ptr) = 'x'; + else + *(volatile char *)(ptr); + } + + signal_jump_set = false; + return !failed; +} + +static bool +try_read_buf (char *ptr) +{ + return try_access_buf (ptr, false); +} + +static bool +try_write_buf (char *ptr) +{ + return try_access_buf (ptr, true); +} + +static bool +try_read_write_buf (char *ptr) +{ + return try_read_buf (ptr) && try_write_buf(ptr); +} + + +/* Return the guard region of the current thread (it only makes sense on + a thread created by pthread_created). */ + +struct stack_t +{ + char *stack; + size_t stacksize; + char *guard; + size_t guardsize; +}; + +static inline size_t +adjust_stacksize (size_t stacksize) +{ + /* For some ABIs, The guard page depends of the thread descriptor, which in + turn rely on the require static TLS. The only supported _STACK_GROWS_UP + ABI, hppa, defines TLS_DTV_AT_TP and it is not straightforward to + calculate the guard region with current pthread APIs. So to get a + correct stack size assumes an extra page after the guard area. */ +#if _STACK_GROWS_DOWN + return stacksize; +#elif _STACK_GROWS_UP + return stacksize - pagesz; +#endif +} + +struct stack_t +get_current_stack_info (void) +{ + pthread_attr_t attr; + TEST_VERIFY_EXIT (pthread_getattr_np (pthread_self (), &attr) == 0); + void *stack; + size_t stacksize; + TEST_VERIFY_EXIT (pthread_attr_getstack (&attr, &stack, &stacksize) == 0); + size_t guardsize; + TEST_VERIFY_EXIT (pthread_attr_getguardsize (&attr, &guardsize) == 0); + /* The guardsize is reported as the current page size, although it might + be adjusted to a larger value (aarch64 for instance). */ + if (guardsize != 0 && guardsize < ARCH_MIN_GUARD_SIZE) + guardsize = ARCH_MIN_GUARD_SIZE; + +#if _STACK_GROWS_DOWN + void *guard = guardsize ? 
stack - guardsize : 0; +#elif _STACK_GROWS_UP + stacksize = adjust_stacksize (stacksize); + void *guard = guardsize ? stack + stacksize : 0; +#endif + + pthread_attr_destroy (&attr); + + return (struct stack_t) { stack, stacksize, guard, guardsize }; +} + +struct thread_args_t +{ + size_t stacksize; + size_t guardsize; +}; + +struct thread_args_t +get_thread_args (const pthread_attr_t *attr) +{ + size_t stacksize; + size_t guardsize; + + TEST_COMPARE (pthread_attr_getstacksize (attr, &stacksize), 0); + TEST_COMPARE (pthread_attr_getguardsize (attr, &guardsize), 0); + if (guardsize < ARCH_MIN_GUARD_SIZE) + guardsize = ARCH_MIN_GUARD_SIZE; + + return (struct thread_args_t) { stacksize, guardsize }; +} + +static void +set_thread_args (pthread_attr_t *attr, const struct thread_args_t *args) +{ + xpthread_attr_setstacksize (attr, args->stacksize); + xpthread_attr_setguardsize (attr, args->guardsize); +} + +static void * +tf (void *closure) +{ + struct thread_args_t *args = closure; + + struct stack_t s = get_current_stack_info (); + if (test_verbose) + printf ("debug: [tid=%jd] stack = { .stack=%p, stacksize=%#zx, guard=%p, " + "guardsize=%#zx }\n", + (intmax_t) gettid (), + s.stack, + s.stacksize, + s.guard, + s.guardsize); + + if (args != NULL) + { + TEST_COMPARE (adjust_stacksize (args->stacksize), s.stacksize); + TEST_COMPARE (args->guardsize, s.guardsize); + } + + /* Ensure we can access the stack area. */ + TEST_COMPARE (try_read_write_buf (s.stack), true); + TEST_COMPARE (try_read_write_buf (&s.stack[s.stacksize / 2]), true); + TEST_COMPARE (try_read_write_buf (&s.stack[s.stacksize - 1]), true); + + /* Check if accessing the guard area results in SIGSEGV. */ + if (s.guardsize > 0) + { + TEST_COMPARE (try_read_write_buf (s.guard), false); + TEST_COMPARE (try_read_write_buf (&s.guard[s.guardsize / 2]), false); + TEST_COMPARE (try_read_write_buf (&s.guard[s.guardsize] - 1), false); + } + + return NULL; +} + +/* Test 1: caller provided stack without guard. */ +static void +do_test1 (void) +{ + pthread_attr_t attr; + xpthread_attr_init (&attr); + + size_t stacksize = support_small_thread_stack_size (); + void *stack = xmmap (0, + stacksize, + PROT_READ | PROT_WRITE, + MAP_PRIVATE|MAP_ANONYMOUS|MAP_STACK, + -1); + xpthread_attr_setstack (&attr, stack, stacksize); + xpthread_attr_setguardsize (&attr, 0); + + struct thread_args_t args = { stacksize, 0 }; + pthread_t t = xpthread_create (&attr, tf, &args); + void *status = xpthread_join (t); + TEST_VERIFY (status == 0); + + xpthread_attr_destroy (&attr); + xmunmap (stack, stacksize); +} + +/* Test 2: same as 1., but with a guard area. */ +static void +do_test2 (void) +{ + pthread_attr_t attr; + xpthread_attr_init (&attr); + + size_t stacksize = support_small_thread_stack_size (); + void *stack = xmmap (0, + stacksize, + PROT_READ | PROT_WRITE, + MAP_PRIVATE|MAP_ANONYMOUS|MAP_STACK, + -1); + xpthread_attr_setstack (&attr, stack, stacksize); + xpthread_attr_setguardsize (&attr, pagesz); + + struct thread_args_t args = { stacksize, 0 }; + pthread_t t = xpthread_create (&attr, tf, &args); + void *status = xpthread_join (t); + TEST_VERIFY (status == 0); + + xpthread_attr_destroy (&attr); + xmunmap (stack, stacksize); +} + +/* Test 3: pthread_create with default values. */ +static void +do_test3 (void) +{ + pthread_t t = xpthread_create (NULL, tf, NULL); + void *status = xpthread_join (t); + TEST_VERIFY (status == 0); +} + +/* Test 4: pthread_create without a guard area. 
*/ +static void +do_test4 (void) +{ + pthread_attr_t attr; + xpthread_attr_init (&attr); + struct thread_args_t args = get_thread_args (&attr); + args.stacksize += args.guardsize; + args.guardsize = 0; + set_thread_args (&attr, &args); + + pthread_t t = xpthread_create (&attr, tf, &args); + void *status = xpthread_join (t); + TEST_VERIFY (status == 0); + + xpthread_attr_destroy (&attr); +} + +/* Test 5: pthread_create with non default stack and guard size value. */ +static void +do_test5 (void) +{ + pthread_attr_t attr; + xpthread_attr_init (&attr); + struct thread_args_t args = get_thread_args (&attr); + args.guardsize += pagesz; + args.stacksize += pagesz; + set_thread_args (&attr, &args); + + pthread_t t = xpthread_create (&attr, tf, &args); + void *status = xpthread_join (t); + TEST_VERIFY (status == 0); + + xpthread_attr_destroy (&attr); +} + +/* Test 6: thread with the required size (stack + guard) that matches the + test 3, but with largert guard area. The pthread_create will need to + increase the guard area. */ +static void +do_test6 (void) +{ + pthread_attr_t attr; + xpthread_attr_init (&attr); + struct thread_args_t args = get_thread_args (&attr); + args.guardsize += pagesz; + args.stacksize -= pagesz; + set_thread_args (&attr, &args); + + pthread_t t = xpthread_create (&attr, tf, &args); + void *status = xpthread_join (t); + TEST_VERIFY (status == 0); + + xpthread_attr_destroy (&attr); +} + +/* Test 7: pthread_create with default values, the requires size matches the + one from test 3 and 6 (but with a reduced guard ares). The + pthread_create should use the cached stack from previous tests, but it + would require to reduce the guard area. */ +static void +do_test7 (void) +{ + pthread_t t = xpthread_create (NULL, tf, NULL); + void *status = xpthread_join (t); + TEST_VERIFY (status == 0); +} + +static int +do_test (void) +{ + pagesz = sysconf (_SC_PAGESIZE); + + { + struct sigaction sa = { + .sa_handler = sigsegv_handler, + .sa_flags = SA_NODEFER, + }; + sigemptyset (&sa.sa_mask); + xsigaction (SIGSEGV, &sa, NULL); + } + + static const struct { + const char *descr; + void (*test)(void); + } tests[] = { + { "user provided stack without guard", do_test1 }, + { "user provided stack with guard", do_test2 }, + { "default attribute", do_test3 }, + { "default attribute without guard", do_test4 }, + { "non default stack and guard sizes", do_test5 }, + { "reused stack with larger guard", do_test6 }, + { "reused stack with smaller guard", do_test7 }, + }; + + for (int i = 0; i < array_length (tests); i++) + { + printf ("debug: test%01d: %s\n", i, tests[i].descr); + tests[i].test(); + } + + return 0; +} + +#include diff --git a/sysdeps/nptl/dl-tls_init_tp.c b/sysdeps/nptl/dl-tls_init_tp.c index c57738e9f3..20cc9202ec 100644 --- a/sysdeps/nptl/dl-tls_init_tp.c +++ b/sysdeps/nptl/dl-tls_init_tp.c @@ -72,7 +72,7 @@ __tls_init_tp (void) /* Early initialization of the TCB. */ pd->tid = INTERNAL_SYSCALL_CALL (set_tid_address, &pd->tid); THREAD_SETMEM (pd, specific[0], &pd->specific_1stblock[0]); - THREAD_SETMEM (pd, user_stack, true); + THREAD_SETMEM (pd, stack_mode, ALLOCATE_GUARD_USER); /* Before initializing GL (dl_stack_user), the debugger could not find us and had to set __nptl_initial_report_events. 
Propagate diff --git a/sysdeps/nptl/fork.h b/sysdeps/nptl/fork.h index 6156af79e1..3c79179437 100644 --- a/sysdeps/nptl/fork.h +++ b/sysdeps/nptl/fork.h @@ -155,7 +155,7 @@ reclaim_stacks (void) INIT_LIST_HEAD (&GL (dl_stack_used)); INIT_LIST_HEAD (&GL (dl_stack_user)); - if (__glibc_unlikely (THREAD_GETMEM (self, user_stack))) + if (__glibc_unlikely (self->stack_mode == ALLOCATE_GUARD_USER)) list_add (&self->list, &GL (dl_stack_user)); else list_add (&self->list, &GL (dl_stack_used)); diff --git a/sysdeps/unix/sysv/linux/bits/mman-linux.h b/sysdeps/unix/sysv/linux/bits/mman-linux.h index 8e072eb4cd..fe0496d802 100644 --- a/sysdeps/unix/sysv/linux/bits/mman-linux.h +++ b/sysdeps/unix/sysv/linux/bits/mman-linux.h @@ -113,6 +113,8 @@ locked pages too. */ # define MADV_COLLAPSE 25 /* Synchronous hugepage collapse. */ # define MADV_HWPOISON 100 /* Poison a page for testing. */ +# define MADV_GUARD_INSTALL 102 /* Fatal signal on access to range */ +# define MADV_GUARD_REMOVE 103 /* Unguard range */ #endif /* The POSIX people had to invent similar names for the same things. */