From patchwork Wed Nov 13 19:28:22 2019
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Adhemerval Zanella Netto <adhemerval.zanella@linaro.org>
X-Patchwork-Id: 179345
Delivered-To: patch@linaro.org
Received: by 2002:a92:38d5:0:0:0:0:0 with SMTP id g82csp10039811ilf;
 Wed, 13 Nov 2019 11:29:12 -0800 (PST)
X-Google-Smtp-Source: APXvYqwd2a230HGlFaQArmXflo0nyQQJygcqrh0AlJPR3dM0/CsA9ZZcK99zs4OFa8OXw3gOm9HI
X-Received: by 2002:a50:d7c9:: with SMTP id m9mr5639415edj.93.1573673352727; 
 Wed, 13 Nov 2019 11:29:12 -0800 (PST)
ARC-Seal: i=1; a=rsa-sha256; t=1573673352; cv=none;
 d=google.com; s=arc-20160816;
 b=bO9MPVCE/0YW1f0xeUWDZGO51N9SJwheztaKkF+ftDegqnbmiysjJV6kRAMsfnDBaE
 lWcpQuXgWMKGFTwq/puLPj6Nrg1dHqVDjhxwxcPEPS+THQ/VDdJ+/1XI4yN/zNdfzcsY
 HY2vfhih57BUvqhhgFxv0ak+bTVNeS70YJO/cGNADPdFgWtE+ooz7KNtmpUlOLJmeYIy
 sapWiNKNjAreZO60GPXdrJMMBruwwrJPE09TJcPqGzfxWAPKTAzLpW/NK9ESazMMegdu
 VEUWOdlBGYy/T0kB4evbc4SbLhFHnGXyzn6EsjU/aN0AMnXgF0WQ9JsTK0cr+8nAzu1Z
 gb3Q==
ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com;
 s=arc-20160816; 
 h=references:in-reply-to:message-id:date:subject:cc:to:from
 :dkim-signature:delivered-to:sender:list-help:list-post:list-archive
 :list-subscribe:list-unsubscribe:list-id:precedence:mailing-list
 :dkim-signature:domainkey-signature;
 bh=lzgYtA+FhdhfIhqOOHETcOpnLox4aZ9XzGS+P3IFsVA=;
 b=WAkMlWx+DR5e8gm1owr5OVH1HaQ4Iy4E7KbBpN8y2YMj8so6+l+rFzin9dc8zDG8Y3
 hHL1tpbqwWNYKgctSpOox4qz5Emw8aJuTm9HxvzISIwI4zai0by6/GKqvmwMOoYchL/r
 hGJ1U1GWEDWxL9EqOWM1PjP76qm1PS8CeeBgPt8JkmFE8VYqNJthTkJi8pBjo6s3hevw
 KBoapgmv8/5d5+EbI87VrcVszGuXsKpg+SySNhjTeNrZFYdWkjSa0NJNYijdUvgu43db
 oEvdukGXJh+WhzD3W2PkPdak0KRj7SqfjmITARAuDD8jLInWaM9WSqvFNIFEXEx3uZcW
 2JKg==
ARC-Authentication-Results: i=1; mx.google.com;
 dkim=pass header.i=@sourceware.org header.s=default header.b=udFQ9ElT;
 dkim=pass header.i=@linaro.org header.s=google header.b=UDromiMV;
 spf=pass (google.com: domain of
 libc-alpha-return-107057-patch=linaro.org@sourceware.org
 designates 209.132.180.131 as permitted sender)
 smtp.mailfrom="libc-alpha-return-107057-patch=linaro.org@sourceware.org";
 dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=linaro.org
Return-Path: <libc-alpha-return-107057-patch=linaro.org@sourceware.org>
Received: from sourceware.org (server1.sourceware.org. [209.132.180.131])
 by mx.google.com with ESMTPS id
 i24si1937611edr.311.2019.11.13.11.29.11 for <patch@linaro.org>
 (version=TLS1_2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128);
 Wed, 13 Nov 2019 11:29:12 -0800 (PST)
Received-SPF: pass (google.com: domain of
 libc-alpha-return-107057-patch=linaro.org@sourceware.org
 designates 209.132.180.131 as permitted sender)
 client-ip=209.132.180.131; 
Authentication-Results: mx.google.com;
 dkim=pass header.i=@sourceware.org header.s=default header.b=udFQ9ElT;
 dkim=pass header.i=@linaro.org header.s=google header.b=UDromiMV;
 spf=pass (google.com: domain of
 libc-alpha-return-107057-patch=linaro.org@sourceware.org
 designates 209.132.180.131 as permitted sender)
 smtp.mailfrom="libc-alpha-return-107057-patch=linaro.org@sourceware.org";
 dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=linaro.org
DomainKey-Signature: a=rsa-sha1; c=nofws; d=sourceware.org; h=list-id
 :list-unsubscribe:list-subscribe:list-archive:list-post
 :list-help:sender:from:to:cc:subject:date:message-id:in-reply-to
 :references; q=dns; s=default; b=j79WEhyv/b7SrGqQ5lB0fFGdOjJF0cS
 VrP1Ql51vku/slavk/s7Ghrl9zk4qhFqcximttDJuP363Fs5/5QVrSAYosfAAGpQ
 Y75prBDJokBRQX1m5H5rKnThENVGBNb8x4G6cExowCzYy9+r/NfVHBwPnHi2MtaW
 D5UCV7o7rXZs=
DKIM-Signature: v=1; a=rsa-sha1; c=relaxed; d=sourceware.org; h=list-id
 :list-unsubscribe:list-subscribe:list-archive:list-post
 :list-help:sender:from:to:cc:subject:date:message-id:in-reply-to
 :references; s=default; bh=kOdhJ6Dqc+TfOoTvKyqRPAEteNU=; b=udFQ9
 ElTGXu6xWFWtJmQ6jIcsl/nVHexRQrhWn4M9ANAhE05etXyRAHZ0vr3gc6SNZDwJ
 bGvc+zHLr26tlam2PC7XWxa5AxzkNcXbkBaihVVnYv6lReGb7u1ahJnOR6Qc48F3
 rxPcGfdNmG96mDCyh8LvncYFGkTXCkSGTmAzdM=
Received: (qmail 72045 invoked by alias); 13 Nov 2019 19:28:50 -0000
Mailing-List: contact libc-alpha-help@sourceware.org; run by ezmlm
Precedence: bulk
List-Id: <libc-alpha.sourceware.org>
List-Unsubscribe: <mailto:libc-alpha-unsubscribe-patch=linaro.org@sourceware.org>
List-Subscribe: <mailto:libc-alpha-subscribe@sourceware.org>
List-Archive: <http://sourceware.org/ml/libc-alpha/>
List-Post: <mailto:libc-alpha@sourceware.org>
List-Help: <mailto:libc-alpha-help@sourceware.org>,
 <http://sourceware.org/ml/#faqs>
Sender: libc-alpha-owner@sourceware.org
Delivered-To: mailing list libc-alpha@sourceware.org
Received: (qmail 71938 invoked by uid 89); 13 Nov 2019 19:28:49 -0000
Authentication-Results: sourceware.org; auth=none
X-Spam-SWARE-Status: No, score=-21.2 required=5.0 tests=AWL, BAYES_00,
 GIT_PATCH_0, GIT_PATCH_1, GIT_PATCH_2, GIT_PATCH_3,
 KAM_ASCII_DIVIDERS, KAM_SHORT, RCVD_IN_DNSWL_NONE,
 SPF_PASS autolearn=ham version=3.3.1 spammy=m32,
 H*MI:sk:a9e59b0, H*f:sk:a9e59b0, H*i:sk:a9e59b0
X-HELO: mail-qk1-f179.google.com
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; 
 h=from:to:cc:subject:date:message-id:in-reply-to:references;
 bh=lzgYtA+FhdhfIhqOOHETcOpnLox4aZ9XzGS+P3IFsVA=;
 b=UDromiMV9zeiCM/PGxA5ppQmMG8s/yznTXR2cAT/eGANvY2fBrxNsjotrrLpUWGjHh
 9xkBcts5zHYoL5EGwRBaSU4dGJp4PI6czUBq0JeGr/PKZyeOEU8OQF2m9n7Mo45O7+QG
 5p/5hhhsE1pAUp2O4EnN8CElCjHdHZKez3ca6k+xtZ24HJ4xvXwjsBMS7yuNZKCLd9N6
 YRhII+qV8KPffsJ2CKHb1TE2v9kGhbe2wzBAdt7mPJAnIhriqKFS2hL3h91+amZVBinn
 FrS3i8VV7cKZ9ghuELoAakFqkWTazv8lDvoM8eUemm2rlTeASVmDIzL9h+NhnJH65xw7
 ViWQ==
Return-Path: <adhemerval.zanella@linaro.org>
From: Adhemerval Zanella <adhemerval.zanella@linaro.org>
To: libc-alpha@sourceware.org
Cc: Romain Naour <romain.naour@smile.fr>, Andreas Larsson <andreas@gaisler.com>
Subject: [PATCH 1/2] Remove 32 bit sparc v7 support
Date: Wed, 13 Nov 2019 16:28:22 -0300
Message-Id: <20191113192823.17109-1-adhemerval.zanella@linaro.org>
In-Reply-To: <a9e59b02-66e4-e81b-e6ec-90b9cc99b045@linaro.org>
References: <a9e59b02-66e4-e81b-e6ec-90b9cc99b045@linaro.org>

The patch is straighforward:

  - The sparc32 v8 implementations are moved as the generic ones.

  - A configure test is added to check for either __sparc_v8__ or
    __sparc_v9__.

  - The triple names are simplified and sparc implies in sparcv8.

The idea is to keep support on sparcv8 architectures that does support
CAS instructions, such as LEON3/LEON4.

Checked on a sparcv9-linux-gnu and sparc64-linux-gnu.
---
 scripts/build-many-glibcs.py             |   4 +-
 sysdeps/sparc/preconfigure               |  20 +-
 sysdeps/sparc/sparc32/Makefile           |  27 --
 sysdeps/sparc/sparc32/addmul_1.S         | 208 ++++++-------
 sysdeps/sparc/sparc32/configure          | 162 ++++++++++
 sysdeps/sparc/sparc32/configure.ac       |  13 +
 sysdeps/sparc/sparc32/divrem.m4          | 234 ---------------
 sysdeps/sparc/sparc32/dotmul.S           | 120 +-------
 sysdeps/sparc/sparc32/mul_1.S            | 244 +++++----------
 sysdeps/sparc/sparc32/rem.S              | 364 +---------------------
 sysdeps/sparc/sparc32/sdiv.S             | 365 +----------------------
 sysdeps/sparc/sparc32/sparcv8/Makefile   |   1 -
 sysdeps/sparc/sparc32/sparcv8/addmul_1.S | 118 --------
 sysdeps/sparc/sparc32/sparcv8/dotmul.S   |  13 -
 sysdeps/sparc/sparc32/sparcv8/mul_1.S    | 102 -------
 sysdeps/sparc/sparc32/sparcv8/rem.S      |  21 --
 sysdeps/sparc/sparc32/sparcv8/sdiv.S     |  20 --
 sysdeps/sparc/sparc32/sparcv8/submul_1.S |  57 ----
 sysdeps/sparc/sparc32/sparcv8/udiv.S     |  16 -
 sysdeps/sparc/sparc32/sparcv8/umul.S     |  13 -
 sysdeps/sparc/sparc32/sparcv8/urem.S     |  18 --
 sysdeps/sparc/sparc32/submul_1.S         | 143 ++-------
 sysdeps/sparc/sparc32/udiv.S             | 341 +--------------------
 sysdeps/sparc/sparc32/umul.S             | 148 +--------
 sysdeps/sparc/sparc32/urem.S             | 344 +--------------------
 25 files changed, 413 insertions(+), 2703 deletions(-)
 create mode 100644 sysdeps/sparc/sparc32/configure
 create mode 100644 sysdeps/sparc/sparc32/configure.ac
 delete mode 100644 sysdeps/sparc/sparc32/divrem.m4
 delete mode 100644 sysdeps/sparc/sparc32/sparcv8/Makefile
 delete mode 100644 sysdeps/sparc/sparc32/sparcv8/addmul_1.S
 delete mode 100644 sysdeps/sparc/sparc32/sparcv8/dotmul.S
 delete mode 100644 sysdeps/sparc/sparc32/sparcv8/mul_1.S
 delete mode 100644 sysdeps/sparc/sparc32/sparcv8/rem.S
 delete mode 100644 sysdeps/sparc/sparc32/sparcv8/sdiv.S
 delete mode 100644 sysdeps/sparc/sparc32/sparcv8/submul_1.S
 delete mode 100644 sysdeps/sparc/sparc32/sparcv8/udiv.S
 delete mode 100644 sysdeps/sparc/sparc32/sparcv8/umul.S
 delete mode 100644 sysdeps/sparc/sparc32/sparcv8/urem.S

-- 
2.17.1

diff --git a/scripts/build-many-glibcs.py b/scripts/build-many-glibcs.py
index 4b3c4bfb76..d4085d23dd 100755
--- a/scripts/build-many-glibcs.py
+++ b/scripts/build-many-glibcs.py
@@ -358,8 +358,10 @@ class Context(object):
         self.add_config(arch='sparc64',
                         os_name='linux-gnu',
                         glibcs=[{},
+                                {'arch': 'sparcv8',
+                                 'ccopts': '-m32 -mlong-double-128 -mcpu=leon3'}],
                                 {'arch': 'sparcv9',
-                                 'ccopts': '-m32 -mlong-double-128'}],
+                                 'ccopts': '-m32 -mlong-double-128 -mcpu=v9'}],
                         extra_glibcs=[{'variant': 'disable-multi-arch',
                                        'cfg': ['--disable-multi-arch']},
                                       {'variant': 'disable-multi-arch',
diff --git a/sysdeps/sparc/preconfigure b/sysdeps/sparc/preconfigure
index de86749573..3aa3df4ee3 100644
--- a/sysdeps/sparc/preconfigure
+++ b/sysdeps/sparc/preconfigure
@@ -1,24 +1,10 @@
 # preconfigure fragment for sparc.
 
 case "$machine" in
-sparc | sparcv[67])
+sparc | sparcv8 | superspac | hyperspace)
 		base_machine=sparc machine=sparc/sparc32 ;;
-sparcv8 | supersparc | hypersparc)
-		base_machine=sparc machine=sparc/sparc32/sparcv8 ;;
-sparcv8plus | sparcv8plusa | sparcv9)
+sparcv8plus | sparcv8plus* | sparcv9* )
 		base_machine=sparc machine=sparc/sparc32/sparcv9 ;;
-sparcv8plusb | sparcv9b)
-		base_machine=sparc machine=sparc/sparc32/sparcv9/sparcv9b ;;
-sparcv9v)
-		base_machine=sparc machine=sparc/sparc32/sparcv9/sparcv9v ;;
-sparcv9v2)
-		base_machine=sparc machine=sparc/sparc32/sparcv9/sparcv9v2 ;;
-sparc64)
+sparc64*)
 		base_machine=sparc machine=sparc/sparc64 ;;
-sparc64b)
-		base_machine=sparc machine=sparc/sparc64/sparcv9b ;;
-sparc64v)
-		base_machine=sparc machine=sparc/sparc64/sparcv9v ;;
-sparc64v2)
-		base_machine=sparc machine=sparc/sparc64/sparcv9v2 ;;
 esac
diff --git a/sysdeps/sparc/sparc32/Makefile b/sysdeps/sparc/sparc32/Makefile
index fe40f5cbbe..008cb81548 100644
--- a/sysdeps/sparc/sparc32/Makefile
+++ b/sysdeps/sparc/sparc32/Makefile
@@ -19,35 +19,8 @@ ifeq ($(subdir),gnulib)
 sysdep_routines = dotmul umul $(divrem) alloca
 endif	# gnulib
 
-# We distribute these files, even though they are generated,
-# so as to avoid the need for a functioning m4 to build the library.
 divrem := sdiv udiv rem urem
 
-+divrem-NAME-sdiv := div
-+divrem-NAME-udiv := udiv
-+divrem-NAME-rem := rem
-+divrem-NAME-urem := urem
-+divrem-NAME = $(+divrem-NAME-$(basename $(notdir $@)))
-+divrem-OP-div := div
-+divrem-OP-udiv := div
-+divrem-OP-rem := rem
-+divrem-OP-urem := rem
-+divrem-S-div := true
-+divrem-S-rem := true
-+divrem-S-udiv := false
-+divrem-S-urem := false
-$(divrem:%=$(sysdep_dir)/sparc/sparc32/%.S): $(sysdep_dir)/sparc/sparc32/divrem.m4
-	(echo -n "define(NAME,\`.$(+divrem-NAME)')"; \
-	 echo -n " define(OP,\`$(+divrem-OP-$(+divrem-NAME))')"; \
-	 echo -n " define(S,\`$(+divrem-S-$(+divrem-NAME))')"; \
-	 echo " /* This file is generated from divrem.m4; DO NOT EDIT! */"; \
-	 cat $<) | $(M4) > $@-tmp
-# Make it unwritable so noone will edit it by mistake.
-	-chmod a-w $@-tmp
-	mv -f $@-tmp $@
-
-sysdep-realclean := $(sysdep-realclean) $(divrem:%=sysdeps/sparc/sparc32/%.S)
-
 # libgcc __divdi3 and __moddi3 uses .udiv and since it is also exported by
 # libc.so linker will create PLTs for the symbol.  To avoid it we strong alias
 # the exported libc one to __wrap_.udiv and use linker option --wrap to make any
diff --git a/sysdeps/sparc/sparc32/addmul_1.S b/sysdeps/sparc/sparc32/addmul_1.S
index 1028f19b54..4708393aa5 100644
--- a/sysdeps/sparc/sparc32/addmul_1.S
+++ b/sysdeps/sparc/sparc32/addmul_1.S
@@ -1,146 +1,118 @@
-! SPARC __mpn_addmul_1 -- Multiply a limb vector with a limb and add
-! the result to a second limb vector.
-!
+! SPARC v8 __mpn_addmul_1 -- Multiply a limb vector with a limb and
+! add the result to a second limb vector.
+
 ! Copyright (C) 1992-2019 Free Software Foundation, Inc.
-!
+
 ! This file is part of the GNU MP Library.
-!
+
 ! The GNU MP Library is free software; you can redistribute it and/or modify
 ! it under the terms of the GNU Lesser General Public License as published by
 ! the Free Software Foundation; either version 2.1 of the License, or (at your
 ! option) any later version.
-!
+
 ! The GNU MP Library is distributed in the hope that it will be useful, but
 ! WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY
 ! or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU Lesser General Public
 ! License for more details.
-!
+
 ! You should have received a copy of the GNU Lesser General Public License
 ! along with the GNU MP Library; see the file COPYING.LIB.  If not,
 ! see <https://www.gnu.org/licenses/>.
 
 
 ! INPUT PARAMETERS
-! RES_PTR	o0
-! S1_PTR	o1
-! SIZE		o2
-! S2_LIMB	o3
+! res_ptr	o0
+! s1_ptr	o1
+! size		o2
+! s2_limb	o3
 
 #include <sysdep.h>
 
 ENTRY(__mpn_addmul_1)
-	! Make S1_PTR and RES_PTR point at the end of their blocks
-	! and put (- 4 x SIZE) in index/loop counter.
-	sll	%o2,2,%o2
-	add	%o0,%o2,%o4	! RES_PTR in o4 since o0 is retval
-	add	%o1,%o2,%o1
-	sub	%g0,%o2,%o2
+	ld	[%o1+0],%o4	! 1
+	sll	%o2,4,%g1
+	orcc	%g0,%g0,%g2
+	mov	%o7,%g4			! Save return address register
+	and	%g1,(4-1)<<4,%g1
+1:	call	2f
+	 add	%o7,3f-1b,%g3
+2:	jmp	%g3+%g1
+	 mov	%g4,%o7			! Restore return address register
 
-	cmp	%o3,0xfff
-	bgu	LOC(large)
+	.align	4
+3:
+LOC(00):
+	add	%o0,-4,%o0
+	b	LOC(loop00)		/* 4, 8, 12, ... */
+	 add	%o1,-4,%o1
+	nop
+LOC(01):
+	b	LOC(loop01)		/* 1, 5, 9, ... */
+	 nop
+	nop
+	nop
+LOC(10):
+	add	%o0,-12,%o0	/* 2, 6, 10, ... */
+	b	LOC(loop10)
+	 add	%o1,4,%o1
+	nop
+LOC(11):
+	add	%o0,-8,%o0	/* 3, 7, 11, ... */
+	b	LOC(loop11)
+	 add	%o1,-8,%o1
 	nop
 
-	ld	[%o1+%o2],%o5
-	mov	0,%o0
-	b	LOC(0)
-	 add	%o4,-4,%o4
-LOC(loop0):
-	addcc	%o5,%g1,%g1
-	ld	[%o1+%o2],%o5
-	addx	%o0,%g0,%o0
-	st	%g1,[%o4+%o2]
-LOC(0):	wr	%g0,%o3,%y
-	sra	%o5,31,%g2
-	and	%o3,%g2,%g2
-	andcc	%g1,0,%g1
-	mulscc	%g1,%o5,%g1
- 	mulscc	%g1,%o5,%g1
- 	mulscc	%g1,%o5,%g1
- 	mulscc	%g1,%o5,%g1
-	mulscc	%g1,%o5,%g1
-	mulscc	%g1,%o5,%g1
-	mulscc	%g1,%o5,%g1
-	mulscc	%g1,%o5,%g1
-	mulscc	%g1,%o5,%g1
-	mulscc	%g1,%o5,%g1
-	mulscc	%g1,%o5,%g1
-	mulscc	%g1,%o5,%g1
-	mulscc	%g1,0,%g1
-	sra	%g1,20,%g4
-	sll	%g1,12,%g1
- 	rd	%y,%g3
-	srl	%g3,20,%g3
-	or	%g1,%g3,%g1
-
-	addcc	%g1,%o0,%g1
-	addx	%g2,%g4,%o0	! add sign-compensation and cy to hi limb
-	addcc	%o2,4,%o2	! loop counter
-	bne	LOC(loop0)
-	 ld	[%o4+%o2],%o5
-
-	addcc	%o5,%g1,%g1
-	addx	%o0,%g0,%o0
-	retl
-	st	%g1,[%o4+%o2]
-
-
-LOC(large):
-	ld	[%o1+%o2],%o5
-	mov	0,%o0
-	sra	%o3,31,%g4	! g4 = mask of ones iff S2_LIMB < 0
-	b	LOC(1)
-	 add	%o4,-4,%o4
 LOC(loop):
-	addcc	%o5,%g3,%g3
-	ld	[%o1+%o2],%o5
-	addx	%o0,%g0,%o0
-	st	%g3,[%o4+%o2]
-LOC(1):	wr	%g0,%o5,%y
-	and	%o5,%g4,%g2
-	andcc	%g0,%g0,%g1
-	mulscc	%g1,%o3,%g1
-	mulscc	%g1,%o3,%g1
-	mulscc	%g1,%o3,%g1
-	mulscc	%g1,%o3,%g1
-	mulscc	%g1,%o3,%g1
-	mulscc	%g1,%o3,%g1
-	mulscc	%g1,%o3,%g1
-	mulscc	%g1,%o3,%g1
-	mulscc	%g1,%o3,%g1
-	mulscc	%g1,%o3,%g1
-	mulscc	%g1,%o3,%g1
-	mulscc	%g1,%o3,%g1
-	mulscc	%g1,%o3,%g1
-	mulscc	%g1,%o3,%g1
-	mulscc	%g1,%o3,%g1
-	mulscc	%g1,%o3,%g1
-	mulscc	%g1,%o3,%g1
-	mulscc	%g1,%o3,%g1
-	mulscc	%g1,%o3,%g1
-	mulscc	%g1,%o3,%g1
-	mulscc	%g1,%o3,%g1
-	mulscc	%g1,%o3,%g1
-	mulscc	%g1,%o3,%g1
-	mulscc	%g1,%o3,%g1
-	mulscc	%g1,%o3,%g1
-	mulscc	%g1,%o3,%g1
-	mulscc	%g1,%o3,%g1
-	mulscc	%g1,%o3,%g1
-	mulscc	%g1,%o3,%g1
-	mulscc	%g1,%o3,%g1
-	mulscc	%g1,%o3,%g1
-	mulscc	%g1,%o3,%g1
-	mulscc	%g1,%g0,%g1
-	rd	%y,%g3
-	addcc	%g3,%o0,%g3
-	addx	%g2,%g1,%o0
-	addcc	%o2,4,%o2
-	bne	LOC(loop)
-	 ld	[%o4+%o2],%o5
+	addcc	%g3,%g2,%g3	! 1
+	ld	[%o1+4],%o4	! 2
+	rd	%y,%g2		! 1
+	addx	%g0,%g2,%g2
+	ld	[%o0+0],%g1	! 2
+	addcc	%g1,%g3,%g3
+	st	%g3,[%o0+0]	! 1
+LOC(loop00):
+	umul	%o4,%o3,%g3	! 2
+	ld	[%o0+4],%g1	! 2
+	addxcc	%g3,%g2,%g3	! 2
+	ld	[%o1+8],%o4	! 3
+	rd	%y,%g2		! 2
+	addx	%g0,%g2,%g2
+	nop
+	addcc	%g1,%g3,%g3
+	st	%g3,[%o0+4]	! 2
+LOC(loop11):
+	umul	%o4,%o3,%g3	! 3
+	addxcc	%g3,%g2,%g3	! 3
+	ld	[%o1+12],%o4	! 4
+	rd	%y,%g2		! 3
+	add	%o1,16,%o1
+	addx	%g0,%g2,%g2
+	ld	[%o0+8],%g1	! 2
+	addcc	%g1,%g3,%g3
+	st	%g3,[%o0+8]	! 3
+LOC(loop10):
+	umul	%o4,%o3,%g3	! 4
+	addxcc	%g3,%g2,%g3	! 4
+	ld	[%o1+0],%o4	! 1
+	rd	%y,%g2		! 4
+	addx	%g0,%g2,%g2
+	ld	[%o0+12],%g1	! 2
+	addcc	%g1,%g3,%g3
+	st	%g3,[%o0+12]	! 4
+	add	%o0,16,%o0
+	addx	%g0,%g2,%g2
+LOC(loop01):
+	addcc	%o2,-4,%o2
+	bg	LOC(loop)
+	 umul	%o4,%o3,%g3	! 1
 
-	addcc	%o5,%g3,%g3
-	addx	%o0,%g0,%o0
+	addcc	%g3,%g2,%g3	! 4
+	rd	%y,%g2		! 4
+	addx	%g0,%g2,%g2
+	ld	[%o0+0],%g1	! 2
+	addcc	%g1,%g3,%g3
+	st	%g3,[%o0+0]	! 4
 	retl
-	st	%g3,[%o4+%o2]
+	 addx	%g0,%g2,%o0
 
 END(__mpn_addmul_1)
diff --git a/sysdeps/sparc/sparc32/configure b/sysdeps/sparc/sparc32/configure
new file mode 100644
index 0000000000..337531fe94
--- /dev/null
+++ b/sysdeps/sparc/sparc32/configure
@@ -0,0 +1,162 @@
+# This file is generated from configure.ac by Autoconf.  DO NOT EDIT!
+ # Local configure fragment for sysdeps/sparc/sparc32
+
+# Test if compiler targets at least sparcv8.
+
+
+{ $as_echo "$as_me:${as_lineno-$LINENO}: checking for grep that handles long lines and -e" >&5
+$as_echo_n "checking for grep that handles long lines and -e... " >&6; }
+if ${ac_cv_path_GREP+:} false; then :
+  $as_echo_n "(cached) " >&6
+else
+  if test -z "$GREP"; then
+  ac_path_GREP_found=false
+  # Loop through the user's path and test for each of PROGNAME-LIST
+  as_save_IFS=$IFS; IFS=$PATH_SEPARATOR
+for as_dir in $PATH$PATH_SEPARATOR/usr/xpg4/bin
+do
+  IFS=$as_save_IFS
+  test -z "$as_dir" && as_dir=.
+    for ac_prog in grep ggrep; do
+    for ac_exec_ext in '' $ac_executable_extensions; do
+      ac_path_GREP="$as_dir/$ac_prog$ac_exec_ext"
+      as_fn_executable_p "$ac_path_GREP" || continue
+# Check for GNU ac_path_GREP and select it if it is found.
+  # Check for GNU $ac_path_GREP
+case `"$ac_path_GREP" --version 2>&1` in
+*GNU*)
+  ac_cv_path_GREP="$ac_path_GREP" ac_path_GREP_found=:;;
+*)
+  ac_count=0
+  $as_echo_n 0123456789 >"conftest.in"
+  while :
+  do
+    cat "conftest.in" "conftest.in" >"conftest.tmp"
+    mv "conftest.tmp" "conftest.in"
+    cp "conftest.in" "conftest.nl"
+    $as_echo 'GREP' >> "conftest.nl"
+    "$ac_path_GREP" -e 'GREP$' -e '-(cannot match)-' < "conftest.nl" >"conftest.out" 2>/dev/null || break
+    diff "conftest.out" "conftest.nl" >/dev/null 2>&1 || break
+    as_fn_arith $ac_count + 1 && ac_count=$as_val
+    if test $ac_count -gt ${ac_path_GREP_max-0}; then
+      # Best one so far, save it but keep looking for a better one
+      ac_cv_path_GREP="$ac_path_GREP"
+      ac_path_GREP_max=$ac_count
+    fi
+    # 10*(2^10) chars as input seems more than enough
+    test $ac_count -gt 10 && break
+  done
+  rm -f conftest.in conftest.tmp conftest.nl conftest.out;;
+esac
+
+      $ac_path_GREP_found && break 3
+    done
+  done
+  done
+IFS=$as_save_IFS
+  if test -z "$ac_cv_path_GREP"; then
+    as_fn_error $? "no acceptable grep could be found in $PATH$PATH_SEPARATOR/usr/xpg4/bin" "$LINENO" 5
+  fi
+else
+  ac_cv_path_GREP=$GREP
+fi
+
+fi
+{ $as_echo "$as_me:${as_lineno-$LINENO}: result: $ac_cv_path_GREP" >&5
+$as_echo "$ac_cv_path_GREP" >&6; }
+ GREP="$ac_cv_path_GREP"
+
+
+{ $as_echo "$as_me:${as_lineno-$LINENO}: checking for egrep" >&5
+$as_echo_n "checking for egrep... " >&6; }
+if ${ac_cv_path_EGREP+:} false; then :
+  $as_echo_n "(cached) " >&6
+else
+  if echo a | $GREP -E '(a|b)' >/dev/null 2>&1
+   then ac_cv_path_EGREP="$GREP -E"
+   else
+     if test -z "$EGREP"; then
+  ac_path_EGREP_found=false
+  # Loop through the user's path and test for each of PROGNAME-LIST
+  as_save_IFS=$IFS; IFS=$PATH_SEPARATOR
+for as_dir in $PATH$PATH_SEPARATOR/usr/xpg4/bin
+do
+  IFS=$as_save_IFS
+  test -z "$as_dir" && as_dir=.
+    for ac_prog in egrep; do
+    for ac_exec_ext in '' $ac_executable_extensions; do
+      ac_path_EGREP="$as_dir/$ac_prog$ac_exec_ext"
+      as_fn_executable_p "$ac_path_EGREP" || continue
+# Check for GNU ac_path_EGREP and select it if it is found.
+  # Check for GNU $ac_path_EGREP
+case `"$ac_path_EGREP" --version 2>&1` in
+*GNU*)
+  ac_cv_path_EGREP="$ac_path_EGREP" ac_path_EGREP_found=:;;
+*)
+  ac_count=0
+  $as_echo_n 0123456789 >"conftest.in"
+  while :
+  do
+    cat "conftest.in" "conftest.in" >"conftest.tmp"
+    mv "conftest.tmp" "conftest.in"
+    cp "conftest.in" "conftest.nl"
+    $as_echo 'EGREP' >> "conftest.nl"
+    "$ac_path_EGREP" 'EGREP$' < "conftest.nl" >"conftest.out" 2>/dev/null || break
+    diff "conftest.out" "conftest.nl" >/dev/null 2>&1 || break
+    as_fn_arith $ac_count + 1 && ac_count=$as_val
+    if test $ac_count -gt ${ac_path_EGREP_max-0}; then
+      # Best one so far, save it but keep looking for a better one
+      ac_cv_path_EGREP="$ac_path_EGREP"
+      ac_path_EGREP_max=$ac_count
+    fi
+    # 10*(2^10) chars as input seems more than enough
+    test $ac_count -gt 10 && break
+  done
+  rm -f conftest.in conftest.tmp conftest.nl conftest.out;;
+esac
+
+      $ac_path_EGREP_found && break 3
+    done
+  done
+  done
+IFS=$as_save_IFS
+  if test -z "$ac_cv_path_EGREP"; then
+    as_fn_error $? "no acceptable egrep could be found in $PATH$PATH_SEPARATOR/usr/xpg4/bin" "$LINENO" 5
+  fi
+else
+  ac_cv_path_EGREP=$EGREP
+fi
+
+   fi
+fi
+{ $as_echo "$as_me:${as_lineno-$LINENO}: result: $ac_cv_path_EGREP" >&5
+$as_echo "$ac_cv_path_EGREP" >&6; }
+ EGREP="$ac_cv_path_EGREP"
+
+
+{ $as_echo "$as_me:${as_lineno-$LINENO}: checking for at least sparcv8 support" >&5
+$as_echo_n "checking for at least sparcv8 support... " >&6; }
+if ${libc_cv_sparcv8+:} false; then :
+  $as_echo_n "(cached) " >&6
+else
+  cat confdefs.h - <<_ACEOF >conftest.$ac_ext
+/* end confdefs.h.  */
+#if defined (__sparc_v8__) || defined (__sparc_v9__)
+                      yes
+                     #endif
+
+_ACEOF
+if (eval "$ac_cpp conftest.$ac_ext") 2>&5 |
+  $EGREP "yes" >/dev/null 2>&1; then :
+  libc_cv_sparcv8=yes
+else
+  libc_sparcv8=no
+fi
+rm -f conftest*
+
+fi
+{ $as_echo "$as_me:${as_lineno-$LINENO}: result: $libc_cv_sparcv8" >&5
+$as_echo "$libc_cv_sparcv8" >&6; }
+if test $libc_cv_sparcv8 = no; then
+  as_fn_error $? "no support for pre-v8 sparc" "$LINENO" 5
+fi
diff --git a/sysdeps/sparc/sparc32/configure.ac b/sysdeps/sparc/sparc32/configure.ac
new file mode 100644
index 0000000000..99229bfadb
--- /dev/null
+++ b/sysdeps/sparc/sparc32/configure.ac
@@ -0,0 +1,13 @@
+GLIBC_PROVIDES dnl See aclocal.m4 in the top level source directory.
+# Local configure fragment for sysdeps/sparc/sparc32
+
+# Test if compiler targets at least sparcv8.
+AC_CACHE_CHECK([for at least sparcv8 support],
+  [libc_cv_sparcv8],
+  [AC_EGREP_CPP(yes,[#if defined (__sparc_v8__) || defined (__sparc_v9__)
+                      yes
+                     #endif
+  ], libc_cv_sparcv8=yes, libc_sparcv8=no)])
+if test $libc_cv_sparcv8 = no; then
+  AC_MSG_ERROR([no support for pre-v8 sparc])
+fi
diff --git a/sysdeps/sparc/sparc32/divrem.m4 b/sysdeps/sparc/sparc32/divrem.m4
deleted file mode 100644
index c08c530020..0000000000
--- a/sysdeps/sparc/sparc32/divrem.m4
+++ /dev/null
@@ -1,234 +0,0 @@
-/*
- * Division and remainder, from Appendix E of the Sparc Version 8
- * Architecture Manual, with fixes from Gordon Irlam.
- */
-
-/*
- * Input: dividend and divisor in %o0 and %o1 respectively.
- *
- * m4 parameters:
- *  NAME	name of function to generate
- *  OP		OP=div => %o0 / %o1; OP=rem => %o0 % %o1
- *  S		S=true => signed; S=false => unsigned
- *
- * Algorithm parameters:
- *  N		how many bits per iteration we try to get (4)
- *  WORDSIZE	total number of bits (32)
- *
- * Derived constants:
- *  TOPBITS	number of bits in the top `decade' of a number
- *
- * Important variables:
- *  Q		the partial quotient under development (initially 0)
- *  R		the remainder so far, initially the dividend
- *  ITER	number of main division loop iterations required;
- *		equal to ceil(log2(quotient) / N).  Note that this
- *		is the log base (2^N) of the quotient.
- *  V		the current comparand, initially divisor*2^(ITER*N-1)
- *
- * Cost:
- *  Current estimate for non-large dividend is
- *	ceil(log2(quotient) / N) * (10 + 7N/2) + C
- *  A large dividend is one greater than 2^(31-TOPBITS) and takes a
- *  different path, as the upper bits of the quotient must be developed
- *  one bit at a time.
- */
-
-define(N, `4')dnl
-define(WORDSIZE, `32')dnl
-define(TOPBITS, eval(WORDSIZE - N*((WORDSIZE-1)/N)))dnl
-dnl
-define(dividend, `%o0')dnl
-define(divisor, `%o1')dnl
-define(Q, `%o2')dnl
-define(R, `%o3')dnl
-define(ITER, `%o4')dnl
-define(V, `%o5')dnl
-dnl
-dnl m4 reminder: ifelse(a,b,c,d) => if a is b, then c, else d
-define(T, `%g1')dnl
-define(SC, `%g2')dnl
-ifelse(S, `true', `define(SIGN, `%g3')')dnl
-
-dnl
-dnl This is the recursive definition for developing quotient digits.
-dnl
-dnl Parameters:
-dnl  $1	the current depth, 1 <= $1 <= N
-dnl  $2	the current accumulation of quotient bits
-dnl  N	max depth
-dnl
-dnl We add a new bit to $2 and either recurse or insert the bits in
-dnl the quotient.  R, Q, and V are inputs and outputs as defined above;
-dnl the condition codes are expected to reflect the input R, and are
-dnl modified to reflect the output R.
-dnl
-define(DEVELOP_QUOTIENT_BITS,
-`	! depth $1, accumulated bits $2
-	bl	LOC($1.eval(2**N+$2))
-	srl	V,1,V
-	! remainder is positive
-	subcc	R,V,R
-	ifelse($1, N,
-	`	b	9f
-		add	Q, ($2*2+1), Q
-', `	DEVELOP_QUOTIENT_BITS(incr($1), `eval(2*$2+1)')')
-LOC($1.eval(2**N+$2)):
-	! remainder is negative
-	addcc	R,V,R
-	ifelse($1, N,
-	`	b	9f
-		add	Q, ($2*2-1), Q
-', `	DEVELOP_QUOTIENT_BITS(incr($1), `eval(2*$2-1)')')
-ifelse($1, 1, `9:')')dnl
-
-#include <sysdep.h>
-#include <sys/trap.h>
-
-ENTRY(NAME)
-ifelse(S, `true',
-`	! compute sign of result; if neither is negative, no problem
-	orcc	divisor, dividend, %g0	! either negative?
-	bge	2f			! no, go do the divide
-ifelse(OP, `div',
-`	xor	divisor, dividend, SIGN	! compute sign in any case',
-`	mov	dividend, SIGN		! sign of remainder matches dividend')
-	tst	divisor
-	bge	1f
-	tst	dividend
-	! divisor is definitely negative; dividend might also be negative
-	bge	2f			! if dividend not negative...
-	sub	%g0, divisor, divisor	! in any case, make divisor nonneg
-1:	! dividend is negative, divisor is nonnegative
-	sub	%g0, dividend, dividend	! make dividend nonnegative
-2:
-')
-	! Ready to divide.  Compute size of quotient; scale comparand.
-	orcc	divisor, %g0, V
-	bne	1f
-	mov	dividend, R
-
-		! Divide by zero trap.  If it returns, return 0 (about as
-		! wrong as possible, but that is what SunOS does...).
-		ta	ST_DIV0
-		retl
-		clr	%o0
-
-1:
-	cmp	R, V			! if divisor exceeds dividend, done
-	blu	LOC(got_result)		! (and algorithm fails otherwise)
-	clr	Q
-	sethi	%hi(1 << (WORDSIZE - TOPBITS - 1)), T
-	cmp	R, T
-	blu	LOC(not_really_big)
-	clr	ITER
-
-	! `Here the dividend is >= 2**(31-N) or so.  We must be careful here,
-	! as our usual N-at-a-shot divide step will cause overflow and havoc.
-	! The number of bits in the result here is N*ITER+SC, where SC <= N.
-	! Compute ITER in an unorthodox manner: know we need to shift V into
-	! the top decade: so do not even bother to compare to R.'
-	1:
-		cmp	V, T
-		bgeu	3f
-		mov	1, SC
-		sll	V, N, V
-		b	1b
-		add	ITER, 1, ITER
-
-	! Now compute SC.
-	2:	addcc	V, V, V
-		bcc	LOC(not_too_big)
-		add	SC, 1, SC
-
-		! We get here if the divisor overflowed while shifting.
-		! This means that R has the high-order bit set.
-		! Restore V and subtract from R.
-		sll	T, TOPBITS, T	! high order bit
-		srl	V, 1, V		! rest of V
-		add	V, T, V
-		b	LOC(do_single_div)
-		sub	SC, 1, SC
-
-	LOC(not_too_big):
-	3:	cmp	V, R
-		blu	2b
-		nop
-		be	LOC(do_single_div)
-		nop
-	/* NB: these are commented out in the V8-Sparc manual as well */
-	/* (I do not understand this) */
-	! V > R: went too far: back up 1 step
-	!	srl	V, 1, V
-	!	dec	SC
-	! do single-bit divide steps
-	!
-	! We have to be careful here.  We know that R >= V, so we can do the
-	! first divide step without thinking.  BUT, the others are conditional,
-	! and are only done if R >= 0.  Because both R and V may have the high-
-	! order bit set in the first step, just falling into the regular
-	! division loop will mess up the first time around.
-	! So we unroll slightly...
-	LOC(do_single_div):
-		subcc	SC, 1, SC
-		bl	LOC(end_regular_divide)
-		nop
-		sub	R, V, R
-		mov	1, Q
-		b	LOC(end_single_divloop)
-		nop
-	LOC(single_divloop):
-		sll	Q, 1, Q
-		bl	1f
-		srl	V, 1, V
-		! R >= 0
-		sub	R, V, R
-		b	2f
-		add	Q, 1, Q
-	1:	! R < 0
-		add	R, V, R
-		sub	Q, 1, Q
-	2:
-	LOC(end_single_divloop):
-		subcc	SC, 1, SC
-		bge	LOC(single_divloop)
-		tst	R
-		b,a	LOC(end_regular_divide)
-
-LOC(not_really_big):
-1:
-	sll	V, N, V
-	cmp	V, R
-	bleu	1b
-	addcc	ITER, 1, ITER
-	be	LOC(got_result)
-	sub	ITER, 1, ITER
-
-	tst	R	! set up for initial iteration
-LOC(divloop):
-	sll	Q, N, Q
-	DEVELOP_QUOTIENT_BITS(1, 0)
-LOC(end_regular_divide):
-	subcc	ITER, 1, ITER
-	bge	LOC(divloop)
-	tst	R
-	bl,a	LOC(got_result)
-	! non-restoring fixup here (one instruction only!)
-ifelse(OP, `div',
-`	sub	Q, 1, Q
-', `	add	R, divisor, R
-')
-
-LOC(got_result):
-ifelse(S, `true',
-`	! check to see if answer should be < 0
-	tst	SIGN
-	bl,a	1f
-	ifelse(OP, `div', `sub %g0, Q, Q', `sub %g0, R, R')
-1:')
-	retl
-	ifelse(OP, `div', `mov Q, %o0', `mov R, %o0')
-
-END(NAME)
-ifelse(OP, `div', ifelse(S, `false', `strong_alias (.udiv, __wrap_.udiv)
-'))dnl
diff --git a/sysdeps/sparc/sparc32/dotmul.S b/sysdeps/sparc/sparc32/dotmul.S
index d497ca672d..9b20cc3684 100644
--- a/sysdeps/sparc/sparc32/dotmul.S
+++ b/sysdeps/sparc/sparc32/dotmul.S
@@ -1,127 +1,13 @@
 /*
- * Signed multiply, from Appendix E of the Sparc Version 8
- * Architecture Manual.
- */
-
-/*
- * Returns %o0 * %o1 in %o1%o0 (i.e., %o1 holds the upper 32 bits of
- * the 64-bit product).
- *
- * This code optimizes short (less than 13-bit) multiplies.
+ * Sparc v8 has multiply.
  */
 
 #include <sysdep.h>
 
-
 ENTRY(.mul)
-	mov	%o0, %y		! multiplier -> Y
-	andncc	%o0, 0xfff, %g0	! test bits 12..31
-	be	LOC(mul_shortway)	! if zero, can do it the short way
-	andcc	%g0, %g0, %o4	! zero the partial product and clear N and V
-
-	/*
-	 * Long multiply.  32 steps, followed by a final shift step.
-	 */
-	mulscc	%o4, %o1, %o4	! 1
-	mulscc	%o4, %o1, %o4	! 2
-	mulscc	%o4, %o1, %o4	! 3
-	mulscc	%o4, %o1, %o4	! 4
-	mulscc	%o4, %o1, %o4	! 5
-	mulscc	%o4, %o1, %o4	! 6
-	mulscc	%o4, %o1, %o4	! 7
-	mulscc	%o4, %o1, %o4	! 8
-	mulscc	%o4, %o1, %o4	! 9
-	mulscc	%o4, %o1, %o4	! 10
-	mulscc	%o4, %o1, %o4	! 11
-	mulscc	%o4, %o1, %o4	! 12
-	mulscc	%o4, %o1, %o4	! 13
-	mulscc	%o4, %o1, %o4	! 14
-	mulscc	%o4, %o1, %o4	! 15
-	mulscc	%o4, %o1, %o4	! 16
-	mulscc	%o4, %o1, %o4	! 17
-	mulscc	%o4, %o1, %o4	! 18
-	mulscc	%o4, %o1, %o4	! 19
-	mulscc	%o4, %o1, %o4	! 20
-	mulscc	%o4, %o1, %o4	! 21
-	mulscc	%o4, %o1, %o4	! 22
-	mulscc	%o4, %o1, %o4	! 23
-	mulscc	%o4, %o1, %o4	! 24
-	mulscc	%o4, %o1, %o4	! 25
-	mulscc	%o4, %o1, %o4	! 26
-	mulscc	%o4, %o1, %o4	! 27
-	mulscc	%o4, %o1, %o4	! 28
-	mulscc	%o4, %o1, %o4	! 29
-	mulscc	%o4, %o1, %o4	! 30
-	mulscc	%o4, %o1, %o4	! 31
-	mulscc	%o4, %o1, %o4	! 32
-	mulscc	%o4, %g0, %o4	! final shift
-
-	! If %o0 was negative, the result is
-	!	(%o0 * %o1) + (%o1 << 32))
-	! We fix that here.
-
-#if 0
-	tst	%o0
-	bge	1f
-	rd	%y, %o0
-
-	! %o0 was indeed negative; fix upper 32 bits of result by subtracting
-	! %o1 (i.e., return %o4 - %o1 in %o1).
-	retl
-	sub	%o4, %o1, %o1
-
-1:
-	retl
-	mov	%o4, %o1
-#else
-	/* Faster code adapted from tege@sics.se's code for umul.S.  */
-	sra	%o0, 31, %o2	! make mask from sign bit
-	and	%o1, %o2, %o2	! %o2 = 0 or %o1, depending on sign of %o0
-	rd	%y, %o0		! get lower half of product
-	retl
-	sub	%o4, %o2, %o1	! subtract compensation
-				!  and put upper half in place
-#endif
-
-LOC(mul_shortway):
-	/*
-	 * Short multiply.  12 steps, followed by a final shift step.
-	 * The resulting bits are off by 12 and (32-12) = 20 bit positions,
-	 * but there is no problem with %o0 being negative (unlike above).
-	 */
-	mulscc	%o4, %o1, %o4	! 1
-	mulscc	%o4, %o1, %o4	! 2
-	mulscc	%o4, %o1, %o4	! 3
-	mulscc	%o4, %o1, %o4	! 4
-	mulscc	%o4, %o1, %o4	! 5
-	mulscc	%o4, %o1, %o4	! 6
-	mulscc	%o4, %o1, %o4	! 7
-	mulscc	%o4, %o1, %o4	! 8
-	mulscc	%o4, %o1, %o4	! 9
-	mulscc	%o4, %o1, %o4	! 10
-	mulscc	%o4, %o1, %o4	! 11
-	mulscc	%o4, %o1, %o4	! 12
-	mulscc	%o4, %g0, %o4	! final shift
-
-	/*
-	 *  %o4 has 20 of the bits that should be in the low part of the
-	 * result; %y has the bottom 12 (as %y's top 12).  That is:
-	 *
-	 *	  %o4		    %y
-	 * +----------------+----------------+
-	 * | -12- |   -20-  | -12- |   -20-  |
-	 * +------(---------+------)---------+
-	 *  --hi-- ----low-part----
-	 *
-	 * The upper 12 bits of %o4 should be sign-extended to form the
-	 * high part of the product (i.e., highpart = %o4 >> 20).
-	 */
 
-	rd	%y, %o5
-	sll	%o4, 12, %o0	! shift middle bits left 12
-	srl	%o5, 20, %o5	! shift low bits right 20, zero fill at left
-	or	%o5, %o0, %o0	! construct low part of result
+	smul	%o0, %o1, %o0
 	retl
-	sra	%o4, 20, %o1	! ... and extract high part of result
+	 rd	%y, %o1
 
 END(.mul)
diff --git a/sysdeps/sparc/sparc32/mul_1.S b/sysdeps/sparc/sparc32/mul_1.S
index acce57ec12..86619c71d6 100644
--- a/sysdeps/sparc/sparc32/mul_1.S
+++ b/sysdeps/sparc/sparc32/mul_1.S
@@ -1,198 +1,102 @@
-! SPARC __mpn_mul_1 -- Multiply a limb vector with a limb and store
-! the result in a second limb vector.
-!
+! SPARC v8 __mpn_mul_1 -- Multiply a limb vector with a single limb and
+! store the product in a second limb vector.
+
 ! Copyright (C) 1992-2019 Free Software Foundation, Inc.
-!
+
 ! This file is part of the GNU MP Library.
-!
+
 ! The GNU MP Library is free software; you can redistribute it and/or modify
 ! it under the terms of the GNU Lesser General Public License as published by
 ! the Free Software Foundation; either version 2.1 of the License, or (at your
 ! option) any later version.
-!
+
 ! The GNU MP Library is distributed in the hope that it will be useful, but
 ! WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY
 ! or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU Lesser General Public
 ! License for more details.
-!
+
 ! You should have received a copy of the GNU Lesser General Public License
 ! along with the GNU MP Library; see the file COPYING.LIB.  If not,
 ! see <https://www.gnu.org/licenses/>.
 
 
 ! INPUT PARAMETERS
-! RES_PTR	o0
-! S1_PTR	o1
-! SIZE		o2
-! S2_LIMB	o3
-
-! ADD CODE FOR SMALL MULTIPLIERS!
-!1:	ld
-!	st
-!
-!2:	ld	,a
-!	addxcc	a,a,x
-!	st	x,
-!
-!3_unrolled:
-!	ld	,a
-!	addxcc	a,a,x1		! 2a + cy
-!	addx	%g0,%g0,x2
-!	addcc	a,x1,x		! 3a + c
-!	st	x,
-!
-!	ld	,a
-!	addxcc	a,a,y1
-!	addx	%g0,%g0,y2
-!	addcc	a,y1,x
-!	st	x,
-!
-!4_unrolled:
-!	ld	,a
-!	srl	a,2,x1		! 4a
-!	addxcc	y2,x1,x
-!	sll	a,30,x2
-!	st	x,
-!
-!	ld	,a
-!	srl	a,2,y1
-!	addxcc	x2,y1,y
-!	sll	a,30,y2
-!	st	x,
-!
-!5_unrolled:
-!	ld	,a
-!	srl	a,2,x1		! 4a
-!	addxcc	a,x1,x		! 5a + c
-!	sll	a,30,x2
-!	addx	%g0,x2,x2
-!	st	x,
-!
-!	ld	,a
-!	srl	a,2,y1
-!	addxcc	a,y1,x
-!	sll	a,30,y2
-!	addx	%g0,y2,y2
-!	st	x,
-!
-!8_unrolled:
-!	ld	,a
-!	srl	a,3,x1		! 8a
-!	addxcc	y2,x1,x
-!	sll	a,29,x2
-!	st	x,
-!
-!	ld	,a
-!	srl	a,3,y1
-!	addxcc	x2,y1,y
-!	sll	a,29,y2
-!	st	x,
+! res_ptr	o0
+! s1_ptr	o1
+! size		o2
+! s2_limb	o3
 
 #include <sysdep.h>
 
 ENTRY(__mpn_mul_1)
-	! Make S1_PTR and RES_PTR point at the end of their blocks
-	! and put (- 4 x SIZE) in index/loop counter.
-	sll	%o2,2,%o2
-	add	%o0,%o2,%o4	! RES_PTR in o4 since o0 is retval
-	add	%o1,%o2,%o1
-	sub	%g0,%o2,%o2
+	sll	%o2,4,%g1
+	mov	%o7,%g4			! Save return address register
+	and	%g1,(4-1)<<4,%g1
+1:	call	2f
+	 add	%o7,3f-1b,%g3
+2:	mov	%g4,%o7			! Restore return address register
+	jmp	%g3+%g1
+	 ld	[%o1+0],%o4	! 1
 
-	cmp	%o3,0xfff
-	bgu	LOC(large)
+	.align	4
+3:
+LOC(00):
+	add	%o0,-4,%o0
+	add	%o1,-4,%o1
+	b	LOC(loop00)		/* 4, 8, 12, ... */
+	 orcc	%g0,%g0,%g2
+LOC(01):
+	b	LOC(loop01)		/* 1, 5, 9, ... */
+	 orcc	%g0,%g0,%g2
 	nop
+	nop
+LOC(10):
+	add	%o0,-12,%o0	/* 2, 6, 10, ... */
+	add	%o1,4,%o1
+	b	LOC(loop10)
+	 orcc	%g0,%g0,%g2
+	nop
+LOC(11):
+	add	%o0,-8,%o0	/* 3, 7, 11, ... */
+	add	%o1,-8,%o1
+	b	LOC(loop11)
+	 orcc	%g0,%g0,%g2
 
-	ld	[%o1+%o2],%o5
-	mov	0,%o0
-	b	LOC(0)
-	 add	%o4,-4,%o4
-LOC(loop0):
-	st	%g1,[%o4+%o2]
-LOC(0):	wr	%g0,%o3,%y
-	sra	%o5,31,%g2
-	and	%o3,%g2,%g2
-	andcc	%g1,0,%g1
-	mulscc	%g1,%o5,%g1
- 	mulscc	%g1,%o5,%g1
- 	mulscc	%g1,%o5,%g1
- 	mulscc	%g1,%o5,%g1
-	mulscc	%g1,%o5,%g1
-	mulscc	%g1,%o5,%g1
-	mulscc	%g1,%o5,%g1
-	mulscc	%g1,%o5,%g1
-	mulscc	%g1,%o5,%g1
-	mulscc	%g1,%o5,%g1
-	mulscc	%g1,%o5,%g1
-	mulscc	%g1,%o5,%g1
-	mulscc	%g1,0,%g1
-	sra	%g1,20,%g4
-	sll	%g1,12,%g1
- 	rd	%y,%g3
-	srl	%g3,20,%g3
-	or	%g1,%g3,%g1
-
-	addcc	%g1,%o0,%g1
-	addx	%g2,%g4,%o0	! add sign-compensation and cy to hi limb
-	addcc	%o2,4,%o2	! loop counter
-	bne,a	LOC(loop0)
-	 ld	[%o1+%o2],%o5
-
-	retl
-	st	%g1,[%o4+%o2]
-
-
-LOC(large):
-	ld	[%o1+%o2],%o5
-	mov	0,%o0
-	sra	%o3,31,%g4	! g4 = mask of ones iff S2_LIMB < 0
-	b	LOC(1)
-	 add	%o4,-4,%o4
 LOC(loop):
-	st	%g3,[%o4+%o2]
-LOC(1):	wr	%g0,%o5,%y
-	and	%o5,%g4,%g2	! g2 = S1_LIMB iff S2_LIMB < 0, else 0
-	andcc	%g0,%g0,%g1
-	mulscc	%g1,%o3,%g1
-	mulscc	%g1,%o3,%g1
-	mulscc	%g1,%o3,%g1
-	mulscc	%g1,%o3,%g1
-	mulscc	%g1,%o3,%g1
-	mulscc	%g1,%o3,%g1
-	mulscc	%g1,%o3,%g1
-	mulscc	%g1,%o3,%g1
-	mulscc	%g1,%o3,%g1
-	mulscc	%g1,%o3,%g1
-	mulscc	%g1,%o3,%g1
-	mulscc	%g1,%o3,%g1
-	mulscc	%g1,%o3,%g1
-	mulscc	%g1,%o3,%g1
-	mulscc	%g1,%o3,%g1
-	mulscc	%g1,%o3,%g1
-	mulscc	%g1,%o3,%g1
-	mulscc	%g1,%o3,%g1
-	mulscc	%g1,%o3,%g1
-	mulscc	%g1,%o3,%g1
-	mulscc	%g1,%o3,%g1
-	mulscc	%g1,%o3,%g1
-	mulscc	%g1,%o3,%g1
-	mulscc	%g1,%o3,%g1
-	mulscc	%g1,%o3,%g1
-	mulscc	%g1,%o3,%g1
-	mulscc	%g1,%o3,%g1
-	mulscc	%g1,%o3,%g1
-	mulscc	%g1,%o3,%g1
-	mulscc	%g1,%o3,%g1
-	mulscc	%g1,%o3,%g1
-	mulscc	%g1,%o3,%g1
-	mulscc	%g1,%g0,%g1
-	rd	%y,%g3
-	addcc	%g3,%o0,%g3
-	addx	%g2,%g1,%o0	! add sign-compensation and cy to hi limb
-	addcc	%o2,4,%o2	! loop counter
-	bne,a	LOC(loop)
-	 ld	[%o1+%o2],%o5
+	addcc	%g3,%g2,%g3	! 1
+	ld	[%o1+4],%o4	! 2
+	st	%g3,[%o0+0]	! 1
+	rd	%y,%g2		! 1
+LOC(loop00):
+	umul	%o4,%o3,%g3	! 2
+	addxcc	%g3,%g2,%g3	! 2
+	ld	[%o1+8],%o4	! 3
+	st	%g3,[%o0+4]	! 2
+	rd	%y,%g2		! 2
+LOC(loop11):
+	umul	%o4,%o3,%g3	! 3
+	addxcc	%g3,%g2,%g3	! 3
+	ld	[%o1+12],%o4	! 4
+	add	%o1,16,%o1
+	st	%g3,[%o0+8]	! 3
+	rd	%y,%g2		! 3
+LOC(loop10):
+	umul	%o4,%o3,%g3	! 4
+	addxcc	%g3,%g2,%g3	! 4
+	ld	[%o1+0],%o4	! 1
+	st	%g3,[%o0+12]	! 4
+	add	%o0,16,%o0
+	rd	%y,%g2		! 4
+	addx	%g0,%g2,%g2
+LOC(loop01):
+	addcc	%o2,-4,%o2
+	bg	LOC(loop)
+	 umul	%o4,%o3,%g3	! 1
 
+	addcc	%g3,%g2,%g3	! 4
+	st	%g3,[%o0+0]	! 4
+	rd	%y,%g2		! 4
 	retl
-	st	%g3,[%o4+%o2]
+	 addx	%g0,%g2,%o0
 
 END(__mpn_mul_1)
diff --git a/sysdeps/sparc/sparc32/rem.S b/sysdeps/sparc/sparc32/rem.S
index 79e09a9ef8..a2694e699e 100644
--- a/sysdeps/sparc/sparc32/rem.S
+++ b/sysdeps/sparc/sparc32/rem.S
@@ -1,363 +1,21 @@
-   /* This file is generated from divrem.m4; DO NOT EDIT! */
 /*
- * Division and remainder, from Appendix E of the Sparc Version 8
- * Architecture Manual, with fixes from Gordon Irlam.
+ * Sparc v8 has divide.
  */
 
-/*
- * Input: dividend and divisor in %o0 and %o1 respectively.
- *
- * m4 parameters:
- *  .rem	name of function to generate
- *  rem		rem=div => %o0 / %o1; rem=rem => %o0 % %o1
- *  true		true=true => signed; true=false => unsigned
- *
- * Algorithm parameters:
- *  N		how many bits per iteration we try to get (4)
- *  WORDSIZE	total number of bits (32)
- *
- * Derived constants:
- *  TOPBITS	number of bits in the top decade of a number
- *
- * Important variables:
- *  Q		the partial quotient under development (initially 0)
- *  R		the remainder so far, initially the dividend
- *  ITER	number of main division loop iterations required;
- *		equal to ceil(log2(quotient) / N).  Note that this
- *		is the log base (2^N) of the quotient.
- *  V		the current comparand, initially divisor*2^(ITER*N-1)
- *
- * Cost:
- *  Current estimate for non-large dividend is
- *	ceil(log2(quotient) / N) * (10 + 7N/2) + C
- *  A large dividend is one greater than 2^(31-TOPBITS) and takes a
- *  different path, as the upper bits of the quotient must be developed
- *  one bit at a time.
- */
-
-
-
 #include <sysdep.h>
-#include <sys/trap.h>
 
 ENTRY(.rem)
-	! compute sign of result; if neither is negative, no problem
-	orcc	%o1, %o0, %g0	! either negative?
-	bge	2f			! no, go do the divide
-	mov	%o0, %g3		! sign of remainder matches %o0
-	tst	%o1
-	bge	1f
-	tst	%o0
-	! %o1 is definitely negative; %o0 might also be negative
-	bge	2f			! if %o0 not negative...
-	sub	%g0, %o1, %o1	! in any case, make %o1 nonneg
-1:	! %o0 is negative, %o1 is nonnegative
-	sub	%g0, %o0, %o0	! make %o0 nonnegative
-2:
-
-	! Ready to divide.  Compute size of quotient; scale comparand.
-	orcc	%o1, %g0, %o5
-	bne	1f
-	mov	%o0, %o3
-
-		! Divide by zero trap.  If it returns, return 0 (about as
-		! wrong as possible, but that is what SunOS does...).
-		ta	ST_DIV0
-		retl
-		clr	%o0
-
-1:
-	cmp	%o3, %o5			! if %o1 exceeds %o0, done
-	blu	LOC(got_result)		! (and algorithm fails otherwise)
-	clr	%o2
-	sethi	%hi(1 << (32 - 4 - 1)), %g1
-	cmp	%o3, %g1
-	blu	LOC(not_really_big)
-	clr	%o4
-
-	! Here the dividend is >= 2**(31-N) or so.  We must be careful here,
-	! as our usual N-at-a-shot divide step will cause overflow and havoc.
-	! The number of bits in the result here is N*ITER+SC, where SC <= N.
-	! Compute ITER in an unorthodox manner: know we need to shift V into
-	! the top decade: so do not even bother to compare to R.
-	1:
-		cmp	%o5, %g1
-		bgeu	3f
-		mov	1, %g2
-		sll	%o5, 4, %o5
-		b	1b
-		add	%o4, 1, %o4
-
-	! Now compute %g2.
-	2:	addcc	%o5, %o5, %o5
-		bcc	LOC(not_too_big)
-		add	%g2, 1, %g2
-
-		! We get here if the %o1 overflowed while shifting.
-		! This means that %o3 has the high-order bit set.
-		! Restore %o5 and subtract from %o3.
-		sll	%g1, 4, %g1	! high order bit
-		srl	%o5, 1, %o5		! rest of %o5
-		add	%o5, %g1, %o5
-		b	LOC(do_single_div)
-		sub	%g2, 1, %g2
-
-	LOC(not_too_big):
-	3:	cmp	%o5, %o3
-		blu	2b
-		nop
-		be	LOC(do_single_div)
-		nop
-	/* NB: these are commented out in the V8-Sparc manual as well */
-	/* (I do not understand this) */
-	! %o5 > %o3: went too far: back up 1 step
-	!	srl	%o5, 1, %o5
-	!	dec	%g2
-	! do single-bit divide steps
-	!
-	! We have to be careful here.  We know that %o3 >= %o5, so we can do the
-	! first divide step without thinking.  BUT, the others are conditional,
-	! and are only done if %o3 >= 0.  Because both %o3 and %o5 may have the high-
-	! order bit set in the first step, just falling into the regular
-	! division loop will mess up the first time around.
-	! So we unroll slightly...
-	LOC(do_single_div):
-		subcc	%g2, 1, %g2
-		bl	LOC(end_regular_divide)
-		nop
-		sub	%o3, %o5, %o3
-		mov	1, %o2
-		b	LOC(end_single_divloop)
-		nop
-	LOC(single_divloop):
-		sll	%o2, 1, %o2
-		bl	1f
-		srl	%o5, 1, %o5
-		! %o3 >= 0
-		sub	%o3, %o5, %o3
-		b	2f
-		add	%o2, 1, %o2
-	1:	! %o3 < 0
-		add	%o3, %o5, %o3
-		sub	%o2, 1, %o2
-	2:
-	LOC(end_single_divloop):
-		subcc	%g2, 1, %g2
-		bge	LOC(single_divloop)
-		tst	%o3
-		b,a	LOC(end_regular_divide)
-
-LOC(not_really_big):
-1:
-	sll	%o5, 4, %o5
-	cmp	%o5, %o3
-	bleu	1b
-	addcc	%o4, 1, %o4
-	be	LOC(got_result)
-	sub	%o4, 1, %o4
-
-	tst	%o3	! set up for initial iteration
-LOC(divloop):
-	sll	%o2, 4, %o2
-		! depth 1, accumulated bits 0
-	bl	LOC(1.16)
-	srl	%o5,1,%o5
-	! remainder is positive
-	subcc	%o3,%o5,%o3
-			! depth 2, accumulated bits 1
-	bl	LOC(2.17)
-	srl	%o5,1,%o5
-	! remainder is positive
-	subcc	%o3,%o5,%o3
-			! depth 3, accumulated bits 3
-	bl	LOC(3.19)
-	srl	%o5,1,%o5
-	! remainder is positive
-	subcc	%o3,%o5,%o3
-			! depth 4, accumulated bits 7
-	bl	LOC(4.23)
-	srl	%o5,1,%o5
-	! remainder is positive
-	subcc	%o3,%o5,%o3
-		b	9f
-		add	%o2, (7*2+1), %o2
-
-LOC(4.23):
-	! remainder is negative
-	addcc	%o3,%o5,%o3
-		b	9f
-		add	%o2, (7*2-1), %o2
-
-
-LOC(3.19):
-	! remainder is negative
-	addcc	%o3,%o5,%o3
-			! depth 4, accumulated bits 5
-	bl	LOC(4.21)
-	srl	%o5,1,%o5
-	! remainder is positive
-	subcc	%o3,%o5,%o3
-		b	9f
-		add	%o2, (5*2+1), %o2
-
-LOC(4.21):
-	! remainder is negative
-	addcc	%o3,%o5,%o3
-		b	9f
-		add	%o2, (5*2-1), %o2
-
-
-
-LOC(2.17):
-	! remainder is negative
-	addcc	%o3,%o5,%o3
-			! depth 3, accumulated bits 1
-	bl	LOC(3.17)
-	srl	%o5,1,%o5
-	! remainder is positive
-	subcc	%o3,%o5,%o3
-			! depth 4, accumulated bits 3
-	bl	LOC(4.19)
-	srl	%o5,1,%o5
-	! remainder is positive
-	subcc	%o3,%o5,%o3
-		b	9f
-		add	%o2, (3*2+1), %o2
-
-LOC(4.19):
-	! remainder is negative
-	addcc	%o3,%o5,%o3
-		b	9f
-		add	%o2, (3*2-1), %o2
-
-
-LOC(3.17):
-	! remainder is negative
-	addcc	%o3,%o5,%o3
-			! depth 4, accumulated bits 1
-	bl	LOC(4.17)
-	srl	%o5,1,%o5
-	! remainder is positive
-	subcc	%o3,%o5,%o3
-		b	9f
-		add	%o2, (1*2+1), %o2
-
-LOC(4.17):
-	! remainder is negative
-	addcc	%o3,%o5,%o3
-		b	9f
-		add	%o2, (1*2-1), %o2
-
-
-
-
-LOC(1.16):
-	! remainder is negative
-	addcc	%o3,%o5,%o3
-			! depth 2, accumulated bits -1
-	bl	LOC(2.15)
-	srl	%o5,1,%o5
-	! remainder is positive
-	subcc	%o3,%o5,%o3
-			! depth 3, accumulated bits -1
-	bl	LOC(3.15)
-	srl	%o5,1,%o5
-	! remainder is positive
-	subcc	%o3,%o5,%o3
-			! depth 4, accumulated bits -1
-	bl	LOC(4.15)
-	srl	%o5,1,%o5
-	! remainder is positive
-	subcc	%o3,%o5,%o3
-		b	9f
-		add	%o2, (-1*2+1), %o2
-
-LOC(4.15):
-	! remainder is negative
-	addcc	%o3,%o5,%o3
-		b	9f
-		add	%o2, (-1*2-1), %o2
-
-
-LOC(3.15):
-	! remainder is negative
-	addcc	%o3,%o5,%o3
-			! depth 4, accumulated bits -3
-	bl	LOC(4.13)
-	srl	%o5,1,%o5
-	! remainder is positive
-	subcc	%o3,%o5,%o3
-		b	9f
-		add	%o2, (-3*2+1), %o2
-
-LOC(4.13):
-	! remainder is negative
-	addcc	%o3,%o5,%o3
-		b	9f
-		add	%o2, (-3*2-1), %o2
-
-
-
-LOC(2.15):
-	! remainder is negative
-	addcc	%o3,%o5,%o3
-			! depth 3, accumulated bits -3
-	bl	LOC(3.13)
-	srl	%o5,1,%o5
-	! remainder is positive
-	subcc	%o3,%o5,%o3
-			! depth 4, accumulated bits -5
-	bl	LOC(4.11)
-	srl	%o5,1,%o5
-	! remainder is positive
-	subcc	%o3,%o5,%o3
-		b	9f
-		add	%o2, (-5*2+1), %o2
-
-LOC(4.11):
-	! remainder is negative
-	addcc	%o3,%o5,%o3
-		b	9f
-		add	%o2, (-5*2-1), %o2
-
-
-LOC(3.13):
-	! remainder is negative
-	addcc	%o3,%o5,%o3
-			! depth 4, accumulated bits -7
-	bl	LOC(4.9)
-	srl	%o5,1,%o5
-	! remainder is positive
-	subcc	%o3,%o5,%o3
-		b	9f
-		add	%o2, (-7*2+1), %o2
-
-LOC(4.9):
-	! remainder is negative
-	addcc	%o3,%o5,%o3
-		b	9f
-		add	%o2, (-7*2-1), %o2
-
-
-
-
-9:
-LOC(end_regular_divide):
-	subcc	%o4, 1, %o4
-	bge	LOC(divloop)
-	tst	%o3
-	bl,a	LOC(got_result)
-	! non-restoring fixup here (one instruction only!)
-	add	%o3, %o1, %o3
-
 
-LOC(got_result):
-	! check to see if answer should be < 0
-	tst	%g3
-	bl,a	1f
-	sub %g0, %o3, %o3
-1:
+	sra	%o0, 31, %o2
+	wr	%o2, 0, %y
+	nop
+	nop
+	nop
+	sdivcc	%o0, %o1, %o2
+	bvs,a	1f
+	 xnor	%o2, %g0, %o2
+1:	smul	%o2, %o1, %o2
 	retl
-	mov %o3, %o0
+	 sub	%o0, %o2, %o0
 
 END(.rem)
diff --git a/sysdeps/sparc/sparc32/sdiv.S b/sysdeps/sparc/sparc32/sdiv.S
index ab29718827..bfc4acf2fa 100644
--- a/sysdeps/sparc/sparc32/sdiv.S
+++ b/sysdeps/sparc/sparc32/sdiv.S
@@ -1,363 +1,20 @@
-   /* This file is generated from divrem.m4; DO NOT EDIT! */
 /*
- * Division and remainder, from Appendix E of the Sparc Version 8
- * Architecture Manual, with fixes from Gordon Irlam.
+ * Sparc v8 has divide.
  */
 
-/*
- * Input: dividend and divisor in %o0 and %o1 respectively.
- *
- * m4 parameters:
- *  .div	name of function to generate
- *  div		div=div => %o0 / %o1; div=rem => %o0 % %o1
- *  true		true=true => signed; true=false => unsigned
- *
- * Algorithm parameters:
- *  N		how many bits per iteration we try to get (4)
- *  WORDSIZE	total number of bits (32)
- *
- * Derived constants:
- *  TOPBITS	number of bits in the top decade of a number
- *
- * Important variables:
- *  Q		the partial quotient under development (initially 0)
- *  R		the remainder so far, initially the dividend
- *  ITER	number of main division loop iterations required;
- *		equal to ceil(log2(quotient) / N).  Note that this
- *		is the log base (2^N) of the quotient.
- *  V		the current comparand, initially divisor*2^(ITER*N-1)
- *
- * Cost:
- *  Current estimate for non-large dividend is
- *	ceil(log2(quotient) / N) * (10 + 7N/2) + C
- *  A large dividend is one greater than 2^(31-TOPBITS) and takes a
- *  different path, as the upper bits of the quotient must be developed
- *  one bit at a time.
- */
-
-
-
 #include <sysdep.h>
-#include <sys/trap.h>
 
 ENTRY(.div)
-	! compute sign of result; if neither is negative, no problem
-	orcc	%o1, %o0, %g0	! either negative?
-	bge	2f			! no, go do the divide
-	xor	%o1, %o0, %g3	! compute sign in any case
-	tst	%o1
-	bge	1f
-	tst	%o0
-	! %o1 is definitely negative; %o0 might also be negative
-	bge	2f			! if %o0 not negative...
-	sub	%g0, %o1, %o1	! in any case, make %o1 nonneg
-1:	! %o0 is negative, %o1 is nonnegative
-	sub	%g0, %o0, %o0	! make %o0 nonnegative
-2:
-
-	! Ready to divide.  Compute size of quotient; scale comparand.
-	orcc	%o1, %g0, %o5
-	bne	1f
-	mov	%o0, %o3
-
-		! Divide by zero trap.  If it returns, return 0 (about as
-		! wrong as possible, but that is what SunOS does...).
-		ta	ST_DIV0
-		retl
-		clr	%o0
-
-1:
-	cmp	%o3, %o5			! if %o1 exceeds %o0, done
-	blu	LOC(got_result)		! (and algorithm fails otherwise)
-	clr	%o2
-	sethi	%hi(1 << (32 - 4 - 1)), %g1
-	cmp	%o3, %g1
-	blu	LOC(not_really_big)
-	clr	%o4
-
-	! Here the dividend is >= 2**(31-N) or so.  We must be careful here,
-	! as our usual N-at-a-shot divide step will cause overflow and havoc.
-	! The number of bits in the result here is N*ITER+SC, where SC <= N.
-	! Compute ITER in an unorthodox manner: know we need to shift V into
-	! the top decade: so do not even bother to compare to R.
-	1:
-		cmp	%o5, %g1
-		bgeu	3f
-		mov	1, %g2
-		sll	%o5, 4, %o5
-		b	1b
-		add	%o4, 1, %o4
-
-	! Now compute %g2.
-	2:	addcc	%o5, %o5, %o5
-		bcc	LOC(not_too_big)
-		add	%g2, 1, %g2
-
-		! We get here if the %o1 overflowed while shifting.
-		! This means that %o3 has the high-order bit set.
-		! Restore %o5 and subtract from %o3.
-		sll	%g1, 4, %g1	! high order bit
-		srl	%o5, 1, %o5		! rest of %o5
-		add	%o5, %g1, %o5
-		b	LOC(do_single_div)
-		sub	%g2, 1, %g2
-
-	LOC(not_too_big):
-	3:	cmp	%o5, %o3
-		blu	2b
-		nop
-		be	LOC(do_single_div)
-		nop
-	/* NB: these are commented out in the V8-Sparc manual as well */
-	/* (I do not understand this) */
-	! %o5 > %o3: went too far: back up 1 step
-	!	srl	%o5, 1, %o5
-	!	dec	%g2
-	! do single-bit divide steps
-	!
-	! We have to be careful here.  We know that %o3 >= %o5, so we can do the
-	! first divide step without thinking.  BUT, the others are conditional,
-	! and are only done if %o3 >= 0.  Because both %o3 and %o5 may have the high-
-	! order bit set in the first step, just falling into the regular
-	! division loop will mess up the first time around.
-	! So we unroll slightly...
-	LOC(do_single_div):
-		subcc	%g2, 1, %g2
-		bl	LOC(end_regular_divide)
-		nop
-		sub	%o3, %o5, %o3
-		mov	1, %o2
-		b	LOC(end_single_divloop)
-		nop
-	LOC(single_divloop):
-		sll	%o2, 1, %o2
-		bl	1f
-		srl	%o5, 1, %o5
-		! %o3 >= 0
-		sub	%o3, %o5, %o3
-		b	2f
-		add	%o2, 1, %o2
-	1:	! %o3 < 0
-		add	%o3, %o5, %o3
-		sub	%o2, 1, %o2
-	2:
-	LOC(end_single_divloop):
-		subcc	%g2, 1, %g2
-		bge	LOC(single_divloop)
-		tst	%o3
-		b,a	LOC(end_regular_divide)
-
-LOC(not_really_big):
-1:
-	sll	%o5, 4, %o5
-	cmp	%o5, %o3
-	bleu	1b
-	addcc	%o4, 1, %o4
-	be	LOC(got_result)
-	sub	%o4, 1, %o4
-
-	tst	%o3	! set up for initial iteration
-LOC(divloop):
-	sll	%o2, 4, %o2
-		! depth 1, accumulated bits 0
-	bl	LOC(1.16)
-	srl	%o5,1,%o5
-	! remainder is positive
-	subcc	%o3,%o5,%o3
-			! depth 2, accumulated bits 1
-	bl	LOC(2.17)
-	srl	%o5,1,%o5
-	! remainder is positive
-	subcc	%o3,%o5,%o3
-			! depth 3, accumulated bits 3
-	bl	LOC(3.19)
-	srl	%o5,1,%o5
-	! remainder is positive
-	subcc	%o3,%o5,%o3
-			! depth 4, accumulated bits 7
-	bl	LOC(4.23)
-	srl	%o5,1,%o5
-	! remainder is positive
-	subcc	%o3,%o5,%o3
-		b	9f
-		add	%o2, (7*2+1), %o2
-
-LOC(4.23):
-	! remainder is negative
-	addcc	%o3,%o5,%o3
-		b	9f
-		add	%o2, (7*2-1), %o2
-
-
-LOC(3.19):
-	! remainder is negative
-	addcc	%o3,%o5,%o3
-			! depth 4, accumulated bits 5
-	bl	LOC(4.21)
-	srl	%o5,1,%o5
-	! remainder is positive
-	subcc	%o3,%o5,%o3
-		b	9f
-		add	%o2, (5*2+1), %o2
-
-LOC(4.21):
-	! remainder is negative
-	addcc	%o3,%o5,%o3
-		b	9f
-		add	%o2, (5*2-1), %o2
-
-
-
-LOC(2.17):
-	! remainder is negative
-	addcc	%o3,%o5,%o3
-			! depth 3, accumulated bits 1
-	bl	LOC(3.17)
-	srl	%o5,1,%o5
-	! remainder is positive
-	subcc	%o3,%o5,%o3
-			! depth 4, accumulated bits 3
-	bl	LOC(4.19)
-	srl	%o5,1,%o5
-	! remainder is positive
-	subcc	%o3,%o5,%o3
-		b	9f
-		add	%o2, (3*2+1), %o2
-
-LOC(4.19):
-	! remainder is negative
-	addcc	%o3,%o5,%o3
-		b	9f
-		add	%o2, (3*2-1), %o2
-
-
-LOC(3.17):
-	! remainder is negative
-	addcc	%o3,%o5,%o3
-			! depth 4, accumulated bits 1
-	bl	LOC(4.17)
-	srl	%o5,1,%o5
-	! remainder is positive
-	subcc	%o3,%o5,%o3
-		b	9f
-		add	%o2, (1*2+1), %o2
-
-LOC(4.17):
-	! remainder is negative
-	addcc	%o3,%o5,%o3
-		b	9f
-		add	%o2, (1*2-1), %o2
-
-
-
-
-LOC(1.16):
-	! remainder is negative
-	addcc	%o3,%o5,%o3
-			! depth 2, accumulated bits -1
-	bl	LOC(2.15)
-	srl	%o5,1,%o5
-	! remainder is positive
-	subcc	%o3,%o5,%o3
-			! depth 3, accumulated bits -1
-	bl	LOC(3.15)
-	srl	%o5,1,%o5
-	! remainder is positive
-	subcc	%o3,%o5,%o3
-			! depth 4, accumulated bits -1
-	bl	LOC(4.15)
-	srl	%o5,1,%o5
-	! remainder is positive
-	subcc	%o3,%o5,%o3
-		b	9f
-		add	%o2, (-1*2+1), %o2
-
-LOC(4.15):
-	! remainder is negative
-	addcc	%o3,%o5,%o3
-		b	9f
-		add	%o2, (-1*2-1), %o2
-
-
-LOC(3.15):
-	! remainder is negative
-	addcc	%o3,%o5,%o3
-			! depth 4, accumulated bits -3
-	bl	LOC(4.13)
-	srl	%o5,1,%o5
-	! remainder is positive
-	subcc	%o3,%o5,%o3
-		b	9f
-		add	%o2, (-3*2+1), %o2
-
-LOC(4.13):
-	! remainder is negative
-	addcc	%o3,%o5,%o3
-		b	9f
-		add	%o2, (-3*2-1), %o2
-
-
-
-LOC(2.15):
-	! remainder is negative
-	addcc	%o3,%o5,%o3
-			! depth 3, accumulated bits -3
-	bl	LOC(3.13)
-	srl	%o5,1,%o5
-	! remainder is positive
-	subcc	%o3,%o5,%o3
-			! depth 4, accumulated bits -5
-	bl	LOC(4.11)
-	srl	%o5,1,%o5
-	! remainder is positive
-	subcc	%o3,%o5,%o3
-		b	9f
-		add	%o2, (-5*2+1), %o2
-
-LOC(4.11):
-	! remainder is negative
-	addcc	%o3,%o5,%o3
-		b	9f
-		add	%o2, (-5*2-1), %o2
-
-
-LOC(3.13):
-	! remainder is negative
-	addcc	%o3,%o5,%o3
-			! depth 4, accumulated bits -7
-	bl	LOC(4.9)
-	srl	%o5,1,%o5
-	! remainder is positive
-	subcc	%o3,%o5,%o3
-		b	9f
-		add	%o2, (-7*2+1), %o2
-
-LOC(4.9):
-	! remainder is negative
-	addcc	%o3,%o5,%o3
-		b	9f
-		add	%o2, (-7*2-1), %o2
-
-
-
-
-9:
-LOC(end_regular_divide):
-	subcc	%o4, 1, %o4
-	bge	LOC(divloop)
-	tst	%o3
-	bl,a	LOC(got_result)
-	! non-restoring fixup here (one instruction only!)
-	sub	%o2, 1, %o2
-
 
-LOC(got_result):
-	! check to see if answer should be < 0
-	tst	%g3
-	bl,a	1f
-	sub %g0, %o2, %o2
-1:
-	retl
-	mov %o2, %o0
+	sra	%o0, 31, %o2
+	wr	%o2, 0, %y
+	nop
+	nop
+	nop
+	sdivcc	%o0, %o1, %o0
+	bvs,a	1f
+	 xnor	%o0, %g0, %o0
+1:	retl
+	 nop
 
 END(.div)
diff --git a/sysdeps/sparc/sparc32/sparcv8/Makefile b/sysdeps/sparc/sparc32/sparcv8/Makefile
deleted file mode 100644
index 2ff9853458..0000000000
--- a/sysdeps/sparc/sparc32/sparcv8/Makefile
+++ /dev/null
@@ -1 +0,0 @@
-sysdep-CFLAGS += -mcpu=v8
diff --git a/sysdeps/sparc/sparc32/sparcv8/addmul_1.S b/sysdeps/sparc/sparc32/sparcv8/addmul_1.S
deleted file mode 100644
index 4708393aa5..0000000000
--- a/sysdeps/sparc/sparc32/sparcv8/addmul_1.S
+++ /dev/null
@@ -1,118 +0,0 @@
-! SPARC v8 __mpn_addmul_1 -- Multiply a limb vector with a limb and
-! add the result to a second limb vector.
-
-! Copyright (C) 1992-2019 Free Software Foundation, Inc.
-
-! This file is part of the GNU MP Library.
-
-! The GNU MP Library is free software; you can redistribute it and/or modify
-! it under the terms of the GNU Lesser General Public License as published by
-! the Free Software Foundation; either version 2.1 of the License, or (at your
-! option) any later version.
-
-! The GNU MP Library is distributed in the hope that it will be useful, but
-! WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY
-! or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU Lesser General Public
-! License for more details.
-
-! You should have received a copy of the GNU Lesser General Public License
-! along with the GNU MP Library; see the file COPYING.LIB.  If not,
-! see <https://www.gnu.org/licenses/>.
-
-
-! INPUT PARAMETERS
-! res_ptr	o0
-! s1_ptr	o1
-! size		o2
-! s2_limb	o3
-
-#include <sysdep.h>
-
-ENTRY(__mpn_addmul_1)
-	ld	[%o1+0],%o4	! 1
-	sll	%o2,4,%g1
-	orcc	%g0,%g0,%g2
-	mov	%o7,%g4			! Save return address register
-	and	%g1,(4-1)<<4,%g1
-1:	call	2f
-	 add	%o7,3f-1b,%g3
-2:	jmp	%g3+%g1
-	 mov	%g4,%o7			! Restore return address register
-
-	.align	4
-3:
-LOC(00):
-	add	%o0,-4,%o0
-	b	LOC(loop00)		/* 4, 8, 12, ... */
-	 add	%o1,-4,%o1
-	nop
-LOC(01):
-	b	LOC(loop01)		/* 1, 5, 9, ... */
-	 nop
-	nop
-	nop
-LOC(10):
-	add	%o0,-12,%o0	/* 2, 6, 10, ... */
-	b	LOC(loop10)
-	 add	%o1,4,%o1
-	nop
-LOC(11):
-	add	%o0,-8,%o0	/* 3, 7, 11, ... */
-	b	LOC(loop11)
-	 add	%o1,-8,%o1
-	nop
-
-LOC(loop):
-	addcc	%g3,%g2,%g3	! 1
-	ld	[%o1+4],%o4	! 2
-	rd	%y,%g2		! 1
-	addx	%g0,%g2,%g2
-	ld	[%o0+0],%g1	! 2
-	addcc	%g1,%g3,%g3
-	st	%g3,[%o0+0]	! 1
-LOC(loop00):
-	umul	%o4,%o3,%g3	! 2
-	ld	[%o0+4],%g1	! 2
-	addxcc	%g3,%g2,%g3	! 2
-	ld	[%o1+8],%o4	! 3
-	rd	%y,%g2		! 2
-	addx	%g0,%g2,%g2
-	nop
-	addcc	%g1,%g3,%g3
-	st	%g3,[%o0+4]	! 2
-LOC(loop11):
-	umul	%o4,%o3,%g3	! 3
-	addxcc	%g3,%g2,%g3	! 3
-	ld	[%o1+12],%o4	! 4
-	rd	%y,%g2		! 3
-	add	%o1,16,%o1
-	addx	%g0,%g2,%g2
-	ld	[%o0+8],%g1	! 2
-	addcc	%g1,%g3,%g3
-	st	%g3,[%o0+8]	! 3
-LOC(loop10):
-	umul	%o4,%o3,%g3	! 4
-	addxcc	%g3,%g2,%g3	! 4
-	ld	[%o1+0],%o4	! 1
-	rd	%y,%g2		! 4
-	addx	%g0,%g2,%g2
-	ld	[%o0+12],%g1	! 2
-	addcc	%g1,%g3,%g3
-	st	%g3,[%o0+12]	! 4
-	add	%o0,16,%o0
-	addx	%g0,%g2,%g2
-LOC(loop01):
-	addcc	%o2,-4,%o2
-	bg	LOC(loop)
-	 umul	%o4,%o3,%g3	! 1
-
-	addcc	%g3,%g2,%g3	! 4
-	rd	%y,%g2		! 4
-	addx	%g0,%g2,%g2
-	ld	[%o0+0],%g1	! 2
-	addcc	%g1,%g3,%g3
-	st	%g3,[%o0+0]	! 4
-	retl
-	 addx	%g0,%g2,%o0
-
-END(__mpn_addmul_1)
diff --git a/sysdeps/sparc/sparc32/sparcv8/dotmul.S b/sysdeps/sparc/sparc32/sparcv8/dotmul.S
deleted file mode 100644
index 9b20cc3684..0000000000
--- a/sysdeps/sparc/sparc32/sparcv8/dotmul.S
+++ /dev/null
@@ -1,13 +0,0 @@
-/*
- * Sparc v8 has multiply.
- */
-
-#include <sysdep.h>
-
-ENTRY(.mul)
-
-	smul	%o0, %o1, %o0
-	retl
-	 rd	%y, %o1
-
-END(.mul)
diff --git a/sysdeps/sparc/sparc32/sparcv8/mul_1.S b/sysdeps/sparc/sparc32/sparcv8/mul_1.S
deleted file mode 100644
index 86619c71d6..0000000000
--- a/sysdeps/sparc/sparc32/sparcv8/mul_1.S
+++ /dev/null
@@ -1,102 +0,0 @@
-! SPARC v8 __mpn_mul_1 -- Multiply a limb vector with a single limb and
-! store the product in a second limb vector.
-
-! Copyright (C) 1992-2019 Free Software Foundation, Inc.
-
-! This file is part of the GNU MP Library.
-
-! The GNU MP Library is free software; you can redistribute it and/or modify
-! it under the terms of the GNU Lesser General Public License as published by
-! the Free Software Foundation; either version 2.1 of the License, or (at your
-! option) any later version.
-
-! The GNU MP Library is distributed in the hope that it will be useful, but
-! WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY
-! or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU Lesser General Public
-! License for more details.
-
-! You should have received a copy of the GNU Lesser General Public License
-! along with the GNU MP Library; see the file COPYING.LIB.  If not,
-! see <https://www.gnu.org/licenses/>.
-
-
-! INPUT PARAMETERS
-! res_ptr	o0
-! s1_ptr	o1
-! size		o2
-! s2_limb	o3
-
-#include <sysdep.h>
-
-ENTRY(__mpn_mul_1)
-	sll	%o2,4,%g1
-	mov	%o7,%g4			! Save return address register
-	and	%g1,(4-1)<<4,%g1
-1:	call	2f
-	 add	%o7,3f-1b,%g3
-2:	mov	%g4,%o7			! Restore return address register
-	jmp	%g3+%g1
-	 ld	[%o1+0],%o4	! 1
-
-	.align	4
-3:
-LOC(00):
-	add	%o0,-4,%o0
-	add	%o1,-4,%o1
-	b	LOC(loop00)		/* 4, 8, 12, ... */
-	 orcc	%g0,%g0,%g2
-LOC(01):
-	b	LOC(loop01)		/* 1, 5, 9, ... */
-	 orcc	%g0,%g0,%g2
-	nop
-	nop
-LOC(10):
-	add	%o0,-12,%o0	/* 2, 6, 10, ... */
-	add	%o1,4,%o1
-	b	LOC(loop10)
-	 orcc	%g0,%g0,%g2
-	nop
-LOC(11):
-	add	%o0,-8,%o0	/* 3, 7, 11, ... */
-	add	%o1,-8,%o1
-	b	LOC(loop11)
-	 orcc	%g0,%g0,%g2
-
-LOC(loop):
-	addcc	%g3,%g2,%g3	! 1
-	ld	[%o1+4],%o4	! 2
-	st	%g3,[%o0+0]	! 1
-	rd	%y,%g2		! 1
-LOC(loop00):
-	umul	%o4,%o3,%g3	! 2
-	addxcc	%g3,%g2,%g3	! 2
-	ld	[%o1+8],%o4	! 3
-	st	%g3,[%o0+4]	! 2
-	rd	%y,%g2		! 2
-LOC(loop11):
-	umul	%o4,%o3,%g3	! 3
-	addxcc	%g3,%g2,%g3	! 3
-	ld	[%o1+12],%o4	! 4
-	add	%o1,16,%o1
-	st	%g3,[%o0+8]	! 3
-	rd	%y,%g2		! 3
-LOC(loop10):
-	umul	%o4,%o3,%g3	! 4
-	addxcc	%g3,%g2,%g3	! 4
-	ld	[%o1+0],%o4	! 1
-	st	%g3,[%o0+12]	! 4
-	add	%o0,16,%o0
-	rd	%y,%g2		! 4
-	addx	%g0,%g2,%g2
-LOC(loop01):
-	addcc	%o2,-4,%o2
-	bg	LOC(loop)
-	 umul	%o4,%o3,%g3	! 1
-
-	addcc	%g3,%g2,%g3	! 4
-	st	%g3,[%o0+0]	! 4
-	rd	%y,%g2		! 4
-	retl
-	 addx	%g0,%g2,%o0
-
-END(__mpn_mul_1)
diff --git a/sysdeps/sparc/sparc32/sparcv8/rem.S b/sysdeps/sparc/sparc32/sparcv8/rem.S
deleted file mode 100644
index a2694e699e..0000000000
--- a/sysdeps/sparc/sparc32/sparcv8/rem.S
+++ /dev/null
@@ -1,21 +0,0 @@
-/*
- * Sparc v8 has divide.
- */
-
-#include <sysdep.h>
-
-ENTRY(.rem)
-
-	sra	%o0, 31, %o2
-	wr	%o2, 0, %y
-	nop
-	nop
-	nop
-	sdivcc	%o0, %o1, %o2
-	bvs,a	1f
-	 xnor	%o2, %g0, %o2
-1:	smul	%o2, %o1, %o2
-	retl
-	 sub	%o0, %o2, %o0
-
-END(.rem)
diff --git a/sysdeps/sparc/sparc32/sparcv8/sdiv.S b/sysdeps/sparc/sparc32/sparcv8/sdiv.S
deleted file mode 100644
index bfc4acf2fa..0000000000
--- a/sysdeps/sparc/sparc32/sparcv8/sdiv.S
+++ /dev/null
@@ -1,20 +0,0 @@
-/*
- * Sparc v8 has divide.
- */
-
-#include <sysdep.h>
-
-ENTRY(.div)
-
-	sra	%o0, 31, %o2
-	wr	%o2, 0, %y
-	nop
-	nop
-	nop
-	sdivcc	%o0, %o1, %o0
-	bvs,a	1f
-	 xnor	%o0, %g0, %o0
-1:	retl
-	 nop
-
-END(.div)
diff --git a/sysdeps/sparc/sparc32/sparcv8/submul_1.S b/sysdeps/sparc/sparc32/sparcv8/submul_1.S
deleted file mode 100644
index 11eef05300..0000000000
--- a/sysdeps/sparc/sparc32/sparcv8/submul_1.S
+++ /dev/null
@@ -1,57 +0,0 @@
-! SPARC v8 __mpn_submul_1 -- Multiply a limb vector with a limb and
-! subtract the result from a second limb vector.
-
-! Copyright (C) 1992-2019 Free Software Foundation, Inc.
-
-! This file is part of the GNU MP Library.
-
-! The GNU MP Library is free software; you can redistribute it and/or modify
-! it under the terms of the GNU Lesser General Public License as published by
-! the Free Software Foundation; either version 2.1 of the License, or (at your
-! option) any later version.
-
-! The GNU MP Library is distributed in the hope that it will be useful, but
-! WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY
-! or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU Lesser General Public
-! License for more details.
-
-! You should have received a copy of the GNU Lesser General Public License
-! along with the GNU MP Library; see the file COPYING.LIB.  If not,
-! see <https://www.gnu.org/licenses/>.
-
-
-! INPUT PARAMETERS
-! res_ptr	o0
-! s1_ptr	o1
-! size		o2
-! s2_limb	o3
-
-#include <sysdep.h>
-
-ENTRY(__mpn_submul_1)
-	sub	%g0,%o2,%o2		! negate ...
-	sll	%o2,2,%o2		! ... and scale size
-	sub	%o1,%o2,%o1		! o1 is offset s1_ptr
-	sub	%o0,%o2,%g1		! g1 is offset res_ptr
-
-	mov	0,%o0			! clear cy_limb
-
-LOC(loop):
-	ld	[%o1+%o2],%o4
-	ld	[%g1+%o2],%g2
-	umul	%o4,%o3,%o5
-	rd	%y,%g3
-	addcc	%o5,%o0,%o5
-	addx	%g3,0,%o0
-	subcc	%g2,%o5,%g2
-	addx	%o0,0,%o0
-	st	%g2,[%g1+%o2]
-
-	addcc	%o2,4,%o2
-	bne	LOC(loop)
-	 nop
-
-	retl
-	 nop
-
-END(__mpn_submul_1)
diff --git a/sysdeps/sparc/sparc32/sparcv8/udiv.S b/sysdeps/sparc/sparc32/sparcv8/udiv.S
deleted file mode 100644
index e9cab4e4ef..0000000000
--- a/sysdeps/sparc/sparc32/sparcv8/udiv.S
+++ /dev/null
@@ -1,16 +0,0 @@
-/*
- * Sparc v8 has divide.
- */
-
-#include <sysdep.h>
-
-ENTRY(.udiv)
-
-	wr	%g0, 0, %y
-	nop
-	nop
-	retl
-	 udiv	%o0, %o1, %o0
-
-END(.udiv)
-strong_alias (.udiv, __wrap_.udiv)
diff --git a/sysdeps/sparc/sparc32/sparcv8/umul.S b/sysdeps/sparc/sparc32/sparcv8/umul.S
deleted file mode 100644
index cec454a7dd..0000000000
--- a/sysdeps/sparc/sparc32/sparcv8/umul.S
+++ /dev/null
@@ -1,13 +0,0 @@
-/*
- * Sparc v8 has multiply.
- */
-
-#include <sysdep.h>
-
-ENTRY(.umul)
-
-	umul	%o0, %o1, %o0
-	retl
-	 rd	%y, %o1
-
-END(.umul)
diff --git a/sysdeps/sparc/sparc32/sparcv8/urem.S b/sysdeps/sparc/sparc32/sparcv8/urem.S
deleted file mode 100644
index cc2689d514..0000000000
--- a/sysdeps/sparc/sparc32/sparcv8/urem.S
+++ /dev/null
@@ -1,18 +0,0 @@
-/*
- * Sparc v8 has divide.
- */
-
-#include <sysdep.h>
-
-ENTRY(.urem)
-
-	wr	%g0, 0, %y
-	nop
-	nop
-	nop
-	udiv	%o0, %o1, %o2
-	umul	%o2, %o1, %o2
-	retl
-	 sub	%o0, %o2, %o0
-
-END(.urem)
diff --git a/sysdeps/sparc/sparc32/submul_1.S b/sysdeps/sparc/sparc32/submul_1.S
index 920422adf1..11eef05300 100644
--- a/sysdeps/sparc/sparc32/submul_1.S
+++ b/sysdeps/sparc/sparc32/submul_1.S
@@ -1,146 +1,57 @@
-! SPARC __mpn_submul_1 -- Multiply a limb vector with a limb and subtract
-! the result from a second limb vector.
-!
+! SPARC v8 __mpn_submul_1 -- Multiply a limb vector with a limb and
+! subtract the result from a second limb vector.
+
 ! Copyright (C) 1992-2019 Free Software Foundation, Inc.
-!
+
 ! This file is part of the GNU MP Library.
-!
+
 ! The GNU MP Library is free software; you can redistribute it and/or modify
 ! it under the terms of the GNU Lesser General Public License as published by
 ! the Free Software Foundation; either version 2.1 of the License, or (at your
 ! option) any later version.
-!
+
 ! The GNU MP Library is distributed in the hope that it will be useful, but
 ! WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY
 ! or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU Lesser General Public
 ! License for more details.
-!
+
 ! You should have received a copy of the GNU Lesser General Public License
 ! along with the GNU MP Library; see the file COPYING.LIB.  If not,
 ! see <https://www.gnu.org/licenses/>.
 
 
 ! INPUT PARAMETERS
-! RES_PTR	o0
-! S1_PTR	o1
-! SIZE		o2
-! S2_LIMB	o3
+! res_ptr	o0
+! s1_ptr	o1
+! size		o2
+! s2_limb	o3
 
 #include <sysdep.h>
 
 ENTRY(__mpn_submul_1)
-	! Make S1_PTR and RES_PTR point at the end of their blocks
-	! and put (- 4 x SIZE) in index/loop counter.
-	sll	%o2,2,%o2
-	add	%o0,%o2,%o4	! RES_PTR in o4 since o0 is retval
-	add	%o1,%o2,%o1
-	sub	%g0,%o2,%o2
-
-	cmp	%o3,0xfff
-	bgu	LOC(large)
-	nop
+	sub	%g0,%o2,%o2		! negate ...
+	sll	%o2,2,%o2		! ... and scale size
+	sub	%o1,%o2,%o1		! o1 is offset s1_ptr
+	sub	%o0,%o2,%g1		! g1 is offset res_ptr
 
-	ld	[%o1+%o2],%o5
-	mov	0,%o0
-	b	LOC(0)
-	 add	%o4,-4,%o4
-LOC(loop0):
-	subcc	%o5,%g1,%g1
-	ld	[%o1+%o2],%o5
-	addx	%o0,%g0,%o0
-	st	%g1,[%o4+%o2]
-LOC(0):	wr	%g0,%o3,%y
-	sra	%o5,31,%g2
-	and	%o3,%g2,%g2
-	andcc	%g1,0,%g1
-	mulscc	%g1,%o5,%g1
- 	mulscc	%g1,%o5,%g1
- 	mulscc	%g1,%o5,%g1
- 	mulscc	%g1,%o5,%g1
-	mulscc	%g1,%o5,%g1
-	mulscc	%g1,%o5,%g1
-	mulscc	%g1,%o5,%g1
-	mulscc	%g1,%o5,%g1
-	mulscc	%g1,%o5,%g1
-	mulscc	%g1,%o5,%g1
-	mulscc	%g1,%o5,%g1
-	mulscc	%g1,%o5,%g1
-	mulscc	%g1,0,%g1
-	sra	%g1,20,%g4
-	sll	%g1,12,%g1
- 	rd	%y,%g3
-	srl	%g3,20,%g3
-	or	%g1,%g3,%g1
+	mov	0,%o0			! clear cy_limb
 
-	addcc	%g1,%o0,%g1
-	addx	%g2,%g4,%o0	! add sign-compensation and cy to hi limb
-	addcc	%o2,4,%o2	! loop counter
-	bne	LOC(loop0)
-	 ld	[%o4+%o2],%o5
-
-	subcc	%o5,%g1,%g1
-	addx	%o0,%g0,%o0
-	retl
-	st	%g1,[%o4+%o2]
-
-
-LOC(large):
-	ld	[%o1+%o2],%o5
-	mov	0,%o0
-	sra	%o3,31,%g4	! g4 = mask of ones iff S2_LIMB < 0
-	b	LOC(1)
-	 add	%o4,-4,%o4
 LOC(loop):
-	subcc	%o5,%g3,%g3
-	ld	[%o1+%o2],%o5
-	addx	%o0,%g0,%o0
-	st	%g3,[%o4+%o2]
-LOC(1):	wr	%g0,%o5,%y
-	and	%o5,%g4,%g2
-	andcc	%g0,%g0,%g1
-	mulscc	%g1,%o3,%g1
-	mulscc	%g1,%o3,%g1
-	mulscc	%g1,%o3,%g1
-	mulscc	%g1,%o3,%g1
-	mulscc	%g1,%o3,%g1
-	mulscc	%g1,%o3,%g1
-	mulscc	%g1,%o3,%g1
-	mulscc	%g1,%o3,%g1
-	mulscc	%g1,%o3,%g1
-	mulscc	%g1,%o3,%g1
-	mulscc	%g1,%o3,%g1
-	mulscc	%g1,%o3,%g1
-	mulscc	%g1,%o3,%g1
-	mulscc	%g1,%o3,%g1
-	mulscc	%g1,%o3,%g1
-	mulscc	%g1,%o3,%g1
-	mulscc	%g1,%o3,%g1
-	mulscc	%g1,%o3,%g1
-	mulscc	%g1,%o3,%g1
-	mulscc	%g1,%o3,%g1
-	mulscc	%g1,%o3,%g1
-	mulscc	%g1,%o3,%g1
-	mulscc	%g1,%o3,%g1
-	mulscc	%g1,%o3,%g1
-	mulscc	%g1,%o3,%g1
-	mulscc	%g1,%o3,%g1
-	mulscc	%g1,%o3,%g1
-	mulscc	%g1,%o3,%g1
-	mulscc	%g1,%o3,%g1
-	mulscc	%g1,%o3,%g1
-	mulscc	%g1,%o3,%g1
-	mulscc	%g1,%o3,%g1
-	mulscc	%g1,%g0,%g1
+	ld	[%o1+%o2],%o4
+	ld	[%g1+%o2],%g2
+	umul	%o4,%o3,%o5
 	rd	%y,%g3
-	addcc	%g3,%o0,%g3
-	addx	%g2,%g1,%o0
+	addcc	%o5,%o0,%o5
+	addx	%g3,0,%o0
+	subcc	%g2,%o5,%g2
+	addx	%o0,0,%o0
+	st	%g2,[%g1+%o2]
+
 	addcc	%o2,4,%o2
 	bne	LOC(loop)
-	 ld	[%o4+%o2],%o5
+	 nop
 
-	subcc	%o5,%g3,%g3
-	addx	%o0,%g0,%o0
 	retl
-	st	%g3,[%o4+%o2]
+	 nop
 
 END(__mpn_submul_1)
diff --git a/sysdeps/sparc/sparc32/udiv.S b/sysdeps/sparc/sparc32/udiv.S
index 1db6796431..e9cab4e4ef 100644
--- a/sysdeps/sparc/sparc32/udiv.S
+++ b/sysdeps/sparc/sparc32/udiv.S
@@ -1,347 +1,16 @@
-   /* This file is generated from divrem.m4; DO NOT EDIT! */
 /*
- * Division and remainder, from Appendix E of the Sparc Version 8
- * Architecture Manual, with fixes from Gordon Irlam.
+ * Sparc v8 has divide.
  */
 
-/*
- * Input: dividend and divisor in %o0 and %o1 respectively.
- *
- * m4 parameters:
- *  .udiv	name of function to generate
- *  div		div=div => %o0 / %o1; div=rem => %o0 % %o1
- *  false		false=true => signed; false=false => unsigned
- *
- * Algorithm parameters:
- *  N		how many bits per iteration we try to get (4)
- *  WORDSIZE	total number of bits (32)
- *
- * Derived constants:
- *  TOPBITS	number of bits in the top decade of a number
- *
- * Important variables:
- *  Q		the partial quotient under development (initially 0)
- *  R		the remainder so far, initially the dividend
- *  ITER	number of main division loop iterations required;
- *		equal to ceil(log2(quotient) / N).  Note that this
- *		is the log base (2^N) of the quotient.
- *  V		the current comparand, initially divisor*2^(ITER*N-1)
- *
- * Cost:
- *  Current estimate for non-large dividend is
- *	ceil(log2(quotient) / N) * (10 + 7N/2) + C
- *  A large dividend is one greater than 2^(31-TOPBITS) and takes a
- *  different path, as the upper bits of the quotient must be developed
- *  one bit at a time.
- */
-
-
-
 #include <sysdep.h>
-#include <sys/trap.h>
 
 ENTRY(.udiv)
 
-	! Ready to divide.  Compute size of quotient; scale comparand.
-	orcc	%o1, %g0, %o5
-	bne	1f
-	mov	%o0, %o3
-
-		! Divide by zero trap.  If it returns, return 0 (about as
-		! wrong as possible, but that is what SunOS does...).
-		ta	ST_DIV0
-		retl
-		clr	%o0
-
-1:
-	cmp	%o3, %o5			! if %o1 exceeds %o0, done
-	blu	LOC(got_result)		! (and algorithm fails otherwise)
-	clr	%o2
-	sethi	%hi(1 << (32 - 4 - 1)), %g1
-	cmp	%o3, %g1
-	blu	LOC(not_really_big)
-	clr	%o4
-
-	! Here the dividend is >= 2**(31-N) or so.  We must be careful here,
-	! as our usual N-at-a-shot divide step will cause overflow and havoc.
-	! The number of bits in the result here is N*ITER+SC, where SC <= N.
-	! Compute ITER in an unorthodox manner: know we need to shift V into
-	! the top decade: so do not even bother to compare to R.
-	1:
-		cmp	%o5, %g1
-		bgeu	3f
-		mov	1, %g2
-		sll	%o5, 4, %o5
-		b	1b
-		add	%o4, 1, %o4
-
-	! Now compute %g2.
-	2:	addcc	%o5, %o5, %o5
-		bcc	LOC(not_too_big)
-		add	%g2, 1, %g2
-
-		! We get here if the %o1 overflowed while shifting.
-		! This means that %o3 has the high-order bit set.
-		! Restore %o5 and subtract from %o3.
-		sll	%g1, 4, %g1	! high order bit
-		srl	%o5, 1, %o5		! rest of %o5
-		add	%o5, %g1, %o5
-		b	LOC(do_single_div)
-		sub	%g2, 1, %g2
-
-	LOC(not_too_big):
-	3:	cmp	%o5, %o3
-		blu	2b
-		nop
-		be	LOC(do_single_div)
-		nop
-	/* NB: these are commented out in the V8-Sparc manual as well */
-	/* (I do not understand this) */
-	! %o5 > %o3: went too far: back up 1 step
-	!	srl	%o5, 1, %o5
-	!	dec	%g2
-	! do single-bit divide steps
-	!
-	! We have to be careful here.  We know that %o3 >= %o5, so we can do the
-	! first divide step without thinking.  BUT, the others are conditional,
-	! and are only done if %o3 >= 0.  Because both %o3 and %o5 may have the high-
-	! order bit set in the first step, just falling into the regular
-	! division loop will mess up the first time around.
-	! So we unroll slightly...
-	LOC(do_single_div):
-		subcc	%g2, 1, %g2
-		bl	LOC(end_regular_divide)
-		nop
-		sub	%o3, %o5, %o3
-		mov	1, %o2
-		b	LOC(end_single_divloop)
-		nop
-	LOC(single_divloop):
-		sll	%o2, 1, %o2
-		bl	1f
-		srl	%o5, 1, %o5
-		! %o3 >= 0
-		sub	%o3, %o5, %o3
-		b	2f
-		add	%o2, 1, %o2
-	1:	! %o3 < 0
-		add	%o3, %o5, %o3
-		sub	%o2, 1, %o2
-	2:
-	LOC(end_single_divloop):
-		subcc	%g2, 1, %g2
-		bge	LOC(single_divloop)
-		tst	%o3
-		b,a	LOC(end_regular_divide)
-
-LOC(not_really_big):
-1:
-	sll	%o5, 4, %o5
-	cmp	%o5, %o3
-	bleu	1b
-	addcc	%o4, 1, %o4
-	be	LOC(got_result)
-	sub	%o4, 1, %o4
-
-	tst	%o3	! set up for initial iteration
-LOC(divloop):
-	sll	%o2, 4, %o2
-		! depth 1, accumulated bits 0
-	bl	LOC(1.16)
-	srl	%o5,1,%o5
-	! remainder is positive
-	subcc	%o3,%o5,%o3
-			! depth 2, accumulated bits 1
-	bl	LOC(2.17)
-	srl	%o5,1,%o5
-	! remainder is positive
-	subcc	%o3,%o5,%o3
-			! depth 3, accumulated bits 3
-	bl	LOC(3.19)
-	srl	%o5,1,%o5
-	! remainder is positive
-	subcc	%o3,%o5,%o3
-			! depth 4, accumulated bits 7
-	bl	LOC(4.23)
-	srl	%o5,1,%o5
-	! remainder is positive
-	subcc	%o3,%o5,%o3
-		b	9f
-		add	%o2, (7*2+1), %o2
-
-LOC(4.23):
-	! remainder is negative
-	addcc	%o3,%o5,%o3
-		b	9f
-		add	%o2, (7*2-1), %o2
-
-
-LOC(3.19):
-	! remainder is negative
-	addcc	%o3,%o5,%o3
-			! depth 4, accumulated bits 5
-	bl	LOC(4.21)
-	srl	%o5,1,%o5
-	! remainder is positive
-	subcc	%o3,%o5,%o3
-		b	9f
-		add	%o2, (5*2+1), %o2
-
-LOC(4.21):
-	! remainder is negative
-	addcc	%o3,%o5,%o3
-		b	9f
-		add	%o2, (5*2-1), %o2
-
-
-
-LOC(2.17):
-	! remainder is negative
-	addcc	%o3,%o5,%o3
-			! depth 3, accumulated bits 1
-	bl	LOC(3.17)
-	srl	%o5,1,%o5
-	! remainder is positive
-	subcc	%o3,%o5,%o3
-			! depth 4, accumulated bits 3
-	bl	LOC(4.19)
-	srl	%o5,1,%o5
-	! remainder is positive
-	subcc	%o3,%o5,%o3
-		b	9f
-		add	%o2, (3*2+1), %o2
-
-LOC(4.19):
-	! remainder is negative
-	addcc	%o3,%o5,%o3
-		b	9f
-		add	%o2, (3*2-1), %o2
-
-
-LOC(3.17):
-	! remainder is negative
-	addcc	%o3,%o5,%o3
-			! depth 4, accumulated bits 1
-	bl	LOC(4.17)
-	srl	%o5,1,%o5
-	! remainder is positive
-	subcc	%o3,%o5,%o3
-		b	9f
-		add	%o2, (1*2+1), %o2
-
-LOC(4.17):
-	! remainder is negative
-	addcc	%o3,%o5,%o3
-		b	9f
-		add	%o2, (1*2-1), %o2
-
-
-
-
-LOC(1.16):
-	! remainder is negative
-	addcc	%o3,%o5,%o3
-			! depth 2, accumulated bits -1
-	bl	LOC(2.15)
-	srl	%o5,1,%o5
-	! remainder is positive
-	subcc	%o3,%o5,%o3
-			! depth 3, accumulated bits -1
-	bl	LOC(3.15)
-	srl	%o5,1,%o5
-	! remainder is positive
-	subcc	%o3,%o5,%o3
-			! depth 4, accumulated bits -1
-	bl	LOC(4.15)
-	srl	%o5,1,%o5
-	! remainder is positive
-	subcc	%o3,%o5,%o3
-		b	9f
-		add	%o2, (-1*2+1), %o2
-
-LOC(4.15):
-	! remainder is negative
-	addcc	%o3,%o5,%o3
-		b	9f
-		add	%o2, (-1*2-1), %o2
-
-
-LOC(3.15):
-	! remainder is negative
-	addcc	%o3,%o5,%o3
-			! depth 4, accumulated bits -3
-	bl	LOC(4.13)
-	srl	%o5,1,%o5
-	! remainder is positive
-	subcc	%o3,%o5,%o3
-		b	9f
-		add	%o2, (-3*2+1), %o2
-
-LOC(4.13):
-	! remainder is negative
-	addcc	%o3,%o5,%o3
-		b	9f
-		add	%o2, (-3*2-1), %o2
-
-
-
-LOC(2.15):
-	! remainder is negative
-	addcc	%o3,%o5,%o3
-			! depth 3, accumulated bits -3
-	bl	LOC(3.13)
-	srl	%o5,1,%o5
-	! remainder is positive
-	subcc	%o3,%o5,%o3
-			! depth 4, accumulated bits -5
-	bl	LOC(4.11)
-	srl	%o5,1,%o5
-	! remainder is positive
-	subcc	%o3,%o5,%o3
-		b	9f
-		add	%o2, (-5*2+1), %o2
-
-LOC(4.11):
-	! remainder is negative
-	addcc	%o3,%o5,%o3
-		b	9f
-		add	%o2, (-5*2-1), %o2
-
-
-LOC(3.13):
-	! remainder is negative
-	addcc	%o3,%o5,%o3
-			! depth 4, accumulated bits -7
-	bl	LOC(4.9)
-	srl	%o5,1,%o5
-	! remainder is positive
-	subcc	%o3,%o5,%o3
-		b	9f
-		add	%o2, (-7*2+1), %o2
-
-LOC(4.9):
-	! remainder is negative
-	addcc	%o3,%o5,%o3
-		b	9f
-		add	%o2, (-7*2-1), %o2
-
-
-
-
-9:
-LOC(end_regular_divide):
-	subcc	%o4, 1, %o4
-	bge	LOC(divloop)
-	tst	%o3
-	bl,a	LOC(got_result)
-	! non-restoring fixup here (one instruction only!)
-	sub	%o2, 1, %o2
-
-
-LOC(got_result):
-
+	wr	%g0, 0, %y
+	nop
+	nop
 	retl
-	mov %o2, %o0
+	 udiv	%o0, %o1, %o0
 
 END(.udiv)
 strong_alias (.udiv, __wrap_.udiv)
diff --git a/sysdeps/sparc/sparc32/umul.S b/sysdeps/sparc/sparc32/umul.S
index 096554a2bc..cec454a7dd 100644
--- a/sysdeps/sparc/sparc32/umul.S
+++ b/sysdeps/sparc/sparc32/umul.S
@@ -1,155 +1,13 @@
 /*
- * Unsigned multiply.  Returns %o0 * %o1 in %o1%o0 (i.e., %o1 holds the
- * upper 32 bits of the 64-bit product).
- *
- * This code optimizes short (less than 13-bit) multiplies.  Short
- * multiplies require 25 instruction cycles, and long ones require
- * 45 instruction cycles.
- *
- * On return, overflow has occurred (%o1 is not zero) if and only if
- * the Z condition code is clear, allowing, e.g., the following:
- *
- *	call	.umul
- *	nop
- *	bnz	overflow	(or tnz)
+ * Sparc v8 has multiply.
  */
 
 #include <sysdep.h>
 
 ENTRY(.umul)
-	or	%o0, %o1, %o4
-	mov	%o0, %y			! multiplier -> Y
-	andncc	%o4, 0xfff, %g0		! test bits 12..31 of *both* args
-	be	LOC(mul_shortway)	! if zero, can do it the short way
-	 andcc	%g0, %g0, %o4		! zero the partial product; clear N & V
 
-	/*
-	 * Long multiply.  32 steps, followed by a final shift step.
-	 */
-	mulscc	%o4, %o1, %o4	! 1
-	mulscc	%o4, %o1, %o4	! 2
-	mulscc	%o4, %o1, %o4	! 3
-	mulscc	%o4, %o1, %o4	! 4
-	mulscc	%o4, %o1, %o4	! 5
-	mulscc	%o4, %o1, %o4	! 6
-	mulscc	%o4, %o1, %o4	! 7
-	mulscc	%o4, %o1, %o4	! 8
-	mulscc	%o4, %o1, %o4	! 9
-	mulscc	%o4, %o1, %o4	! 10
-	mulscc	%o4, %o1, %o4	! 11
-	mulscc	%o4, %o1, %o4	! 12
-	mulscc	%o4, %o1, %o4	! 13
-	mulscc	%o4, %o1, %o4	! 14
-	mulscc	%o4, %o1, %o4	! 15
-	mulscc	%o4, %o1, %o4	! 16
-	mulscc	%o4, %o1, %o4	! 17
-	mulscc	%o4, %o1, %o4	! 18
-	mulscc	%o4, %o1, %o4	! 19
-	mulscc	%o4, %o1, %o4	! 20
-	mulscc	%o4, %o1, %o4	! 21
-	mulscc	%o4, %o1, %o4	! 22
-	mulscc	%o4, %o1, %o4	! 23
-	mulscc	%o4, %o1, %o4	! 24
-	mulscc	%o4, %o1, %o4	! 25
-	mulscc	%o4, %o1, %o4	! 26
-	mulscc	%o4, %o1, %o4	! 27
-	mulscc	%o4, %o1, %o4	! 28
-	mulscc	%o4, %o1, %o4	! 29
-	mulscc	%o4, %o1, %o4	! 30
-	mulscc	%o4, %o1, %o4	! 31
-	mulscc	%o4, %o1, %o4	! 32
-	mulscc	%o4, %g0, %o4	! final shift
-
-	/*
-	 * Normally, with the shift-and-add approach, if both numbers are
-	 * positive you get the correct result.  With 32-bit two's-complement
-	 * numbers, -x is represented as
-	 *
-	 *		  x		    32
-	 *	( 2  -  ------ ) mod 2  *  2
-	 *		   32
-	 *		  2
-	 *
-	 * (the `mod 2' subtracts 1 from 1.bbbb).  To avoid lots of 2^32s,
-	 * we can treat this as if the radix point were just to the left
-	 * of the sign bit (multiply by 2^32), and get
-	 *
-	 *	-x  =  (2 - x) mod 2
-	 *
-	 * Then, ignoring the `mod 2's for convenience:
-	 *
-	 *   x *  y	= xy
-	 *  -x *  y	= 2y - xy
-	 *   x * -y	= 2x - xy
-	 *  -x * -y	= 4 - 2x - 2y + xy
-	 *
-	 * For signed multiplies, we subtract (x << 32) from the partial
-	 * product to fix this problem for negative multipliers (see mul.s).
-	 * Because of the way the shift into the partial product is calculated
-	 * (N xor V), this term is automatically removed for the multiplicand,
-	 * so we don't have to adjust.
-	 *
-	 * But for unsigned multiplies, the high order bit wasn't a sign bit,
-	 * and the correction is wrong.  So for unsigned multiplies where the
-	 * high order bit is one, we end up with xy - (y << 32).  To fix it
-	 * we add y << 32.
-	 */
-#if 0
-	tst	%o1
-	bl,a	1f		! if %o1 < 0 (high order bit = 1),
-	 add	%o4, %o0, %o4	! %o4 += %o0 (add y to upper half)
-1:	rd	%y, %o0		! get lower half of product
-	retl
-	 addcc	%o4, %g0, %o1	! put upper half in place and set Z for %o1==0
-#else
-	/* Faster code from tege@sics.se.  */
-	sra	%o1, 31, %o2	! make mask from sign bit
-	and	%o0, %o2, %o2	! %o2 = 0 or %o0, depending on sign of %o1
-	rd	%y, %o0		! get lower half of product
-	retl
-	 addcc	%o4, %o2, %o1	! add compensation and put upper half in place
-#endif
-
-LOC(mul_shortway):
-	/*
-	 * Short multiply.  12 steps, followed by a final shift step.
-	 * The resulting bits are off by 12 and (32-12) = 20 bit positions,
-	 * but there is no problem with %o0 being negative (unlike above),
-	 * and overflow is impossible (the answer is at most 24 bits long).
-	 */
-	mulscc	%o4, %o1, %o4	! 1
-	mulscc	%o4, %o1, %o4	! 2
-	mulscc	%o4, %o1, %o4	! 3
-	mulscc	%o4, %o1, %o4	! 4
-	mulscc	%o4, %o1, %o4	! 5
-	mulscc	%o4, %o1, %o4	! 6
-	mulscc	%o4, %o1, %o4	! 7
-	mulscc	%o4, %o1, %o4	! 8
-	mulscc	%o4, %o1, %o4	! 9
-	mulscc	%o4, %o1, %o4	! 10
-	mulscc	%o4, %o1, %o4	! 11
-	mulscc	%o4, %o1, %o4	! 12
-	mulscc	%o4, %g0, %o4	! final shift
-
-	/*
-	 * %o4 has 20 of the bits that should be in the result; %y has
-	 * the bottom 12 (as %y's top 12).  That is:
-	 *
-	 *	  %o4		    %y
-	 * +----------------+----------------+
-	 * | -12- |   -20-  | -12- |   -20-  |
-	 * +------(---------+------)---------+
-	 *	   -----result-----
-	 *
-	 * The 12 bits of %o4 left of the `result' area are all zero;
-	 * in fact, all top 20 bits of %o4 are zero.
-	 */
-
-	rd	%y, %o5
-	sll	%o4, 12, %o0	! shift middle bits left 12
-	srl	%o5, 20, %o5	! shift low bits right 20
-	or	%o5, %o0, %o0
+	umul	%o0, %o1, %o0
 	retl
-	 addcc	%g0, %g0, %o1	! %o1 = zero, and set Z
+	 rd	%y, %o1
 
 END(.umul)
diff --git a/sysdeps/sparc/sparc32/urem.S b/sysdeps/sparc/sparc32/urem.S
index 83fb4c242e..cc2689d514 100644
--- a/sysdeps/sparc/sparc32/urem.S
+++ b/sysdeps/sparc/sparc32/urem.S
@@ -1,346 +1,18 @@
-   /* This file is generated from divrem.m4; DO NOT EDIT! */
 /*
- * Division and remainder, from Appendix E of the Sparc Version 8
- * Architecture Manual, with fixes from Gordon Irlam.
+ * Sparc v8 has divide.
  */
 
-/*
- * Input: dividend and divisor in %o0 and %o1 respectively.
- *
- * m4 parameters:
- *  .urem	name of function to generate
- *  rem		rem=div => %o0 / %o1; rem=rem => %o0 % %o1
- *  false		false=true => signed; false=false => unsigned
- *
- * Algorithm parameters:
- *  N		how many bits per iteration we try to get (4)
- *  WORDSIZE	total number of bits (32)
- *
- * Derived constants:
- *  TOPBITS	number of bits in the top decade of a number
- *
- * Important variables:
- *  Q		the partial quotient under development (initially 0)
- *  R		the remainder so far, initially the dividend
- *  ITER	number of main division loop iterations required;
- *		equal to ceil(log2(quotient) / N).  Note that this
- *		is the log base (2^N) of the quotient.
- *  V		the current comparand, initially divisor*2^(ITER*N-1)
- *
- * Cost:
- *  Current estimate for non-large dividend is
- *	ceil(log2(quotient) / N) * (10 + 7N/2) + C
- *  A large dividend is one greater than 2^(31-TOPBITS) and takes a
- *  different path, as the upper bits of the quotient must be developed
- *  one bit at a time.
- */
-
-
-
 #include <sysdep.h>
-#include <sys/trap.h>
 
 ENTRY(.urem)
 
-	! Ready to divide.  Compute size of quotient; scale comparand.
-	orcc	%o1, %g0, %o5
-	bne	1f
-	mov	%o0, %o3
-
-		! Divide by zero trap.  If it returns, return 0 (about as
-		! wrong as possible, but that is what SunOS does...).
-		ta	ST_DIV0
-		retl
-		clr	%o0
-
-1:
-	cmp	%o3, %o5			! if %o1 exceeds %o0, done
-	blu	LOC(got_result)		! (and algorithm fails otherwise)
-	clr	%o2
-	sethi	%hi(1 << (32 - 4 - 1)), %g1
-	cmp	%o3, %g1
-	blu	LOC(not_really_big)
-	clr	%o4
-
-	! Here the dividend is >= 2**(31-N) or so.  We must be careful here,
-	! as our usual N-at-a-shot divide step will cause overflow and havoc.
-	! The number of bits in the result here is N*ITER+SC, where SC <= N.
-	! Compute ITER in an unorthodox manner: know we need to shift V into
-	! the top decade: so do not even bother to compare to R.
-	1:
-		cmp	%o5, %g1
-		bgeu	3f
-		mov	1, %g2
-		sll	%o5, 4, %o5
-		b	1b
-		add	%o4, 1, %o4
-
-	! Now compute %g2.
-	2:	addcc	%o5, %o5, %o5
-		bcc	LOC(not_too_big)
-		add	%g2, 1, %g2
-
-		! We get here if the %o1 overflowed while shifting.
-		! This means that %o3 has the high-order bit set.
-		! Restore %o5 and subtract from %o3.
-		sll	%g1, 4, %g1	! high order bit
-		srl	%o5, 1, %o5		! rest of %o5
-		add	%o5, %g1, %o5
-		b	LOC(do_single_div)
-		sub	%g2, 1, %g2
-
-	LOC(not_too_big):
-	3:	cmp	%o5, %o3
-		blu	2b
-		nop
-		be	LOC(do_single_div)
-		nop
-	/* NB: these are commented out in the V8-Sparc manual as well */
-	/* (I do not understand this) */
-	! %o5 > %o3: went too far: back up 1 step
-	!	srl	%o5, 1, %o5
-	!	dec	%g2
-	! do single-bit divide steps
-	!
-	! We have to be careful here.  We know that %o3 >= %o5, so we can do the
-	! first divide step without thinking.  BUT, the others are conditional,
-	! and are only done if %o3 >= 0.  Because both %o3 and %o5 may have the high-
-	! order bit set in the first step, just falling into the regular
-	! division loop will mess up the first time around.
-	! So we unroll slightly...
-	LOC(do_single_div):
-		subcc	%g2, 1, %g2
-		bl	LOC(end_regular_divide)
-		nop
-		sub	%o3, %o5, %o3
-		mov	1, %o2
-		b	LOC(end_single_divloop)
-		nop
-	LOC(single_divloop):
-		sll	%o2, 1, %o2
-		bl	1f
-		srl	%o5, 1, %o5
-		! %o3 >= 0
-		sub	%o3, %o5, %o3
-		b	2f
-		add	%o2, 1, %o2
-	1:	! %o3 < 0
-		add	%o3, %o5, %o3
-		sub	%o2, 1, %o2
-	2:
-	LOC(end_single_divloop):
-		subcc	%g2, 1, %g2
-		bge	LOC(single_divloop)
-		tst	%o3
-		b,a	LOC(end_regular_divide)
-
-LOC(not_really_big):
-1:
-	sll	%o5, 4, %o5
-	cmp	%o5, %o3
-	bleu	1b
-	addcc	%o4, 1, %o4
-	be	LOC(got_result)
-	sub	%o4, 1, %o4
-
-	tst	%o3	! set up for initial iteration
-LOC(divloop):
-	sll	%o2, 4, %o2
-		! depth 1, accumulated bits 0
-	bl	LOC(1.16)
-	srl	%o5,1,%o5
-	! remainder is positive
-	subcc	%o3,%o5,%o3
-			! depth 2, accumulated bits 1
-	bl	LOC(2.17)
-	srl	%o5,1,%o5
-	! remainder is positive
-	subcc	%o3,%o5,%o3
-			! depth 3, accumulated bits 3
-	bl	LOC(3.19)
-	srl	%o5,1,%o5
-	! remainder is positive
-	subcc	%o3,%o5,%o3
-			! depth 4, accumulated bits 7
-	bl	LOC(4.23)
-	srl	%o5,1,%o5
-	! remainder is positive
-	subcc	%o3,%o5,%o3
-		b	9f
-		add	%o2, (7*2+1), %o2
-
-LOC(4.23):
-	! remainder is negative
-	addcc	%o3,%o5,%o3
-		b	9f
-		add	%o2, (7*2-1), %o2
-
-
-LOC(3.19):
-	! remainder is negative
-	addcc	%o3,%o5,%o3
-			! depth 4, accumulated bits 5
-	bl	LOC(4.21)
-	srl	%o5,1,%o5
-	! remainder is positive
-	subcc	%o3,%o5,%o3
-		b	9f
-		add	%o2, (5*2+1), %o2
-
-LOC(4.21):
-	! remainder is negative
-	addcc	%o3,%o5,%o3
-		b	9f
-		add	%o2, (5*2-1), %o2
-
-
-
-LOC(2.17):
-	! remainder is negative
-	addcc	%o3,%o5,%o3
-			! depth 3, accumulated bits 1
-	bl	LOC(3.17)
-	srl	%o5,1,%o5
-	! remainder is positive
-	subcc	%o3,%o5,%o3
-			! depth 4, accumulated bits 3
-	bl	LOC(4.19)
-	srl	%o5,1,%o5
-	! remainder is positive
-	subcc	%o3,%o5,%o3
-		b	9f
-		add	%o2, (3*2+1), %o2
-
-LOC(4.19):
-	! remainder is negative
-	addcc	%o3,%o5,%o3
-		b	9f
-		add	%o2, (3*2-1), %o2
-
-
-LOC(3.17):
-	! remainder is negative
-	addcc	%o3,%o5,%o3
-			! depth 4, accumulated bits 1
-	bl	LOC(4.17)
-	srl	%o5,1,%o5
-	! remainder is positive
-	subcc	%o3,%o5,%o3
-		b	9f
-		add	%o2, (1*2+1), %o2
-
-LOC(4.17):
-	! remainder is negative
-	addcc	%o3,%o5,%o3
-		b	9f
-		add	%o2, (1*2-1), %o2
-
-
-
-
-LOC(1.16):
-	! remainder is negative
-	addcc	%o3,%o5,%o3
-			! depth 2, accumulated bits -1
-	bl	LOC(2.15)
-	srl	%o5,1,%o5
-	! remainder is positive
-	subcc	%o3,%o5,%o3
-			! depth 3, accumulated bits -1
-	bl	LOC(3.15)
-	srl	%o5,1,%o5
-	! remainder is positive
-	subcc	%o3,%o5,%o3
-			! depth 4, accumulated bits -1
-	bl	LOC(4.15)
-	srl	%o5,1,%o5
-	! remainder is positive
-	subcc	%o3,%o5,%o3
-		b	9f
-		add	%o2, (-1*2+1), %o2
-
-LOC(4.15):
-	! remainder is negative
-	addcc	%o3,%o5,%o3
-		b	9f
-		add	%o2, (-1*2-1), %o2
-
-
-LOC(3.15):
-	! remainder is negative
-	addcc	%o3,%o5,%o3
-			! depth 4, accumulated bits -3
-	bl	LOC(4.13)
-	srl	%o5,1,%o5
-	! remainder is positive
-	subcc	%o3,%o5,%o3
-		b	9f
-		add	%o2, (-3*2+1), %o2
-
-LOC(4.13):
-	! remainder is negative
-	addcc	%o3,%o5,%o3
-		b	9f
-		add	%o2, (-3*2-1), %o2
-
-
-
-LOC(2.15):
-	! remainder is negative
-	addcc	%o3,%o5,%o3
-			! depth 3, accumulated bits -3
-	bl	LOC(3.13)
-	srl	%o5,1,%o5
-	! remainder is positive
-	subcc	%o3,%o5,%o3
-			! depth 4, accumulated bits -5
-	bl	LOC(4.11)
-	srl	%o5,1,%o5
-	! remainder is positive
-	subcc	%o3,%o5,%o3
-		b	9f
-		add	%o2, (-5*2+1), %o2
-
-LOC(4.11):
-	! remainder is negative
-	addcc	%o3,%o5,%o3
-		b	9f
-		add	%o2, (-5*2-1), %o2
-
-
-LOC(3.13):
-	! remainder is negative
-	addcc	%o3,%o5,%o3
-			! depth 4, accumulated bits -7
-	bl	LOC(4.9)
-	srl	%o5,1,%o5
-	! remainder is positive
-	subcc	%o3,%o5,%o3
-		b	9f
-		add	%o2, (-7*2+1), %o2
-
-LOC(4.9):
-	! remainder is negative
-	addcc	%o3,%o5,%o3
-		b	9f
-		add	%o2, (-7*2-1), %o2
-
-
-
-
-9:
-LOC(end_regular_divide):
-	subcc	%o4, 1, %o4
-	bge	LOC(divloop)
-	tst	%o3
-	bl,a	LOC(got_result)
-	! non-restoring fixup here (one instruction only!)
-	add	%o3, %o1, %o3
-
-
-LOC(got_result):
-
+	wr	%g0, 0, %y
+	nop
+	nop
+	nop
+	udiv	%o0, %o1, %o2
+	umul	%o2, %o1, %o2
 	retl
-	mov %o3, %o0
+	 sub	%o0, %o2, %o0
 
 END(.urem)

From patchwork Wed Nov 13 19:28:23 2019
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Adhemerval Zanella Netto <adhemerval.zanella@linaro.org>
X-Patchwork-Id: 179344
Delivered-To: patch@linaro.org
Received: by 2002:a92:38d5:0:0:0:0:0 with SMTP id g82csp10039477ilf;
 Wed, 13 Nov 2019 11:28:55 -0800 (PST)
X-Google-Smtp-Source: APXvYqyJ8drMW/LyobPjkqr83L7KPs6/QblpdGykm21izE7h9aa9i8K3U8i63Xlx32XR8XPx5MAB
X-Received: by 2002:a17:906:c789:: with SMTP id
 cw9mr4652520ejb.40.1573673335805; 
 Wed, 13 Nov 2019 11:28:55 -0800 (PST)
ARC-Seal: i=1; a=rsa-sha256; t=1573673335; cv=none;
 d=google.com; s=arc-20160816;
 b=H/JWi0IaNsAG6XkQvbJUHJF1VQ0xw9FXOxl6O9dQDfZ7OYMUIJNMUW486nMIO0YDMc
 OlLOeweenLwjeDKNP1gSfoqP+WW+/O2/lNTrrhbWBy4gJlnjUlU2BFPaqY1xGchmhMZv
 94wuw5+vadGsXJXyOqBBMOE0wF4yPrcFu15WqndVMsLnwM+Wc0vJGnqfq+SuqjjIPV8/
 Mn78bAV+WltfsPym6eJaTKOrP2gHEtlQbrMlxbYqPCeEjfFp8d2hUyIFduvgMeNVy132
 RdBtXqgIc0xYtEDssKv6A7ff2dA1h7+L1S11HezLU4ooU7Lw5ZS4D3ezYPZ0TLWb+YwY
 clQA==
ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com;
 s=arc-20160816; 
 h=references:in-reply-to:message-id:date:subject:cc:to:from
 :dkim-signature:delivered-to:sender:list-help:list-post:list-archive
 :list-subscribe:list-unsubscribe:list-id:precedence:mailing-list
 :dkim-signature:domainkey-signature;
 bh=6hg/HdEH5M/gPaDvvOvkXmdVMVRmqSeaxBmnNcnELfg=;
 b=aeO0Y5hwjg+rcfaFds7Lg51suA7i030YN7eW/m6glgaoi5UwUl7CAfox6JJbWdcUcS
 JBTrsAJALgKbhxEcJ4LdYBbv8yUcQq1EOuKlz+v9LZXcXSEdHCNgIdCT2uLDFdAMl8pg
 Qx1+4YOr7G6EJ7i0tSrGjqC3aFANrbtwUfaHOcioHzZZ9pQFF59X3VlIVFOU9aqyPyQX
 hi62ambAZMId8CjiL3eWXssZ1oMIzfM/4MNZ+v7wQWB6OptUSuyTmg2Tp4DOBfh8NkpS
 nfReMJ4fthWWnPgH2aZck6HdrbUUVq8SwRT9VVkzwQ2iRqfAhEyqsqBnaMSEBEOOTGpt
 hVrQ==
ARC-Authentication-Results: i=1; mx.google.com;
 dkim=pass header.i=@sourceware.org header.s=default header.b=SLcXsWcK;
 dkim=pass header.i=@linaro.org header.s=google header.b=D5cGI5Q3;
 spf=pass (google.com: domain of
 libc-alpha-return-107056-patch=linaro.org@sourceware.org
 designates 209.132.180.131 as permitted sender)
 smtp.mailfrom="libc-alpha-return-107056-patch=linaro.org@sourceware.org";
 dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=linaro.org
Return-Path: <libc-alpha-return-107056-patch=linaro.org@sourceware.org>
Received: from sourceware.org (server1.sourceware.org. [209.132.180.131])
 by mx.google.com with ESMTPS id
 y16si1811643ejw.142.2019.11.13.11.28.54 for <patch@linaro.org>
 (version=TLS1_2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128);
 Wed, 13 Nov 2019 11:28:55 -0800 (PST)
Received-SPF: pass (google.com: domain of
 libc-alpha-return-107056-patch=linaro.org@sourceware.org
 designates 209.132.180.131 as permitted sender)
 client-ip=209.132.180.131; 
Authentication-Results: mx.google.com;
 dkim=pass header.i=@sourceware.org header.s=default header.b=SLcXsWcK;
 dkim=pass header.i=@linaro.org header.s=google header.b=D5cGI5Q3;
 spf=pass (google.com: domain of
 libc-alpha-return-107056-patch=linaro.org@sourceware.org
 designates 209.132.180.131 as permitted sender)
 smtp.mailfrom="libc-alpha-return-107056-patch=linaro.org@sourceware.org";
 dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=linaro.org
DomainKey-Signature: a=rsa-sha1; c=nofws; d=sourceware.org; h=list-id
 :list-unsubscribe:list-subscribe:list-archive:list-post
 :list-help:sender:from:to:cc:subject:date:message-id:in-reply-to
 :references; q=dns; s=default; b=R8HFk724+r10uIebdVhNpZxoK28f8kY
 xoi7roHtGBNFAB6eu1tBU9Xi4zFlLvkA7t0duuFNOHi1tWRXZs92K1qh70Cc2SLw
 suCeIGHD5eOhsjfr583YQwk3wlx+VwZtJDzzkaOnuotPKUEZBT8SioMzVzODPw/o
 eGVpyRd1AXVc=
DKIM-Signature: v=1; a=rsa-sha1; c=relaxed; d=sourceware.org; h=list-id
 :list-unsubscribe:list-subscribe:list-archive:list-post
 :list-help:sender:from:to:cc:subject:date:message-id:in-reply-to
 :references; s=default; bh=disERR676mm+xAcJZdyrU1lK3b0=; b=SLcXs
 WcKolp4WHorb+J20bDQA9omXqHccEGmo4MdrfD74xQCNuyKWuREjZgU+3D6ovNE2
 3/w3wF+qNCaJD1cMfcCX7AOBN3d9jDgfRZ0SVWkBxNlrLwlFMFtcthgtrdf4hz7T
 PHPRwiV+fFwkXwCywG1ZebdlFSkx/72RND6nio=
Received: (qmail 71304 invoked by alias); 13 Nov 2019 19:28:42 -0000
Mailing-List: contact libc-alpha-help@sourceware.org; run by ezmlm
Precedence: bulk
List-Id: <libc-alpha.sourceware.org>
List-Unsubscribe: <mailto:libc-alpha-unsubscribe-patch=linaro.org@sourceware.org>
List-Subscribe: <mailto:libc-alpha-subscribe@sourceware.org>
List-Archive: <http://sourceware.org/ml/libc-alpha/>
List-Post: <mailto:libc-alpha@sourceware.org>
List-Help: <mailto:libc-alpha-help@sourceware.org>,
 <http://sourceware.org/ml/#faqs>
Sender: libc-alpha-owner@sourceware.org
Delivered-To: mailing list libc-alpha@sourceware.org
Received: (qmail 71294 invoked by uid 89); 13 Nov 2019 19:28:41 -0000
Authentication-Results: sourceware.org; auth=none
X-Spam-SWARE-Status: No, score=-21.6 required=5.0 tests=AWL, BAYES_00,
 GIT_PATCH_0, GIT_PATCH_1, GIT_PATCH_2, GIT_PATCH_3, KAM_SHORT,
 SPF_PASS autolearn=ham version=3.3.1 spammy=H*MI:sk:a9e59b0,
 H*f:sk:a9e59b0, HX-HELO:sk:mail-qv, sk:sparc64
X-HELO: mail-qv1-f54.google.com
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; 
 h=from:to:cc:subject:date:message-id:in-reply-to:references;
 bh=6hg/HdEH5M/gPaDvvOvkXmdVMVRmqSeaxBmnNcnELfg=;
 b=D5cGI5Q3nq8d+oZOL6RTAw+w112/y1iXaqhBSEMr81852vkx6LKHXckqO9AFyOFuyF
 sBat5DZ8kD1Cvz5HOdRrU8PqsuezjDFaoizGioZLZ1H2JXnJE+lhWIXZo68mrYqRjRXR
 VvF+p0PzVP99tEpkuQxmWz3nHXMV43uSB5WtjxvoWsXXeUf4cXlHe1tUJoRYdmTixXxw
 10apNFkUuTNJF/48m+QUFdwtEPZseMddF6XPQUOFvlI80q0i/E2JmGBLm6dBmeKS/PHi
 kcLhSONFOKGrbQpJL0PE3FiYH+5UT51DTCalzIC8wYOxl5Hq/Fnhh9RR48Vl6AtyFWg6
 P0HQ==
Return-Path: <adhemerval.zanella@linaro.org>
From: Adhemerval Zanella <adhemerval.zanella@linaro.org>
To: libc-alpha@sourceware.org
Cc: Romain Naour <romain.naour@smile.fr>, Andreas Larsson <andreas@gaisler.com>
Subject: [PATCH 2/2] sparc: Use atomic compiler builtins on sparc
Date: Wed, 13 Nov 2019 16:28:23 -0300
Message-Id: <20191113192823.17109-2-adhemerval.zanella@linaro.org>
In-Reply-To: <20191113192823.17109-1-adhemerval.zanella@linaro.org>
References: <a9e59b02-66e4-e81b-e6ec-90b9cc99b045@linaro.org>
 <20191113192823.17109-1-adhemerval.zanella@linaro.org>

This patch removes the arch-specific atomic instruction, relying on
compiler builtins.  The __sparc32_atomic_locks support is removed
and a configure check is added to check if compiler uses libatomic
to implement CAS.

It also removes the sparc specific sem_* and pthread_barrier_*
implementations.  It in turn allows buidling against a LEON3/LEON4
sparcv8 target, although it will still be incompatible with generic
sparcv9.

Checked on sparcv9-linux-gnu and sparc64-linux-gnu.  I also checked
with build against sparcv8-linux-gnu with -mcpu=leon3.
---
 sysdeps/sparc/Makefile                        |   9 +
 sysdeps/sparc/atomic-machine.h                |  96 +++++
 sysdeps/sparc/{sparc64 => }/cpu_relax.c       |   4 +-
 sysdeps/sparc/sparc32/atomic-machine.h        | 363 ------------------
 sysdeps/sparc/sparc32/configure               |  35 ++
 sysdeps/sparc/sparc32/configure.ac            |  24 ++
 sysdeps/sparc/sparc32/lowlevellock.c          |  53 ---
 sysdeps/sparc/sparc32/pthread_barrier_wait.c  |   1 -
 sysdeps/sparc/sparc32/sem_post.c              |  82 ----
 sysdeps/sparc/sparc32/sem_waitcommon.c        | 146 -------
 sysdeps/sparc/sparc32/sparcv9/Makefile        |   9 -
 .../sparc/sparc32/sparcv9/atomic-machine.h    | 108 ------
 sysdeps/sparc/sparc32/sparcv9/cpu_relax.c     |   1 -
 sysdeps/sparc/sparc64/Makefile                |   9 -
 sysdeps/sparc/sparc64/atomic-machine.h        | 129 -------
 sysdeps/unix/sysv/linux/sparc/lowlevellock.h  | 130 -------
 16 files changed, 167 insertions(+), 1032 deletions(-)
 create mode 100644 sysdeps/sparc/atomic-machine.h
 rename sysdeps/sparc/{sparc64 => }/cpu_relax.c (92%)
 delete mode 100644 sysdeps/sparc/sparc32/atomic-machine.h
 delete mode 100644 sysdeps/sparc/sparc32/lowlevellock.c
 delete mode 100644 sysdeps/sparc/sparc32/pthread_barrier_wait.c
 delete mode 100644 sysdeps/sparc/sparc32/sem_post.c
 delete mode 100644 sysdeps/sparc/sparc32/sem_waitcommon.c
 delete mode 100644 sysdeps/sparc/sparc32/sparcv9/atomic-machine.h
 delete mode 100644 sysdeps/sparc/sparc32/sparcv9/cpu_relax.c
 delete mode 100644 sysdeps/sparc/sparc64/atomic-machine.h
 delete mode 100644 sysdeps/unix/sysv/linux/sparc/lowlevellock.h

-- 
2.17.1
Tested-by: Andreas Larsson <andreas@gaisler.com>

diff --git a/sysdeps/sparc/Makefile b/sysdeps/sparc/Makefile
index 3f0c096400..38b33af6e0 100644
--- a/sysdeps/sparc/Makefile
+++ b/sysdeps/sparc/Makefile
@@ -16,5 +16,14 @@ CPPFLAGS-crti.S += -fPIC
 CPPFLAGS-crtn.S += -fPIC
 endif
 
+# nscd uses atomic_spin_nop which in turn requires cpu_relax
+ifeq ($(subdir),nscd)
+routines += cpu_relax
+endif
+
+ifeq ($(subdir), nptl)
+libpthread-routines += cpu_relax
+endif
+
 # The assembler on SPARC needs the -fPIC flag even when it's assembler code.
 ASFLAGS-.os += -fPIC
diff --git a/sysdeps/sparc/atomic-machine.h b/sysdeps/sparc/atomic-machine.h
new file mode 100644
index 0000000000..e0bb9b61b9
--- /dev/null
+++ b/sysdeps/sparc/atomic-machine.h
@@ -0,0 +1,96 @@
+/* Atomic operations.  sparc32 version.
+   Copyright (C) 2003-2019 Free Software Foundation, Inc.
+   This file is part of the GNU C Library.
+   Contributed by Jakub Jelinek <jakub@redhat.com>, 2003.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, see
+   <https://www.gnu.org/licenses/>.  */
+
+#ifndef _ATOMIC_MACHINE_H
+#define _ATOMIC_MACHINE_H	1
+
+#include <stdint.h>
+
+typedef int8_t atomic8_t;
+typedef uint8_t uatomic8_t;
+typedef int_fast8_t atomic_fast8_t;
+typedef uint_fast8_t uatomic_fast8_t;
+
+typedef int16_t atomic16_t;
+typedef uint16_t uatomic16_t;
+typedef int_fast16_t atomic_fast16_t;
+typedef uint_fast16_t uatomic_fast16_t;
+
+typedef int32_t atomic32_t;
+typedef uint32_t uatomic32_t;
+typedef int_fast32_t atomic_fast32_t;
+typedef uint_fast32_t uatomic_fast32_t;
+
+typedef int64_t atomic64_t;
+typedef uint64_t uatomic64_t;
+typedef int_fast64_t atomic_fast64_t;
+typedef uint_fast64_t uatomic_fast64_t;
+
+typedef intptr_t atomicptr_t;
+typedef uintptr_t uatomicptr_t;
+typedef intmax_t atomic_max_t;
+typedef uintmax_t uatomic_max_t;
+
+#ifdef __arch64__
+# define __HAVE_64B_ATOMICS          1
+#else
+# define __HAVE_64B_ATOMICS          0
+#endif
+#define USE_ATOMIC_COMPILER_BUILTINS 1
+
+/* XXX Is this actually correct?  */
+#define ATOMIC_EXCHANGE_USES_CAS     __HAVE_64B_ATOMICS
+
+/* Compare and exchange.
+   For all "bool" routines, we return FALSE if exchange succesful.  */
+
+#define __arch_compare_and_exchange_val_int(mem, newval, oldval, model) \
+  ({									\
+    typeof (*mem) __oldval = (oldval);					\
+    __atomic_compare_exchange_n (mem, (void *) &__oldval, newval, 0,	\
+				 model, __ATOMIC_RELAXED);		\
+    __oldval;								\
+  })
+
+#define atomic_compare_and_exchange_val_acq(mem, new, old)		      \
+  ({									      \
+    __typeof ((__typeof (*(mem))) *(mem)) __result;			      \
+    if (sizeof (*mem) == 4						      \
+        || (__HAVE_64B_ATOMICS && sizeof (*mem) == 8))			      \
+      __result = __arch_compare_and_exchange_val_int (mem, new, old,	      \
+							 __ATOMIC_ACQUIRE);   \
+    else								      \
+      abort ();								      \
+    __result;								      \
+  })
+
+#ifdef __sparc_v9__
+# define atomic_full_barrier() \
+  __asm __volatile ("membar #LoadLoad | #LoadStore"			      \
+		    " | #StoreLoad | #StoreStore" : : : "memory")
+# define atomic_read_barrier() \
+  __asm __volatile ("membar #LoadLoad | #LoadStore" : : : "memory")
+# define atomic_write_barrier() \
+  __asm __volatile ("membar #LoadStore | #StoreStore" : : : "memory")
+
+extern void __cpu_relax (void);
+# define atomic_spin_nop() __cpu_relax ()
+#endif
+
+#endif /* _ATOMIC_MACHINE_H  */
diff --git a/sysdeps/sparc/sparc64/cpu_relax.c b/sysdeps/sparc/cpu_relax.c
similarity index 92%
rename from sysdeps/sparc/sparc64/cpu_relax.c
rename to sysdeps/sparc/cpu_relax.c
index b86afdec2c..973dfb5935 100644
--- a/sysdeps/sparc/sparc64/cpu_relax.c
+++ b/sysdeps/sparc/cpu_relax.c
@@ -1,4 +1,4 @@
-/* CPU strand yielding for busy loops.  Linux/sparc64 version.
+/* CPU strand yielding for busy loops.  Linux/sparc version.
    Copyright (C) 2017-2019 Free Software Foundation, Inc.
    This file is part of the GNU C Library.
 
@@ -18,6 +18,7 @@
 
 #include <sparc-ifunc.h>
 
+#ifdef __sparc_v9__
 static void
 __cpu_relax_generic (void)
 {
@@ -36,3 +37,4 @@ sparc_libc_ifunc (__cpu_relax,
 		  hwcap & HWCAP_SPARC_PAUSE
 		  ? __cpu_relax_pause
 		  : __cpu_relax_generic)
+#endif
diff --git a/sysdeps/sparc/sparc32/atomic-machine.h b/sysdeps/sparc/sparc32/atomic-machine.h
deleted file mode 100644
index d82a4f4ecc..0000000000
--- a/sysdeps/sparc/sparc32/atomic-machine.h
+++ /dev/null
@@ -1,363 +0,0 @@
-/* Atomic operations.  sparc32 version.
-   Copyright (C) 2003-2019 Free Software Foundation, Inc.
-   This file is part of the GNU C Library.
-   Contributed by Jakub Jelinek <jakub@redhat.com>, 2003.
-
-   The GNU C Library is free software; you can redistribute it and/or
-   modify it under the terms of the GNU Lesser General Public
-   License as published by the Free Software Foundation; either
-   version 2.1 of the License, or (at your option) any later version.
-
-   The GNU C Library is distributed in the hope that it will be useful,
-   but WITHOUT ANY WARRANTY; without even the implied warranty of
-   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
-   Lesser General Public License for more details.
-
-   You should have received a copy of the GNU Lesser General Public
-   License along with the GNU C Library; if not, see
-   <https://www.gnu.org/licenses/>.  */
-
-#ifndef _ATOMIC_MACHINE_H
-#define _ATOMIC_MACHINE_H	1
-
-#include <stdint.h>
-
-typedef int8_t atomic8_t;
-typedef uint8_t uatomic8_t;
-typedef int_fast8_t atomic_fast8_t;
-typedef uint_fast8_t uatomic_fast8_t;
-
-typedef int16_t atomic16_t;
-typedef uint16_t uatomic16_t;
-typedef int_fast16_t atomic_fast16_t;
-typedef uint_fast16_t uatomic_fast16_t;
-
-typedef int32_t atomic32_t;
-typedef uint32_t uatomic32_t;
-typedef int_fast32_t atomic_fast32_t;
-typedef uint_fast32_t uatomic_fast32_t;
-
-typedef int64_t atomic64_t;
-typedef uint64_t uatomic64_t;
-typedef int_fast64_t atomic_fast64_t;
-typedef uint_fast64_t uatomic_fast64_t;
-
-typedef intptr_t atomicptr_t;
-typedef uintptr_t uatomicptr_t;
-typedef intmax_t atomic_max_t;
-typedef uintmax_t uatomic_max_t;
-
-#define __HAVE_64B_ATOMICS 0
-#define USE_ATOMIC_COMPILER_BUILTINS 0
-
-/* XXX Is this actually correct?  */
-#define ATOMIC_EXCHANGE_USES_CAS 1
-
-
-/* We have no compare and swap, just test and set.
-   The following implementation contends on 64 global locks
-   per library and assumes no variable will be accessed using atomic.h
-   macros from two different libraries.  */
-
-__make_section_unallocated
-  (".gnu.linkonce.b.__sparc32_atomic_locks, \"aw\", %nobits");
-
-volatile unsigned char __sparc32_atomic_locks[64]
-  __attribute__ ((nocommon, section (".gnu.linkonce.b.__sparc32_atomic_locks"
-				     __sec_comment),
-		  visibility ("hidden")));
-
-#define __sparc32_atomic_do_lock(addr) \
-  do								      \
-    {								      \
-      unsigned int __old_lock;					      \
-      unsigned int __idx = (((long) addr >> 2) ^ ((long) addr >> 12)) \
-			   & 63;				      \
-      do							      \
-	__asm __volatile ("ldstub %1, %0"			      \
-			  : "=r" (__old_lock),			      \
-			    "=m" (__sparc32_atomic_locks[__idx])      \
-			  : "m" (__sparc32_atomic_locks[__idx])	      \
-			  : "memory");				      \
-      while (__old_lock);					      \
-    }								      \
-  while (0)
-
-#define __sparc32_atomic_do_unlock(addr) \
-  do								      \
-    {								      \
-      __sparc32_atomic_locks[(((long) addr >> 2)		      \
-			      ^ ((long) addr >> 12)) & 63] = 0;	      \
-      __asm __volatile ("" ::: "memory");			      \
-    }								      \
-  while (0)
-
-#define __sparc32_atomic_do_lock24(addr) \
-  do								      \
-    {								      \
-      unsigned int __old_lock;					      \
-      do							      \
-	__asm __volatile ("ldstub %1, %0"			      \
-			  : "=r" (__old_lock), "=m" (*(addr))	      \
-			  : "m" (*(addr))			      \
-			  : "memory");				      \
-      while (__old_lock);					      \
-    }								      \
-  while (0)
-
-#define __sparc32_atomic_do_unlock24(addr) \
-  do								      \
-    {								      \
-      __asm __volatile ("" ::: "memory");			      \
-      *(char *) (addr) = 0;					      \
-    }								      \
-  while (0)
-
-
-#ifndef SHARED
-# define __v9_compare_and_exchange_val_32_acq(mem, newval, oldval) \
-({union { __typeof (oldval) a; uint32_t v; } oldval_arg = { .a = (oldval) };  \
-  union { __typeof (newval) a; uint32_t v; } newval_arg = { .a = (newval) };  \
-  register uint32_t __acev_tmp __asm ("%g6");			              \
-  register __typeof (mem) __acev_mem __asm ("%g1") = (mem);		      \
-  register uint32_t __acev_oldval __asm ("%g5");		              \
-  __acev_tmp = newval_arg.v;						      \
-  __acev_oldval = oldval_arg.v;						      \
-  /* .word 0xcde05005 is cas [%g1], %g5, %g6.  Can't use cas here though,     \
-     because as will then mark the object file as V8+ arch.  */		      \
-  __asm __volatile (".word 0xcde05005"					      \
-		    : "+r" (__acev_tmp), "=m" (*__acev_mem)		      \
-		    : "r" (__acev_oldval), "m" (*__acev_mem),		      \
-		      "r" (__acev_mem) : "memory");			      \
-  (__typeof (oldval)) __acev_tmp; })
-#endif
-
-/* The only basic operation needed is compare and exchange.  */
-#define __v7_compare_and_exchange_val_acq(mem, newval, oldval) \
-  ({ __typeof (mem) __acev_memp = (mem);			      \
-     __typeof (*mem) __acev_ret;				      \
-     __typeof (*mem) __acev_newval = (newval);			      \
-								      \
-     __sparc32_atomic_do_lock (__acev_memp);			      \
-     __acev_ret = *__acev_memp;					      \
-     if (__acev_ret == (oldval))				      \
-       *__acev_memp = __acev_newval;				      \
-     __sparc32_atomic_do_unlock (__acev_memp);			      \
-     __acev_ret; })
-
-#define __v7_compare_and_exchange_bool_acq(mem, newval, oldval) \
-  ({ __typeof (mem) __aceb_memp = (mem);			      \
-     int __aceb_ret;						      \
-     __typeof (*mem) __aceb_newval = (newval);			      \
-								      \
-     __sparc32_atomic_do_lock (__aceb_memp);			      \
-     __aceb_ret = 0;						      \
-     if (*__aceb_memp == (oldval))				      \
-       *__aceb_memp = __aceb_newval;				      \
-     else							      \
-       __aceb_ret = 1;						      \
-     __sparc32_atomic_do_unlock (__aceb_memp);			      \
-     __aceb_ret; })
-
-#define __v7_exchange_acq(mem, newval) \
-  ({ __typeof (mem) __acev_memp = (mem);			      \
-     __typeof (*mem) __acev_ret;				      \
-     __typeof (*mem) __acev_newval = (newval);			      \
-								      \
-     __sparc32_atomic_do_lock (__acev_memp);			      \
-     __acev_ret = *__acev_memp;					      \
-     *__acev_memp = __acev_newval;				      \
-     __sparc32_atomic_do_unlock (__acev_memp);			      \
-     __acev_ret; })
-
-#define __v7_exchange_and_add(mem, value) \
-  ({ __typeof (mem) __acev_memp = (mem);			      \
-     __typeof (*mem) __acev_ret;				      \
-								      \
-     __sparc32_atomic_do_lock (__acev_memp);			      \
-     __acev_ret = *__acev_memp;					      \
-     *__acev_memp = __acev_ret + (value);			      \
-     __sparc32_atomic_do_unlock (__acev_memp);			      \
-     __acev_ret; })
-
-/* Special versions, which guarantee that top 8 bits of all values
-   are cleared and use those bits as the ldstub lock.  */
-#define __v7_compare_and_exchange_val_24_acq(mem, newval, oldval) \
-  ({ __typeof (mem) __acev_memp = (mem);			      \
-     __typeof (*mem) __acev_ret;				      \
-     __typeof (*mem) __acev_newval = (newval);			      \
-								      \
-     __sparc32_atomic_do_lock24 (__acev_memp);			      \
-     __acev_ret = *__acev_memp & 0xffffff;			      \
-     if (__acev_ret == (oldval))				      \
-       *__acev_memp = __acev_newval;				      \
-     else							      \
-       __sparc32_atomic_do_unlock24 (__acev_memp);		      \
-     __asm __volatile ("" ::: "memory");			      \
-     __acev_ret; })
-
-#define __v7_exchange_24_rel(mem, newval) \
-  ({ __typeof (mem) __acev_memp = (mem);			      \
-     __typeof (*mem) __acev_ret;				      \
-     __typeof (*mem) __acev_newval = (newval);			      \
-								      \
-     __sparc32_atomic_do_lock24 (__acev_memp);			      \
-     __acev_ret = *__acev_memp & 0xffffff;			      \
-     *__acev_memp = __acev_newval;				      \
-     __asm __volatile ("" ::: "memory");			      \
-     __acev_ret; })
-
-#ifdef SHARED
-
-/* When dynamically linked, we assume pre-v9 libraries are only ever
-   used on pre-v9 CPU.  */
-# define __atomic_is_v9 0
-
-# define atomic_compare_and_exchange_val_acq(mem, newval, oldval) \
-  __v7_compare_and_exchange_val_acq (mem, newval, oldval)
-
-# define atomic_compare_and_exchange_bool_acq(mem, newval, oldval) \
-  __v7_compare_and_exchange_bool_acq (mem, newval, oldval)
-
-# define atomic_exchange_acq(mem, newval) \
-  __v7_exchange_acq (mem, newval)
-
-# define atomic_exchange_and_add(mem, value) \
-  __v7_exchange_and_add (mem, value)
-
-# define atomic_compare_and_exchange_val_24_acq(mem, newval, oldval) \
-  ({								      \
-     if (sizeof (*mem) != 4)					      \
-       abort ();						      \
-     __v7_compare_and_exchange_val_24_acq (mem, newval, oldval); })
-
-# define atomic_exchange_24_rel(mem, newval) \
-  ({								      \
-     if (sizeof (*mem) != 4)					      \
-       abort ();						      \
-     __v7_exchange_24_rel (mem, newval); })
-
-# define atomic_full_barrier() __asm ("" ::: "memory")
-# define atomic_read_barrier() atomic_full_barrier ()
-# define atomic_write_barrier() atomic_full_barrier ()
-
-#else
-
-/* In libc.a/libpthread.a etc. we don't know if we'll be run on
-   pre-v9 or v9 CPU.  To be interoperable with dynamically linked
-   apps on v9 CPUs e.g. with process shared primitives, use cas insn
-   on v9 CPUs and ldstub on pre-v9.  */
-
-extern uint64_t _dl_hwcap __attribute__((weak));
-# define __atomic_is_v9 \
-  (__builtin_expect (&_dl_hwcap != 0, 1) \
-   && __builtin_expect (_dl_hwcap & HWCAP_SPARC_V9, HWCAP_SPARC_V9))
-
-# define atomic_compare_and_exchange_val_acq(mem, newval, oldval) \
-  ({								      \
-     __typeof (*mem) __acev_wret;				      \
-     if (sizeof (*mem) != 4)					      \
-       abort ();						      \
-     if (__atomic_is_v9)					      \
-       __acev_wret						      \
-	 = __v9_compare_and_exchange_val_32_acq (mem, newval, oldval);\
-     else							      \
-       __acev_wret						      \
-	 = __v7_compare_and_exchange_val_acq (mem, newval, oldval);   \
-     __acev_wret; })
-
-# define atomic_compare_and_exchange_bool_acq(mem, newval, oldval) \
-  ({								      \
-     int __acev_wret;						      \
-     if (sizeof (*mem) != 4)					      \
-       abort ();						      \
-     if (__atomic_is_v9)					      \
-       {							      \
-	 __typeof (oldval) __acev_woldval = (oldval);		      \
-	 __acev_wret						      \
-	   = __v9_compare_and_exchange_val_32_acq (mem, newval,	      \
-						   __acev_woldval)    \
-	     != __acev_woldval;					      \
-       }							      \
-     else							      \
-       __acev_wret						      \
-	 = __v7_compare_and_exchange_bool_acq (mem, newval, oldval);  \
-     __acev_wret; })
-
-# define atomic_exchange_rel(mem, newval) \
-  ({								      \
-     __typeof (*mem) __acev_wret;				      \
-     if (sizeof (*mem) != 4)					      \
-       abort ();						      \
-     if (__atomic_is_v9)					      \
-       {							      \
-	 __typeof (mem) __acev_wmemp = (mem);			      \
-	 __typeof (*(mem)) __acev_wval = (newval);		      \
-	 do							      \
-	   __acev_wret = *__acev_wmemp;				      \
-	 while (__builtin_expect				      \
-		  (__v9_compare_and_exchange_val_32_acq (__acev_wmemp,\
-							 __acev_wval, \
-							 __acev_wret) \
-		   != __acev_wret, 0));				      \
-       }							      \
-     else							      \
-       __acev_wret = __v7_exchange_acq (mem, newval);		      \
-     __acev_wret; })
-
-# define atomic_compare_and_exchange_val_24_acq(mem, newval, oldval) \
-  ({								      \
-     __typeof (*mem) __acev_wret;				      \
-     if (sizeof (*mem) != 4)					      \
-       abort ();						      \
-     if (__atomic_is_v9)					      \
-       __acev_wret						      \
-	 = __v9_compare_and_exchange_val_32_acq (mem, newval, oldval);\
-     else							      \
-       __acev_wret						      \
-	 = __v7_compare_and_exchange_val_24_acq (mem, newval, oldval);\
-     __acev_wret; })
-
-# define atomic_exchange_24_rel(mem, newval) \
-  ({								      \
-     __typeof (*mem) __acev_w24ret;				      \
-     if (sizeof (*mem) != 4)					      \
-       abort ();						      \
-     if (__atomic_is_v9)					      \
-       __acev_w24ret = atomic_exchange_rel (mem, newval);	      \
-     else							      \
-       __acev_w24ret = __v7_exchange_24_rel (mem, newval);	      \
-     __acev_w24ret; })
-
-#define atomic_full_barrier()						\
-  do {									\
-     if (__atomic_is_v9)						\
-       /* membar #LoadLoad | #LoadStore | #StoreLoad | #StoreStore */	\
-       __asm __volatile (".word 0x8143e00f" : : : "memory");		\
-     else								\
-       __asm __volatile ("" : : : "memory");				\
-  } while (0)
-
-#define atomic_read_barrier()						\
-  do {									\
-     if (__atomic_is_v9)						\
-       /* membar #LoadLoad | #LoadStore */				\
-       __asm __volatile (".word 0x8143e005" : : : "memory");		\
-     else								\
-       __asm __volatile ("" : : : "memory");				\
-  } while (0)
-
-#define atomic_write_barrier()						\
-  do {									\
-     if (__atomic_is_v9)						\
-       /* membar  #LoadStore | #StoreStore */				\
-       __asm __volatile (".word 0x8143e00c" : : : "memory");		\
-     else								\
-       __asm __volatile ("" : : : "memory");				\
-  } while (0)
-
-#endif
-
-#include <sysdep.h>
-
-#endif	/* atomic-machine.h */
diff --git a/sysdeps/sparc/sparc32/configure b/sysdeps/sparc/sparc32/configure
index 337531fe94..98f94991dc 100644
--- a/sysdeps/sparc/sparc32/configure
+++ b/sysdeps/sparc/sparc32/configure
@@ -160,3 +160,38 @@ $as_echo "$libc_cv_sparcv8" >&6; }
 if test $libc_cv_sparcv8 = no; then
   as_fn_error $? "no support for pre-v8 sparc" "$LINENO" 5
 fi
+
+# Test if compiler generates external calls to libatomic for CAS operation.
+# It is suffice to check for int only and the test is similar of C11
+# atomic_compare_exchange_strong using GCC builtins.
+{ $as_echo "$as_me:${as_lineno-$LINENO}: checking for external libatomic calls" >&5
+$as_echo_n "checking for external libatomic calls... " >&6; }
+if ${libc_cv_uses_libatomic+:} false; then :
+  $as_echo_n "(cached) " >&6
+else
+  cat > conftest.c <<EOF
+	       _Bool foo (int *ptr, int *expected, int desired)
+	       {
+		 return __atomic_compare_exchange_n (ptr, expected, desired, 0,
+						     __ATOMIC_SEQ_CST, __ATOMIC_SEQ_CST);
+	       }
+EOF
+	       libc_cv_use_libatomic=no
+	       if { ac_try='${CC-cc} -S conftest.c -o conftest.s 1>&5'
+  { { eval echo "\"\$as_me\":${as_lineno-$LINENO}: \"$ac_try\""; } >&5
+  (eval $ac_try) 2>&5
+  ac_status=$?
+  $as_echo "$as_me:${as_lineno-$LINENO}: \$? = $ac_status" >&5
+  test $ac_status = 0; }; }; then
+		 if grep '__atomic_compare_exchange_4' conftest.s >/dev/null; then
+		   libc_cv_use_libatomic=yes
+		 fi
+	       fi
+	       rm -f conftest.c conftest.s
+
+fi
+{ $as_echo "$as_me:${as_lineno-$LINENO}: result: $libc_cv_uses_libatomic" >&5
+$as_echo "$libc_cv_uses_libatomic" >&6; }
+if test $libc_cv_use_libatomic = yes; then
+  as_fn_error $? "external dependency of libatomic is not supported" "$LINENO" 5
+fi
diff --git a/sysdeps/sparc/sparc32/configure.ac b/sysdeps/sparc/sparc32/configure.ac
index 99229bfadb..38909c2934 100644
--- a/sysdeps/sparc/sparc32/configure.ac
+++ b/sysdeps/sparc/sparc32/configure.ac
@@ -11,3 +11,27 @@ AC_CACHE_CHECK([for at least sparcv8 support],
 if test $libc_cv_sparcv8 = no; then
   AC_MSG_ERROR([no support for pre-v8 sparc])
 fi
+
+# Test if compiler generates external calls to libatomic for CAS operation.
+# It is suffice to check for int only and the test is similar of C11
+# atomic_compare_exchange_strong using GCC builtins.
+AC_CACHE_CHECK(for external libatomic calls,
+	       libc_cv_uses_libatomic,
+	       [cat > conftest.c <<EOF
+	       _Bool foo (int *ptr, int *expected, int desired)
+	       {
+		 return __atomic_compare_exchange_n (ptr, expected, desired, 0,
+						     __ATOMIC_SEQ_CST, __ATOMIC_SEQ_CST);
+	       }
+EOF
+	       libc_cv_use_libatomic=no
+	       if AC_TRY_COMMAND(${CC-cc} -S conftest.c -o conftest.s 1>&AS_MESSAGE_LOG_FD); then
+		 if grep '__atomic_compare_exchange_4' conftest.s >/dev/null; then
+		   libc_cv_use_libatomic=yes
+		 fi
+	       fi
+	       rm -f conftest.c conftest.s
+	       ])
+if test $libc_cv_use_libatomic = yes; then
+  AC_MSG_ERROR([external dependency of libatomic is not supported])
+fi
diff --git a/sysdeps/sparc/sparc32/lowlevellock.c b/sysdeps/sparc/sparc32/lowlevellock.c
deleted file mode 100644
index 1a0ab4d9f2..0000000000
--- a/sysdeps/sparc/sparc32/lowlevellock.c
+++ /dev/null
@@ -1,53 +0,0 @@
-/* low level locking for pthread library.  SPARC version.
-   Copyright (C) 2003-2019 Free Software Foundation, Inc.
-   This file is part of the GNU C Library.
-   Contributed by Paul Mackerras <paulus@au.ibm.com>, 2003.
-
-   The GNU C Library is free software; you can redistribute it and/or
-   modify it under the terms of the GNU Lesser General Public
-   License as published by the Free Software Foundation; either
-   version 2.1 of the License, or (at your option) any later version.
-
-   The GNU C Library is distributed in the hope that it will be useful,
-   but WITHOUT ANY WARRANTY; without even the implied warranty of
-   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.	 See the GNU
-   Lesser General Public License for more details.
-
-   You should have received a copy of the GNU Lesser General Public
-   License along with the GNU C Library; if not, see
-   <https://www.gnu.org/licenses/>.  */
-
-#include <errno.h>
-#include <sysdep.h>
-#include <lowlevellock.h>
-#include <sys/time.h>
-#include <time.h>
-
-
-void
-__lll_lock_wait_private (int *futex)
-{
-  do
-    {
-      int oldval = atomic_compare_and_exchange_val_24_acq (futex, 2, 1);
-      if (oldval != 0)
-	lll_futex_wait (futex, 2, LLL_PRIVATE);
-    }
-  while (atomic_compare_and_exchange_val_24_acq (futex, 2, 0) != 0);
-}
-
-
-/* These functions don't get included in libc.so  */
-#if IS_IN (libpthread)
-void
-__lll_lock_wait (int *futex, int private)
-{
-  do
-    {
-      int oldval = atomic_compare_and_exchange_val_24_acq (futex, 2, 1);
-      if (oldval != 0)
-	lll_futex_wait (futex, 2, private);
-    }
-  while (atomic_compare_and_exchange_val_24_acq (futex, 2, 0) != 0);
-}
-#endif
diff --git a/sysdeps/sparc/sparc32/pthread_barrier_wait.c b/sysdeps/sparc/sparc32/pthread_barrier_wait.c
deleted file mode 100644
index e5ef911f62..0000000000
--- a/sysdeps/sparc/sparc32/pthread_barrier_wait.c
+++ /dev/null
@@ -1 +0,0 @@
-#error No support for pthread barriers on pre-v9 sparc.
diff --git a/sysdeps/sparc/sparc32/sem_post.c b/sysdeps/sparc/sparc32/sem_post.c
deleted file mode 100644
index 5389725f83..0000000000
--- a/sysdeps/sparc/sparc32/sem_post.c
+++ /dev/null
@@ -1,82 +0,0 @@
-/* sem_post -- post to a POSIX semaphore.  Generic futex-using version.
-   Copyright (C) 2003-2019 Free Software Foundation, Inc.
-   This file is part of the GNU C Library.
-   Contributed by Jakub Jelinek <jakub@redhat.com>, 2003.
-
-   The GNU C Library is free software; you can redistribute it and/or
-   modify it under the terms of the GNU Lesser General Public
-   License as published by the Free Software Foundation; either
-   version 2.1 of the License, or (at your option) any later version.
-
-   The GNU C Library is distributed in the hope that it will be useful,
-   but WITHOUT ANY WARRANTY; without even the implied warranty of
-   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.	 See the GNU
-   Lesser General Public License for more details.
-
-   You should have received a copy of the GNU Lesser General Public
-   License along with the GNU C Library; if not, see
-   <https://www.gnu.org/licenses/>.  */
-
-#include <atomic.h>
-#include <errno.h>
-#include <sysdep.h>
-#include <lowlevellock.h>
-#include <internaltypes.h>
-#include <semaphore.h>
-#include <futex-internal.h>
-
-#include <shlib-compat.h>
-
-
-/* See sem_wait for an explanation of the algorithm.  */
-int
-__new_sem_post (sem_t *sem)
-{
-  struct new_sem *isem = (struct new_sem *) sem;
-  int private = isem->private;
-  unsigned int v;
-
-  __sparc32_atomic_do_lock24 (&isem->pad);
-
-  v = isem->value;
-  if ((v >> SEM_VALUE_SHIFT) == SEM_VALUE_MAX)
-    {
-      __sparc32_atomic_do_unlock24 (&isem->pad);
-
-      __set_errno (EOVERFLOW);
-      return -1;
-    }
-  isem->value = v + (1 << SEM_VALUE_SHIFT);
-
-  __sparc32_atomic_do_unlock24 (&isem->pad);
-
-  if ((v & SEM_NWAITERS_MASK) != 0)
-    futex_wake (&isem->value, 1, private);
-
-  return 0;
-}
-versioned_symbol (libpthread, __new_sem_post, sem_post, GLIBC_2_1);
-
-
-#if SHLIB_COMPAT (libpthread, GLIBC_2_0, GLIBC_2_1)
-int
-attribute_compat_text_section
-__old_sem_post (sem_t *sem)
-{
-  int *futex = (int *) sem;
-
-  /* We must need to synchronize with consumers of this token, so the atomic
-     increment must have release MO semantics.  */
-  atomic_write_barrier ();
-  (void) atomic_increment_val (futex);
-  /* We always have to assume it is a shared semaphore.  */
-  int err = lll_futex_wake (futex, 1, LLL_SHARED);
-  if (__builtin_expect (err, 0) < 0)
-    {
-      __set_errno (-err);
-      return -1;
-    }
-  return 0;
-}
-compat_symbol (libpthread, __old_sem_post, sem_post, GLIBC_2_0);
-#endif
diff --git a/sysdeps/sparc/sparc32/sem_waitcommon.c b/sysdeps/sparc/sparc32/sem_waitcommon.c
deleted file mode 100644
index 907101bb06..0000000000
--- a/sysdeps/sparc/sparc32/sem_waitcommon.c
+++ /dev/null
@@ -1,146 +0,0 @@
-/* sem_waitcommon -- wait on a semaphore, shared code.
-   Copyright (C) 2003-2019 Free Software Foundation, Inc.
-   This file is part of the GNU C Library.
-   Contributed by Paul Mackerras <paulus@au.ibm.com>, 2003.
-
-   The GNU C Library is free software; you can redistribute it and/or
-   modify it under the terms of the GNU Lesser General Public
-   License as published by the Free Software Foundation; either
-   version 2.1 of the License, or (at your option) any later version.
-
-   The GNU C Library is distributed in the hope that it will be useful,
-   but WITHOUT ANY WARRANTY; without even the implied warranty of
-   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.	 See the GNU
-   Lesser General Public License for more details.
-
-   You should have received a copy of the GNU Lesser General Public
-   License along with the GNU C Library; if not, see
-   <https://www.gnu.org/licenses/>.  */
-
-#include <errno.h>
-#include <sysdep.h>
-#include <futex-internal.h>
-#include <internaltypes.h>
-#include <semaphore.h>
-#include <sys/time.h>
-
-#include <pthreadP.h>
-#include <shlib-compat.h>
-#include <atomic.h>
-
-
-static void
-__sem_wait_32_finish (struct new_sem *sem);
-
-static void
-__sem_wait_cleanup (void *arg)
-{
-  struct new_sem *sem = (struct new_sem *) arg;
-
-  __sem_wait_32_finish (sem);
-}
-
-/* Wait until at least one token is available, possibly with a timeout.
-   This is in a separate function in order to make sure gcc
-   puts the call site into an exception region, and thus the
-   cleanups get properly run.  TODO still necessary?  Other futex_wait
-   users don't seem to need it.  */
-static int
-__attribute__ ((noinline))
-do_futex_wait (struct new_sem *sem, const struct timespec *abstime)
-{
-  int err;
-
-  err = futex_abstimed_wait_cancelable (&sem->value, SEM_NWAITERS_MASK,
-					abstime, sem->private);
-
-  return err;
-}
-
-/* Fast path: Try to grab a token without blocking.  */
-static int
-__new_sem_wait_fast (struct new_sem *sem, int definitive_result)
-{
-  unsigned int v;
-  int ret = 0;
-
-  __sparc32_atomic_do_lock24(&sem->pad);
-
-  v = sem->value;
-  if ((v >> SEM_VALUE_SHIFT) == 0)
-    ret = -1;
-  else
-    sem->value = v - (1 << SEM_VALUE_SHIFT);
-
-  __sparc32_atomic_do_unlock24(&sem->pad);
-
-  return ret;
-}
-
-/* Slow path that blocks.  */
-static int
-__attribute__ ((noinline))
-__new_sem_wait_slow (struct new_sem *sem, const struct timespec *abstime)
-{
-  unsigned int v;
-  int err = 0;
-
-  __sparc32_atomic_do_lock24(&sem->pad);
-
-  sem->nwaiters++;
-
-  pthread_cleanup_push (__sem_wait_cleanup, sem);
-
-  /* Wait for a token to be available.  Retry until we can grab one.  */
-  v = sem->value;
-  do
-    {
-      if (!(v & SEM_NWAITERS_MASK))
-	sem->value = v | SEM_NWAITERS_MASK;
-
-      /* If there is no token, wait.  */
-      if ((v >> SEM_VALUE_SHIFT) == 0)
-	{
-	  __sparc32_atomic_do_unlock24(&sem->pad);
-
-	  err = do_futex_wait(sem, abstime);
-	  if (err == ETIMEDOUT || err == EINTR)
-	    {
-	      __set_errno (err);
-	      err = -1;
-	      goto error;
-	    }
-	  err = 0;
-
-	  __sparc32_atomic_do_lock24(&sem->pad);
-
-	  /* We blocked, so there might be a token now.  */
-	  v = sem->value;
-	}
-    }
-  /* If there is no token, we must not try to grab one.  */
-  while ((v >> SEM_VALUE_SHIFT) == 0);
-
-  sem->value = v - (1 << SEM_VALUE_SHIFT);
-
-  __sparc32_atomic_do_unlock24(&sem->pad);
-
-error:
-  pthread_cleanup_pop (0);
-
-  __sem_wait_32_finish (sem);
-
-  return err;
-}
-
-/* Stop being a registered waiter (non-64b-atomics code only).  */
-static void
-__sem_wait_32_finish (struct new_sem *sem)
-{
-  __sparc32_atomic_do_lock24(&sem->pad);
-
-  if (--sem->nwaiters == 0)
-    sem->value &= ~SEM_NWAITERS_MASK;
-
-  __sparc32_atomic_do_unlock24(&sem->pad);
-}
diff --git a/sysdeps/sparc/sparc32/sparcv9/Makefile b/sysdeps/sparc/sparc32/sparcv9/Makefile
index 45507eabdb..c49b3bea99 100644
--- a/sysdeps/sparc/sparc32/sparcv9/Makefile
+++ b/sysdeps/sparc/sparc32/sparcv9/Makefile
@@ -4,12 +4,3 @@ ASFLAGS-.o += -Wa,-Av9d
 ASFLAGS-.os += -Wa,-Av9d
 ASFLAGS-.op += -Wa,-Av9d
 ASFLAGS-.oS += -Wa,-Av9d
-
-# nscd uses atomic_spin_nop which in turn requires cpu_relax
-ifeq ($(subdir),nscd)
-routines += cpu_relax
-endif
-
-ifeq ($(subdir), nptl)
-libpthread-routines += cpu_relax
-endif
diff --git a/sysdeps/sparc/sparc32/sparcv9/atomic-machine.h b/sysdeps/sparc/sparc32/sparcv9/atomic-machine.h
deleted file mode 100644
index 8d64f0729a..0000000000
--- a/sysdeps/sparc/sparc32/sparcv9/atomic-machine.h
+++ /dev/null
@@ -1,108 +0,0 @@
-/* Atomic operations.  sparcv9 version.
-   Copyright (C) 2003-2019 Free Software Foundation, Inc.
-   This file is part of the GNU C Library.
-   Contributed by Jakub Jelinek <jakub@redhat.com>, 2003.
-
-   The GNU C Library is free software; you can redistribute it and/or
-   modify it under the terms of the GNU Lesser General Public
-   License as published by the Free Software Foundation; either
-   version 2.1 of the License, or (at your option) any later version.
-
-   The GNU C Library is distributed in the hope that it will be useful,
-   but WITHOUT ANY WARRANTY; without even the implied warranty of
-   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
-   Lesser General Public License for more details.
-
-   You should have received a copy of the GNU Lesser General Public
-   License along with the GNU C Library; if not, see
-   <https://www.gnu.org/licenses/>.  */
-
-#include <stdint.h>
-
-typedef int8_t atomic8_t;
-typedef uint8_t uatomic8_t;
-typedef int_fast8_t atomic_fast8_t;
-typedef uint_fast8_t uatomic_fast8_t;
-
-typedef int16_t atomic16_t;
-typedef uint16_t uatomic16_t;
-typedef int_fast16_t atomic_fast16_t;
-typedef uint_fast16_t uatomic_fast16_t;
-
-typedef int32_t atomic32_t;
-typedef uint32_t uatomic32_t;
-typedef int_fast32_t atomic_fast32_t;
-typedef uint_fast32_t uatomic_fast32_t;
-
-typedef int64_t atomic64_t;
-typedef uint64_t uatomic64_t;
-typedef int_fast64_t atomic_fast64_t;
-typedef uint_fast64_t uatomic_fast64_t;
-
-typedef intptr_t atomicptr_t;
-typedef uintptr_t uatomicptr_t;
-typedef intmax_t atomic_max_t;
-typedef uintmax_t uatomic_max_t;
-
-#define __HAVE_64B_ATOMICS 0
-#define USE_ATOMIC_COMPILER_BUILTINS 0
-
-/* XXX Is this actually correct?  */
-#define ATOMIC_EXCHANGE_USES_CAS 0
-
-
-#define __arch_compare_and_exchange_val_8_acq(mem, newval, oldval) \
-  (abort (), (__typeof (*mem)) 0)
-
-#define __arch_compare_and_exchange_val_16_acq(mem, newval, oldval) \
-  (abort (), (__typeof (*mem)) 0)
-
-#define __arch_compare_and_exchange_val_32_acq(mem, newval, oldval) \
-({									      \
-  __typeof (*(mem)) __acev_tmp;						      \
-  __typeof (mem) __acev_mem = (mem);					      \
-  if (__builtin_constant_p (oldval) && (oldval) == 0)			      \
-    __asm __volatile ("cas [%3], %%g0, %0"				      \
-		      : "=r" (__acev_tmp), "=m" (*__acev_mem)		      \
-		      : "m" (*__acev_mem), "r" (__acev_mem),		      \
-		        "0" (newval) : "memory");			      \
-  else									      \
-    __asm __volatile ("cas [%4], %2, %0"				      \
-		      : "=r" (__acev_tmp), "=m" (*__acev_mem)		      \
-		      : "r" (oldval), "m" (*__acev_mem), "r" (__acev_mem),    \
-		        "0" (newval) : "memory");			      \
-  __acev_tmp; })
-
-/* This can be implemented if needed.  */
-#define __arch_compare_and_exchange_val_64_acq(mem, newval, oldval) \
-  (abort (), (__typeof (*mem)) 0)
-
-#define atomic_exchange_acq(mem, newvalue) \
-  ({ __typeof (*(mem)) __oldval;					      \
-     __typeof (mem) __memp = (mem);					      \
-     __typeof (*(mem)) __value = (newvalue);				      \
-									      \
-     if (sizeof (*(mem)) == 4)						      \
-       __asm ("swap %0, %1"						      \
-	      : "=m" (*__memp), "=r" (__oldval)				      \
-	      : "m" (*__memp), "1" (__value) : "memory");		      \
-     else								      \
-       abort ();							      \
-     __oldval; })
-
-#define atomic_compare_and_exchange_val_24_acq(mem, newval, oldval) \
-  atomic_compare_and_exchange_val_acq (mem, newval, oldval)
-
-#define atomic_exchange_24_rel(mem, newval) \
-  atomic_exchange_rel (mem, newval)
-
-#define atomic_full_barrier() \
-  __asm __volatile ("membar #LoadLoad | #LoadStore"			      \
-			  " | #StoreLoad | #StoreStore" : : : "memory")
-#define atomic_read_barrier() \
-  __asm __volatile ("membar #LoadLoad | #LoadStore" : : : "memory")
-#define atomic_write_barrier() \
-  __asm __volatile ("membar #LoadStore | #StoreStore" : : : "memory")
-
-extern void __cpu_relax (void);
-#define atomic_spin_nop() __cpu_relax ()
diff --git a/sysdeps/sparc/sparc32/sparcv9/cpu_relax.c b/sysdeps/sparc/sparc32/sparcv9/cpu_relax.c
deleted file mode 100644
index 1670cf6558..0000000000
--- a/sysdeps/sparc/sparc32/sparcv9/cpu_relax.c
+++ /dev/null
@@ -1 +0,0 @@
-#include <sysdeps/sparc/sparc64/cpu_relax.c>
diff --git a/sysdeps/sparc/sparc64/Makefile b/sysdeps/sparc/sparc64/Makefile
index 7ed4c357bc..7e3b62a595 100644
--- a/sysdeps/sparc/sparc64/Makefile
+++ b/sysdeps/sparc/sparc64/Makefile
@@ -29,15 +29,6 @@ ASFLAGS-.os += -Wa,-Av9d
 ASFLAGS-.op += -Wa,-Av9d
 ASFLAGS-.oS += -Wa,-Av9d
 
-# nscd uses atomic_spin_nop which in turn requires cpu_relax
-ifeq ($(subdir),nscd)
-routines += cpu_relax
-endif
-
-ifeq ($(subdir),nptl)
-libpthread-routines += cpu_relax
-endif
-
 ifeq ($(subdir),soft-fp)
 sparc64-quad-routines := qp_add qp_cmp qp_cmpe qp_div qp_dtoq qp_feq qp_fge \
 	qp_fgt qp_fle qp_flt qp_fne qp_itoq qp_mul qp_neg qp_qtod qp_qtoi   \
diff --git a/sysdeps/sparc/sparc64/atomic-machine.h b/sysdeps/sparc/sparc64/atomic-machine.h
deleted file mode 100644
index 4e1fc04eca..0000000000
--- a/sysdeps/sparc/sparc64/atomic-machine.h
+++ /dev/null
@@ -1,129 +0,0 @@
-/* Atomic operations.  sparc64 version.
-   Copyright (C) 2003-2019 Free Software Foundation, Inc.
-   This file is part of the GNU C Library.
-   Contributed by Jakub Jelinek <jakub@redhat.com>, 2003.
-
-   The GNU C Library is free software; you can redistribute it and/or
-   modify it under the terms of the GNU Lesser General Public
-   License as published by the Free Software Foundation; either
-   version 2.1 of the License, or (at your option) any later version.
-
-   The GNU C Library is distributed in the hope that it will be useful,
-   but WITHOUT ANY WARRANTY; without even the implied warranty of
-   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
-   Lesser General Public License for more details.
-
-   You should have received a copy of the GNU Lesser General Public
-   License along with the GNU C Library; if not, see
-   <https://www.gnu.org/licenses/>.  */
-
-#include <stdint.h>
-
-typedef int8_t atomic8_t;
-typedef uint8_t uatomic8_t;
-typedef int_fast8_t atomic_fast8_t;
-typedef uint_fast8_t uatomic_fast8_t;
-
-typedef int16_t atomic16_t;
-typedef uint16_t uatomic16_t;
-typedef int_fast16_t atomic_fast16_t;
-typedef uint_fast16_t uatomic_fast16_t;
-
-typedef int32_t atomic32_t;
-typedef uint32_t uatomic32_t;
-typedef int_fast32_t atomic_fast32_t;
-typedef uint_fast32_t uatomic_fast32_t;
-
-typedef int64_t atomic64_t;
-typedef uint64_t uatomic64_t;
-typedef int_fast64_t atomic_fast64_t;
-typedef uint_fast64_t uatomic_fast64_t;
-
-typedef intptr_t atomicptr_t;
-typedef uintptr_t uatomicptr_t;
-typedef intmax_t atomic_max_t;
-typedef uintmax_t uatomic_max_t;
-
-#define __HAVE_64B_ATOMICS 1
-#define USE_ATOMIC_COMPILER_BUILTINS 0
-
-/* XXX Is this actually correct?  */
-#define ATOMIC_EXCHANGE_USES_CAS 1
-
-
-#define __arch_compare_and_exchange_val_8_acq(mem, newval, oldval) \
-  (abort (), (__typeof (*mem)) 0)
-
-#define __arch_compare_and_exchange_val_16_acq(mem, newval, oldval) \
-  (abort (), (__typeof (*mem)) 0)
-
-#define __arch_compare_and_exchange_val_32_acq(mem, newval, oldval) \
-({									      \
-  __typeof (*(mem)) __acev_tmp;						      \
-  __typeof (mem) __acev_mem = (mem);					      \
-  if (__builtin_constant_p (oldval) && (oldval) == 0)			      \
-    __asm __volatile ("cas [%3], %%g0, %0"				      \
-		      : "=r" (__acev_tmp), "=m" (*__acev_mem)		      \
-		      : "m" (*__acev_mem), "r" (__acev_mem),		      \
-		        "0" (newval) : "memory");			      \
-  else									      \
-    __asm __volatile ("cas [%4], %2, %0"				      \
-		      : "=r" (__acev_tmp), "=m" (*__acev_mem)		      \
-		      : "r" (oldval), "m" (*__acev_mem), "r" (__acev_mem),    \
-		        "0" (newval) : "memory");			      \
-  __acev_tmp; })
-
-#define __arch_compare_and_exchange_val_64_acq(mem, newval, oldval) \
-({									      \
-  __typeof (*(mem)) __acev_tmp;						      \
-  __typeof (mem) __acev_mem = (mem);					      \
-  if (__builtin_constant_p (oldval) && (oldval) == 0)			      \
-    __asm __volatile ("casx [%3], %%g0, %0"				      \
-		      : "=r" (__acev_tmp), "=m" (*__acev_mem)		      \
-		      : "m" (*__acev_mem), "r" (__acev_mem),		      \
-		        "0" ((long) (newval)) : "memory");		      \
-  else									      \
-    __asm __volatile ("casx [%4], %2, %0"				      \
-		      : "=r" (__acev_tmp), "=m" (*__acev_mem)		      \
-		      : "r" ((long) (oldval)), "m" (*__acev_mem),	      \
-		        "r" (__acev_mem), "0" ((long) (newval)) : "memory");  \
-  __acev_tmp; })
-
-#define atomic_exchange_acq(mem, newvalue) \
-  ({ __typeof (*(mem)) __oldval, __val;					      \
-     __typeof (mem) __memp = (mem);					      \
-     __typeof (*(mem)) __value = (newvalue);				      \
-									      \
-     if (sizeof (*(mem)) == 4)						      \
-       __asm ("swap %0, %1"						      \
-	      : "=m" (*__memp), "=r" (__oldval)				      \
-	      : "m" (*__memp), "1" (__value) : "memory");		      \
-     else								      \
-       {								      \
-	 __val = *__memp;						      \
-	 do								      \
-	   {								      \
-	     __oldval = __val;						      \
-	     __val = atomic_compare_and_exchange_val_acq (__memp, __value,    \
-							  __oldval);	      \
-	   }								      \
-         while (__builtin_expect (__val != __oldval, 0));		      \
-       }								      \
-     __oldval; })
-
-#define atomic_compare_and_exchange_val_24_acq(mem, newval, oldval) \
-  atomic_compare_and_exchange_val_acq (mem, newval, oldval)
-
-#define atomic_exchange_24_rel(mem, newval) \
-  atomic_exchange_rel (mem, newval)
-
-#define atomic_full_barrier() \
-  __asm __volatile ("membar #LoadLoad | #LoadStore"			      \
-			  " | #StoreLoad | #StoreStore" : : : "memory")
-#define atomic_read_barrier() \
-  __asm __volatile ("membar #LoadLoad | #LoadStore" : : : "memory")
-#define atomic_write_barrier() \
-  __asm __volatile ("membar #LoadStore | #StoreStore" : : : "memory")
-
-extern void __cpu_relax (void);
-#define atomic_spin_nop() __cpu_relax ()
diff --git a/sysdeps/unix/sysv/linux/sparc/lowlevellock.h b/sysdeps/unix/sysv/linux/sparc/lowlevellock.h
deleted file mode 100644
index 6c005e8ffa..0000000000
--- a/sysdeps/unix/sysv/linux/sparc/lowlevellock.h
+++ /dev/null
@@ -1,130 +0,0 @@
-/* Copyright (C) 2003-2019 Free Software Foundation, Inc.
-   This file is part of the GNU C Library.
-   Contributed by Jakub Jelinek <jakub@redhat.com>, 2003.
-
-   The GNU C Library is free software; you can redistribute it and/or
-   modify it under the terms of the GNU Lesser General Public
-   License as published by the Free Software Foundation; either
-   version 2.1 of the License, or (at your option) any later version.
-
-   The GNU C Library is distributed in the hope that it will be useful,
-   but WITHOUT ANY WARRANTY; without even the implied warranty of
-   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.	 See the GNU
-   Lesser General Public License for more details.
-
-   You should have received a copy of the GNU Lesser General Public
-   License along with the GNU C Library; if not, see
-   <https://www.gnu.org/licenses/>.  */
-
-#ifndef _LOWLEVELLOCK_H
-#define _LOWLEVELLOCK_H	1
-
-#include <time.h>
-#include <sys/param.h>
-#include <bits/pthreadtypes.h>
-#include <atomic.h>
-#include <kernel-features.h>
-#include <errno.h>
-
-#include <lowlevellock-futex.h>
-
-static inline int
-__attribute__ ((always_inline))
-__lll_trylock (int *futex)
-{
-  return atomic_compare_and_exchange_val_24_acq (futex, 1, 0) != 0;
-}
-#define lll_trylock(futex) __lll_trylock (&(futex))
-
-static inline int
-__attribute__ ((always_inline))
-__lll_cond_trylock (int *futex)
-{
-  return atomic_compare_and_exchange_val_24_acq (futex, 2, 0) != 0;
-}
-#define lll_cond_trylock(futex) __lll_cond_trylock (&(futex))
-
-
-extern void __lll_lock_wait_private (int *futex) attribute_hidden;
-extern void __lll_lock_wait (int *futex, int private) attribute_hidden;
-
-static inline void
-__attribute__ ((always_inline))
-__lll_lock (int *futex, int private)
-{
-  int val = atomic_compare_and_exchange_val_24_acq (futex, 1, 0);
-
-  if (__glibc_unlikely (val != 0))
-    {
-      if (__builtin_constant_p (private) && private == LLL_PRIVATE)
-	__lll_lock_wait_private (futex);
-      else
-	__lll_lock_wait (futex, private);
-    }
-}
-#define lll_lock(futex, private) __lll_lock (&(futex), private)
-
-static inline void
-__attribute__ ((always_inline))
-__lll_cond_lock (int *futex, int private)
-{
-  int val = atomic_compare_and_exchange_val_24_acq (futex, 2, 0);
-
-  if (__glibc_unlikely (val != 0))
-    __lll_lock_wait (futex, private);
-}
-#define lll_cond_lock(futex, private) __lll_cond_lock (&(futex), private)
-
-
-extern int __lll_clocklock_wait (int *futex, clockid_t, int val,
-				 const struct timespec *,
-				 int private) attribute_hidden;
-
-#define lll_timedwait(futex, val, clockid, abstime, private)		\
-  __lll_clocklock_wait (futex, val, clockid, abstime, private)
-
-static inline int
-__attribute__ ((always_inline))
-__lll_clocklock (int *futex, clockid_t clockid,
-                 const struct timespec *abstime, int private)
-{
-  int val = atomic_compare_and_exchange_val_24_acq (futex, 1, 0);
-  int result = 0;
-
-  if (__glibc_unlikely (val != 0))
-    {
-      do
-	{
-	  int oldval = atomic_compare_and_exchange_val_24_acq (futex, val, 1);
-	  if (oldval != 0)
-	    {
-	      result = __lll_clocklock_wait (futex, 2, clockid, abstime,
-					     private);
-	      if (result == EINVAL || result == ETIMEDOUT)
-		break;
-	    }
-        }
-      while (atomic_compare_and_exchange_val_24_acq (futex, val, 0) != 0);
-    }
-  return result;
-}
-#define lll_clocklock(futex, clockid, abstime, private)  \
-  __lll_clocklock (&(futex), clockid, abstime, private)
-
-#define lll_unlock(lock, private) \
-  ((void) ({								      \
-    int *__futex = &(lock);						      \
-    int __private = (private);						      \
-    int __val = atomic_exchange_24_rel (__futex, 0);			      \
-    if (__glibc_unlikely (__val > 1))					      \
-      lll_futex_wake (__futex, 1, __private);				      \
-  }))
-
-#define lll_islocked(futex) \
-  (futex != 0)
-
-/* Initializers for lock.  */
-#define LLL_LOCK_INITIALIZER		(0)
-#define LLL_LOCK_INITIALIZER_LOCKED	(1)
-
-#endif	/* lowlevellock.h */