From patchwork Mon Sep 18 11:26:45 2017
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Richard Sandiford
X-Patchwork-Id: 112912
From: Richard Sandiford
To: gcc-patches@gcc.gnu.org
Mail-Followup-To: gcc-patches@gcc.gnu.org, richard.sandiford@linaro.org
Subject: Base subreg rules on REGMODE_NATURAL_SIZE rather than UNITS_PER_WORD
Date: Mon, 18 Sep 2017 12:26:45 +0100
Message-ID: <87fubkgscq.fsf@linaro.org>
User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/25.2 (gnu/linux)

Originally subregs operated at the word level and subreg offsets
were measured in words. The offset units were later changed from
words to bytes (SUBREG_WORD became SUBREG_BYTE), but the fundamental
assumption that subregs should operate at the word level remained.
Whether (subreg:M1 (reg:M2 R2) N) is well-formed depended on the
way that M1 and M2 partitioned into words and whether the subword
part of N represented a lowpart. However, some questions depended
instead on the macro REGMODE_NATURAL_SIZE, which was introduced
as part of the patch that moved from SUBREG_WORD to SUBREG_BYTE.
It is used to decide whether setting (subreg:M1 (reg:M2 R2) N)
clobbers all of R2 or just part of it (df_read_modify_subreg).

Using words doesn't really make sense for modern vector
architectures. Vector registers are usually bigger than a word
and:

(a) setting the scalar lowpart of them usually clobbers the
    rest of the register (contrary to the subreg rules,
    where only the containing words should be clobbered).

(b) high words of vector registers are often not independently
    addressable, even though that's what the subreg rules expect.

This patch therefore uses REGMODE_NATURAL_SIZE instead of
UNITS_PER_WORD to determine the size of the independently addressable
blocks in an inner register.
This is needed for SVE because the number of words in a vector
mode isn't known at compile time, so isn't a sensible basis for
calculating the number of registers.

The only existing port to define REGMODE_NATURAL_SIZE is
64-bit SPARC, where FP registers are 32 bits. (This is the opposite
of the use case for SVE, since the natural division is smaller than
a word.) I compiled the testsuite before and after the patch for
sparc64-linux-gnu and the only test whose assembly changed was
g++.dg/debug/pr65678.C, where the order of two independent stores
was reversed and where a different register was picked for one pseudo.
The new code was otherwise equivalent to the old code.

Tested on aarch64-linux-gnu, x86_64-linux-gnu and powerpc64le-linux-gnu.
Also tested by comparing the testsuite assembly output on at least one
target per CPU directory, with only the SPARC differences just mentioned.
OK to install?

Richard


2017-09-18  Richard Sandiford
            Alan Hayward
            David Sherwood

gcc/
        * doc/rtl.texi: Rewrite the subreg rules so that they partition
        the inner register into REGMODE_NATURAL_SIZE bytes rather than
        UNITS_PER_WORD bytes.
        * emit-rtl.c (validate_subreg): Divide subregs into blocks
        based on REGMODE_NATURAL_SIZE of the inner mode.
        (gen_lowpart_common): Split the SCALAR_FLOAT_MODE_P and
        !SCALAR_FLOAT_MODE_P cases. Use REGMODE_NATURAL_SIZE for the latter.
        * expr.c (store_constructor): Use REGMODE_NATURAL_SIZE to test
        whether something is likely to occupy more than one register.

Index: gcc/doc/rtl.texi
===================================================================
--- gcc/doc/rtl.texi 2017-09-15 13:56:20.294149114 +0100
+++ gcc/doc/rtl.texi 2017-09-18 12:24:20.287485854 +0100
@@ -1921,19 +1921,32 @@ false.
 When @var{m1} is at least as narrow as @var{m2} the @code{subreg}
 expression is called @dfn{normal}.
 
+@findex REGMODE_NATURAL_SIZE
 Normal @code{subreg}s restrict consideration to certain bits of
-@var{reg}. There are two cases. If @var{m1} is smaller than a word,
-the @code{subreg} refers to the least-significant part (or
-@dfn{lowpart}) of one word of @var{reg}. If @var{m1} is word-sized or
-greater, the @code{subreg} refers to one or more complete words.
-
-When used as an lvalue, @code{subreg} is a word-based accessor.
-Storing to a @code{subreg} modifies all the words of @var{reg} that
-overlap the @code{subreg}, but it leaves the other words of @var{reg}
+@var{reg}. For this purpose, @var{reg} is divided into
+individually-addressable blocks in which each block has:
+
+@smallexample
+REGMODE_NATURAL_SIZE (@var{m2})
+@end smallexample
+
+bytes. Usually the value is @code{UNITS_PER_WORD}; that is,
+most targets usually treat each word of a register as being
+independently addressable.
+
+There are two types of normal @code{subreg}. If @var{m1} is known
+to be no bigger than a block, the @code{subreg} refers to the
+least-significant part (or @dfn{lowpart}) of one block of @var{reg}.
+If @var{m1} is known to be larger than a block, the @code{subreg} refers
+to two or more complete blocks.
+
+When used as an lvalue, @code{subreg} is a block-based accessor.
+Storing to a @code{subreg} modifies all the blocks of @var{reg} that
+overlap the @code{subreg}, but it leaves the other blocks of @var{reg}
 alone.
 
-When storing to a normal @code{subreg} that is smaller than a word,
-the other bits of the referenced word are usually left in an undefined
+When storing to a normal @code{subreg} that is smaller than a block,
+the other bits of the referenced block are usually left in an undefined
 state. This laxity makes it easier to generate efficient code for
 such instructions. To represent an instruction that preserves all the
 bits outside of those in the @code{subreg}, use @code{strict_low_part}
@@ -1991,10 +2004,11 @@ number of undefined bits. For example:
 (subreg:PSI (reg:SI 0) 0)
 @end smallexample
 
+@findex REGMODE_NATURAL_SIZE
 accesses the whole of @samp{(reg:SI 0)}, but the exact relationship
 between the @code{PSImode} value and the @code{SImode} value is not
-defined. If we assume @samp{UNITS_PER_WORD <= 4}, then the following
-two @code{subreg}s:
+defined. If we assume @samp{REGMODE_NATURAL_SIZE (DImode) <= 4},
+then the following two @code{subreg}s:
 
 @smallexample
 (subreg:PSI (reg:DI 0) 0)
@@ -2005,7 +2019,7 @@ represent independent 4-byte accesses to @samp{(reg:DI 0)}. Both
 @code{subreg}s have an unknown
 number of undefined bits.
 
-If @samp{UNITS_PER_WORD <= 2} then these two @code{subreg}s:
+If @samp{REGMODE_NATURAL_SIZE (PSImode) <= 2} then these two @code{subreg}s:
 
 @smallexample
 (subreg:HI (reg:PSI 0) 0)
@@ -2874,7 +2888,7 @@ The presence of @code{strict_low_part} s
 register which is meaningful in mode @var{n}, but is not part of
 mode @var{m}, is not to be altered. Normally, an assignment to such
 a subreg is allowed to have undefined effects on the rest of the
-register when @var{m} is less than a word.
+register when @var{m} is smaller than @samp{REGMODE_NATURAL_SIZE (@var{n})}.
 @end table
 
 @node Side Effects
Index: gcc/emit-rtl.c
===================================================================
--- gcc/emit-rtl.c 2017-09-15 13:56:20.296149147 +0100
+++ gcc/emit-rtl.c 2017-09-18 12:24:20.289496494 +0100
@@ -816,6 +816,8 @@ validate_subreg (machine_mode omode, mac
   if (offset >= isize)
     return false;
 
+  unsigned int regsize = REGMODE_NATURAL_SIZE (imode);
+
   /* ??? This should not be here. Temporarily continue to allow word_mode
      subregs of anything. The most common offender is (subreg:SI (reg:DF)).
      Generally, backends are doing something sketchy but it'll take time to
@@ -824,7 +826,7 @@ validate_subreg (machine_mode omode, mac
     ;
   /* ??? Similarly, e.g. with (subreg:DF (reg:TI)). Though store_bit_field
      is the culprit here, and not the backends. */
-  else if (osize >= UNITS_PER_WORD && isize >= osize)
+  else if (osize >= regsize && isize >= osize)
     ;
   /* Allow component subregs of complex and vector. Though given the below
      extraction rules, it's not always clear what that means. */
@@ -876,17 +878,23 @@ validate_subreg (machine_mode omode, mac
     }
 
   /* For pseudo registers, we want most of the same checks. Namely:
-     If the register no larger than a word, the subreg must be lowpart.
-     If the register is larger than a word, the subreg must be the lowpart
-     of a subword. A subreg does *not* perform arbitrary bit extraction.
-     Given that we've already checked mode/offset alignment, we only have
-     to check subword subregs here. */
-  if (osize < UNITS_PER_WORD
+
+     Assume that the pseudo register will be allocated to hard registers
+     that can hold REGSIZE bytes each. If OSIZE is not a multiple of REGSIZE,
+     the remainder must correspond to the lowpart of the containing hard
+     register. If BYTES_BIG_ENDIAN, the lowpart is at the highest offset,
+     otherwise it is at the lowest offset.
+
+     Given that we've already checked the mode and offset alignment,
+     we only have to check subblock subregs here. */
+  if (osize < regsize
       && ! (lra_in_progress && (FLOAT_MODE_P (imode) || FLOAT_MODE_P (omode))))
     {
-      machine_mode wmode = isize > UNITS_PER_WORD ? word_mode : imode;
-      unsigned int low_off = subreg_lowpart_offset (omode, wmode);
-      if (offset % UNITS_PER_WORD != low_off)
+      unsigned int block_size = MIN (isize, regsize);
+      unsigned int offset_within_block = offset % block_size;
+      if (BYTES_BIG_ENDIAN
+          ? offset_within_block != block_size - osize
+          : offset_within_block != 0)
         return false;
     }
   return true;
@@ -1439,14 +1447,21 @@ gen_lowpart_common (machine_mode mode, r
   if (innermode == mode)
     return x;
 
-  /* MODE must occupy no more words than the mode of X. */
-  if ((msize + (UNITS_PER_WORD - 1)) / UNITS_PER_WORD
-      > ((xsize + (UNITS_PER_WORD - 1)) / UNITS_PER_WORD))
-    return 0;
-
-  /* Don't allow generating paradoxical FLOAT_MODE subregs. */
-  if (SCALAR_FLOAT_MODE_P (mode) && msize > xsize)
-    return 0;
+  if (SCALAR_FLOAT_MODE_P (mode))
+    {
+      /* Don't allow paradoxical FLOAT_MODE subregs. */
+      if (msize > xsize)
+        return 0;
+    }
+  else
+    {
+      /* MODE must occupy no more of the underlying registers than X. */
+      unsigned int regsize = REGMODE_NATURAL_SIZE (innermode);
+      unsigned int mregs = CEIL (msize, regsize);
+      unsigned int xregs = CEIL (xsize, regsize);
+      if (mregs > xregs)
+        return 0;
+    }
 
   scalar_int_mode int_mode, int_innermode, from_mode;
   if ((GET_CODE (x) == ZERO_EXTEND || GET_CODE (x) == SIGN_EXTEND)
Index: gcc/expr.c
===================================================================
--- gcc/expr.c 2017-09-14 16:25:43.862558497 +0100
+++ gcc/expr.c 2017-09-18 12:24:20.292512455 +0100
@@ -6190,7 +6190,8 @@ store_constructor (tree exp, rtx target,
          a constant. But if more than one register is involved,
          this probably loses. */
       else if (REG_P (target) && TREE_STATIC (exp)
-               && GET_MODE_SIZE (GET_MODE (target)) <= UNITS_PER_WORD)
+               && (GET_MODE_SIZE (GET_MODE (target))
+                   <= REGMODE_NATURAL_SIZE (GET_MODE (target))))
         {
           emit_move_insn (target, CONST0_RTX (GET_MODE (target)));
           cleared = 1;
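
For illustration only (this is not part of the patch): the new
pseudo-register check at the end of validate_subreg and the register
count in gen_lowpart_common can be modelled in standalone C roughly as
follows. The names subblock_offset_ok, fits_in_same_regs, min_uint and
ceil_div are made up for this sketch. Sizes and offsets are plain byte
counts rather than machine modes, regsize stands in for
REGMODE_NATURAL_SIZE of the inner mode, and the sketch assumes a
non-paradoxical subreg (osize <= isize).

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for GCC's MIN and CEIL macros.  */
static unsigned int
min_uint (unsigned int a, unsigned int b)
{
  return a < b ? a : b;
}

static unsigned int
ceil_div (unsigned int x, unsigned int y)
{
  return (x + y - 1) / y;
}

/* Model of the subblock check: an outer value of OSIZE bytes at byte
   OFFSET within an inner value of ISIZE bytes, where the inner register
   is split into independently addressable blocks of REGSIZE bytes.
   Assumption of this sketch: OSIZE <= ISIZE.  */
static bool
subblock_offset_ok (unsigned int osize, unsigned int isize,
                    unsigned int offset, unsigned int regsize,
                    bool bytes_big_endian)
{
  assert (regsize > 0 && isize > 0 && osize <= isize);

  /* Only accesses smaller than a block need the lowpart test.  */
  if (osize >= regsize)
    return true;

  unsigned int block_size = min_uint (isize, regsize);
  unsigned int offset_within_block = offset % block_size;

  /* The lowpart sits at the highest offset of the block on big-endian
     targets and at offset zero otherwise.  */
  return (bytes_big_endian
          ? offset_within_block == block_size - osize
          : offset_within_block == 0);
}

/* Model of the non-float case in gen_lowpart_common: a value of MSIZE
   bytes must occupy no more REGSIZE-byte registers than one of XSIZE
   bytes.  */
static bool
fits_in_same_regs (unsigned int msize, unsigned int xsize,
                   unsigned int regsize)
{
  assert (regsize > 0);
  return ceil_div (msize, regsize) <= ceil_div (xsize, regsize);
}
```

In this model, an 8-byte access at offset 8 of a 16-byte register
passes when regsize is 8 (word-sized blocks) but is rejected when
regsize is 16, which corresponds to point (b) in the description:
the high half of such a vector register is not independently
addressable.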