From patchwork Wed Jan 11 20:45:42 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Adhemerval Zanella Netto
X-Patchwork-Id: 641261
To: libc-alpha@sourceware.org, Noah Goldstein, Richard Henderson
Cc: Adhemerval Zanella Netto
Subject: [PATCH v7 01/17] Parameterize op_t from memcopy.h
Date: Wed, 11 Jan 2023 17:45:42 -0300
Message-Id: <20230111204558.2402155-2-adhemerval.zanella@linaro.org>
In-Reply-To: <20230111204558.2402155-1-adhemerval.zanella@linaro.org>
References: <20230111204558.2402155-1-adhemerval.zanella@linaro.org>
From: Adhemerval Zanella Netto

It moves the op_t definition out to a specific header, adds the
'may_alias' attribute, and cleans up its duplicated definitions.

Checked with a build and check with run-built-tests=no for all major
Linux ABIs.
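As a rough illustration of why the attribute matters (a sketch only, not
part of this patch): string routines read char buffers one word at a
time, and without 'may_alias' the compiler is allowed to assume that an
unsigned long access never touches char objects. The names word_t,
load_word and load_word_memcpy are hypothetical, and the snippet assumes
GCC or Clang and a pointer that is aligned for the word type.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for op_t: accesses through this type are
   allowed to alias objects of any other type, as char accesses are.  */
typedef unsigned long int __attribute__ ((__may_alias__)) word_t;

/* Load one word directly from P.  P must be aligned for word_t.  */
word_t
load_word (const void *p)
{
  assert ((uintptr_t) p % sizeof (word_t) == 0);
  return *(const word_t *) p;
}

/* Reference load through memcpy, which is alias-safe for any type.  */
unsigned long int
load_word_memcpy (const void *p)
{
  unsigned long int w;
  memcpy (&w, p, sizeof w);
  return w;
}
```

Both functions are expected to return the same value for an aligned
buffer; the direct load is the form a word-at-a-time loop would use.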
---
 string/memcmp.c                    |  1 -
 sysdeps/generic/memcopy.h          |  6 ++----
 sysdeps/generic/string-optype.h    | 24 ++++++++++++++++++++++++
 sysdeps/x86_64/x32/string-optype.h | 24 ++++++++++++++++++++++++
 4 files changed, 50 insertions(+), 5 deletions(-)
 create mode 100644 sysdeps/generic/string-optype.h
 create mode 100644 sysdeps/x86_64/x32/string-optype.h

diff --git a/string/memcmp.c b/string/memcmp.c
index 067b2e6a42..ea0fa03e1c 100644
--- a/string/memcmp.c
+++ b/string/memcmp.c
@@ -46,7 +46,6 @@
 /* Type to use for aligned memory operations.
    This should normally be the biggest type supported by a single load
    and store. Must be an unsigned type. */
-# define op_t unsigned long int
 # define OPSIZ (sizeof (op_t))
 
 /* Threshold value for when to enter the unrolled loops. */
diff --git a/sysdeps/generic/memcopy.h b/sysdeps/generic/memcopy.h
index 9f3ffb5d30..b5ffa4d114 100644
--- a/sysdeps/generic/memcopy.h
+++ b/sysdeps/generic/memcopy.h
@@ -55,10 +55,8 @@
    [I fail to understand. I feel stupid. --roland]
 */
 
-/* Type to use for aligned memory operations.
-   This should normally be the biggest type supported by a single load
-   and store. */
-#define op_t unsigned long int
+/* Type to use for aligned memory operations. */
+#include
 #define OPSIZ (sizeof (op_t))
 
 /* Type to use for unaligned operations. */
diff --git a/sysdeps/generic/string-optype.h b/sysdeps/generic/string-optype.h
new file mode 100644
index 0000000000..42bdd2a145
--- /dev/null
+++ b/sysdeps/generic/string-optype.h
@@ -0,0 +1,24 @@
+/* Define a type to use for word access. Generic version.
+   Copyright (C) 2023 Free Software Foundation, Inc.
+   This file is part of the GNU C Library.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, see
+   . */
+
+#ifndef _STRING_OPTYPE_H
+#define _STRING_OPTYPE_H 1
+
+typedef unsigned long int __attribute__ ((__may_alias__)) op_t;
+
+#endif /* string-optype.h */
diff --git a/sysdeps/x86_64/x32/string-optype.h b/sysdeps/x86_64/x32/string-optype.h
new file mode 100644
index 0000000000..e7679f934f
--- /dev/null
+++ b/sysdeps/x86_64/x32/string-optype.h
@@ -0,0 +1,24 @@
+/* Define a type to use for word access. Generic version.
+   Copyright (C) 2023 Free Software Foundation, Inc.
+   This file is part of the GNU C Library.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, see
+   . */
+
+#ifndef _STRING_OPTYPE_H
+#define _STRING_OPTYPE_H 1
+
+typedef unsigned long long int __attribute__ ((__may_alias__)) op_t;
+
+#endif /* string-optype.h */

From patchwork Wed Jan 11 20:45:43 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Adhemerval Zanella Netto
X-Patchwork-Id: 641262
To: libc-alpha@sourceware.org, Noah Goldstein, Richard Henderson
Cc: Carlos O'Donell
Subject: [PATCH v7 02/17] Parameterize OP_T_THRES from memcopy.h
Date: Wed, 11 Jan 2023 17:45:43 -0300
Message-Id: <20230111204558.2402155-3-adhemerval.zanella@linaro.org>
In-Reply-To: <20230111204558.2402155-1-adhemerval.zanella@linaro.org>
References: <20230111204558.2402155-1-adhemerval.zanella@linaro.org>
From: Adhemerval Zanella Netto

From: Richard Henderson

It moves OP_T_THRES out of memcopy.h to its own header and adjusts each
architecture that redefines it.

Checked with a build and check with run-built-tests=no for all major
Linux ABIs.
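For readers unfamiliar with what the threshold controls, the following
is a minimal sketch (not glibc's memcpy, and not part of this patch) of
a forward copy that only enters the word loop when the length reaches
the threshold. DEMO_THRES, word_t and demo_copy_fwd are hypothetical
names; DEMO_THRES mirrors the generic value of 16, and the snippet
assumes GCC or Clang for the attribute.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical threshold, mirroring the generic OP_T_THRES value.  */
#define DEMO_THRES 16

/* Hypothetical stand-in for op_t.  */
typedef unsigned long int __attribute__ ((__may_alias__)) word_t;

/* The alignment prologue copies fewer than sizeof (word_t) bytes, so
   the threshold must cover it to keep LEN from underflowing.  */
static_assert (DEMO_THRES >= sizeof (word_t), "threshold too small");

void *
demo_copy_fwd (void *dst, const void *src, size_t len)
{
  unsigned char *d = dst;
  const unsigned char *s = src;

  if (len >= DEMO_THRES)
    {
      /* Copy bytes until the destination is word aligned.  */
      size_t head = (-(uintptr_t) d) % sizeof (word_t);
      len -= head;
      while (head-- > 0)
        *d++ = *s++;

      /* Copy whole words.  The source may still be unaligned, so it
         is read through memcpy; the destination store is aligned.  */
      while (len >= sizeof (word_t))
        {
          word_t w;
          memcpy (&w, s, sizeof w);
          *(word_t *) d = w;
          d += sizeof w;
          s += sizeof w;
          len -= sizeof w;
        }
    }

  /* Short inputs and any remaining tail are copied bytewise.  */
  while (len-- > 0)
    *d++ = *s++;
  return dst;
}
```

A smaller threshold (such as the i386 value of 8) simply makes the word
loop kick in for shorter inputs.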
Co-authored-by: Adhemerval Zanella
Reviewed-by: Carlos O'Donell
---
 string/memcmp.c                            |  3 ---
 sysdeps/generic/memcopy.h                  |  4 +---
 sysdeps/generic/string-opthr.h             | 25 ++++++++++++++++++++++
 sysdeps/i386/memcopy.h                     |  3 ---
 sysdeps/i386/string-opthr.h                | 25 ++++++++++++++++++++++
 sysdeps/m68k/memcopy.h                     |  3 ---
 sysdeps/powerpc/powerpc32/power4/memcopy.h |  5 -----
 7 files changed, 51 insertions(+), 17 deletions(-)
 create mode 100644 sysdeps/generic/string-opthr.h
 create mode 100644 sysdeps/i386/string-opthr.h

diff --git a/string/memcmp.c b/string/memcmp.c
index ea0fa03e1c..047ca4f98e 100644
--- a/string/memcmp.c
+++ b/string/memcmp.c
@@ -48,9 +48,6 @@
    and store. Must be an unsigned type. */
 # define OPSIZ (sizeof (op_t))
 
-/* Threshold value for when to enter the unrolled loops. */
-# define OP_T_THRES 16
-
 /* Type to use for unaligned operations. */
 typedef unsigned char byte;
 
diff --git a/sysdeps/generic/memcopy.h b/sysdeps/generic/memcopy.h
index b5ffa4d114..e9b3f227b2 100644
--- a/sysdeps/generic/memcopy.h
+++ b/sysdeps/generic/memcopy.h
@@ -57,6 +57,7 @@
 
 /* Type to use for aligned memory operations. */
 #include
+#include
 #define OPSIZ (sizeof (op_t))
 
 /* Type to use for unaligned operations. */
@@ -188,9 +189,6 @@ extern void _wordcopy_bwd_dest_aligned (long int, long int, size_t)
 
 #endif
 
-/* Threshold value for when to enter the unrolled loops. */
-#define OP_T_THRES 16
-
 /* Set to 1 if memcpy is safe to use for forward-copying memmove with
    overlapping addresses. This is 0 by default because memcpy implementations
    are generally not safe for overlapping addresses. */
diff --git a/sysdeps/generic/string-opthr.h b/sysdeps/generic/string-opthr.h
new file mode 100644
index 0000000000..6f10a98edd
--- /dev/null
+++ b/sysdeps/generic/string-opthr.h
@@ -0,0 +1,25 @@
+/* Define a threshold for word access. Generic version.
+   Copyright (C) 2023 Free Software Foundation, Inc.
+   This file is part of the GNU C Library.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, see
+   . */
+
+#ifndef _STRING_OPTHR_H
+#define _STRING_OPTHR_H 1
+
+/* Threshold value for when to enter the unrolled loops. */
+#define OP_T_THRES 16
+
+#endif /* string-opthr.h */
diff --git a/sysdeps/i386/memcopy.h b/sysdeps/i386/memcopy.h
index 4f82689b84..1aa7c3a850 100644
--- a/sysdeps/i386/memcopy.h
+++ b/sysdeps/i386/memcopy.h
@@ -18,9 +18,6 @@
 
 #include
 
-#undef OP_T_THRES
-#define OP_T_THRES 8
-
 #undef BYTE_COPY_FWD
 #define BYTE_COPY_FWD(dst_bp, src_bp, nbytes) \
   do { \
diff --git a/sysdeps/i386/string-opthr.h b/sysdeps/i386/string-opthr.h
new file mode 100644
index 0000000000..ed3e4b2ddb
--- /dev/null
+++ b/sysdeps/i386/string-opthr.h
@@ -0,0 +1,25 @@
+/* Define a threshold for word access. i386 version.
+   Copyright (C) 2018 Free Software Foundation, Inc.
+   This file is part of the GNU C Library.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, see
+   . */
+
+#ifndef I386_STRING_OPTHR_H
+#define I386_STRING_OPTHR_H 1
+
+/* Threshold value for when to enter the unrolled loops. */
+#define OP_T_THRES 8
+
+#endif /* I386_STRING_OPTHR_H */
diff --git a/sysdeps/m68k/memcopy.h b/sysdeps/m68k/memcopy.h
index accd81c1c3..610577071d 100644
--- a/sysdeps/m68k/memcopy.h
+++ b/sysdeps/m68k/memcopy.h
@@ -20,9 +20,6 @@
 
 #if defined(__mc68020__) || defined(mc68020)
 
-#undef OP_T_THRES
-#define OP_T_THRES 16
-
 /* WORD_COPY_FWD and WORD_COPY_BWD are not symmetric on the 68020,
    because of its weird instruction overlap characteristics. */
 
diff --git a/sysdeps/powerpc/powerpc32/power4/memcopy.h b/sysdeps/powerpc/powerpc32/power4/memcopy.h
index 384f33b029..872157e485 100644
--- a/sysdeps/powerpc/powerpc32/power4/memcopy.h
+++ b/sysdeps/powerpc/powerpc32/power4/memcopy.h
@@ -50,11 +50,6 @@
    [I fail to understand. I feel stupid. --roland]
 */
 
-
-/* Threshold value for when to enter the unrolled loops. */
-#undef OP_T_THRES
-#define OP_T_THRES 16
-
 /* Copy exactly NBYTES bytes from SRC_BP to DST_BP,
    without any assumptions about alignment of the pointers. */
 #undef BYTE_COPY_FWD

From patchwork Wed Jan 11 20:45:44 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Adhemerval Zanella Netto
X-Patchwork-Id: 641265
To: libc-alpha@sourceware.org, Noah Goldstein, Richard Henderson
Cc: Adhemerval Zanella Netto
Subject: [PATCH v7 03/17] Add string-maskoff.h generic header
Date: Wed, 11 Jan 2023 17:45:44 -0300
Message-Id: <20230111204558.2402155-4-adhemerval.zanella@linaro.org>
In-Reply-To: <20230111204558.2402155-1-adhemerval.zanella@linaro.org>
References: <20230111204558.2402155-1-adhemerval.zanella@linaro.org>
From: Adhemerval Zanella Netto

Macros to operate on unaligned access for string operations:

  - create_mask: create a mask based on pointer alignment that sets up
    non-zero bytes before the beginning of the word, so a following
    operation (such as find zero) might ignore these bytes.

  - repeat_bytes: set up a word with each byte being c_in.

  - highbit_mask: mask off the high bit of each byte in a mask created
    by create_mask.

  - word_containing: return the address of the op_t word containing
    the address.
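
The helpers listed above can be exercised in isolation with a small
sketch like the one below. It copies the arithmetic of create_mask,
repeat_bytes and word_containing under hypothetical demo_ names, with
word_t standing in for op_t. The demo_has_zero_from function is an
assumption about how a caller might combine them (a classic
zero-byte test on the first, partially valid word), not code taken
from this series. GCC or Clang is assumed for the attribute and the
__BYTE_ORDER__ macro.

```c
#include <assert.h>
#include <limits.h>
#include <stdint.h>
#include <string.h>

static_assert (CHAR_BIT == 8, "the byte arithmetic assumes 8-bit bytes");

/* Hypothetical stand-in for op_t.  */
typedef unsigned long int __attribute__ ((__may_alias__)) word_t;

/* Same arithmetic as repeat_bytes: replicate C into every byte.  */
word_t
demo_repeat_bytes (unsigned char c)
{
  return ((word_t) -1 / 0xff) * c;
}

/* Same arithmetic as create_mask: 0xff in the first I % sizeof (word_t)
   bytes of the word in memory order, zero elsewhere.  */
word_t
demo_create_mask (uintptr_t i)
{
  i %= sizeof (word_t);
#if __BYTE_ORDER__ == __ORDER_LITTLE_ENDIAN__
  return ~(((word_t) -1) << (i * CHAR_BIT));
#else
  return ~(((word_t) -1) >> (i * CHAR_BIT));
#endif
}

/* Same arithmetic as word_containing: round P down to a word boundary.  */
const word_t *
demo_word_containing (const char *p)
{
  return (const word_t *) ((uintptr_t) p & -sizeof (word_t));
}

/* Assumed usage: return non-zero if the aligned word containing P has
   a zero byte at or after P.  Bytes before P are forced to non-zero
   by OR-ing in the mask, so they cannot produce a false match.  */
int
demo_has_zero_from (const char *p)
{
  word_t w = *demo_word_containing (p) | demo_create_mask ((uintptr_t) p);
  word_t ones = demo_repeat_bytes (0x01);
  word_t highs = demo_repeat_bytes (0x80);
  return ((w - ones) & ~w & highs) != 0;
}
```

Note that the aligned load reads bytes that precede P in the same
word; the mask is what keeps those bytes from affecting the result.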
These macros are meant to be used on optimized vectorized string implementations. --- sysdeps/generic/string-maskoff.h | 73 ++++++++++++++++++++++++++++++++ 1 file changed, 73 insertions(+) create mode 100644 sysdeps/generic/string-maskoff.h diff --git a/sysdeps/generic/string-maskoff.h b/sysdeps/generic/string-maskoff.h new file mode 100644 index 0000000000..73edd5ad0f --- /dev/null +++ b/sysdeps/generic/string-maskoff.h @@ -0,0 +1,73 @@ +/* Mask off bits. Generic C version. + Copyright (C) 2023 Free Software Foundation, Inc. + This file is part of the GNU C Library. + + The GNU C Library is free software; you can redistribute it and/or + modify it under the terms of the GNU Lesser General Public + License as published by the Free Software Foundation; either + version 2.1 of the License, or (at your option) any later version. + + The GNU C Library is distributed in the hope that it will be useful, + but WITHOUT ANY WARRANTY; without even the implied warranty of + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + Lesser General Public License for more details. + + You should have received a copy of the GNU Lesser General Public + License along with the GNU C Library; if not, see + . */ + +#ifndef _STRING_MASKOFF_H +#define _STRING_MASKOFF_H 1 + +#include +#include +#include +#include + +/* Provide a mask based on the pointer alignment that sets up non-zero + bytes before the beginning of the word. It is used to mask off + undesirable bits from an aligned read from an unaligned pointer. + For instance, on a 64 bits machine with a pointer alignment of + 3 the function returns 0x0000000000ffffff for LE and 0xffffff0000000000 + (meaning to mask off the initial 3 bytes). */ +static __always_inline op_t +create_mask (uintptr_t i) +{ + i = i % sizeof (op_t); + if (__BYTE_ORDER == __LITTLE_ENDIAN) + return ~(((op_t)-1) << (i * CHAR_BIT)); + else + return ~(((op_t)-1) >> (i * CHAR_BIT)); +} + +/* Setup an word with each byte being c_in. 
For instance, on a 64 bits + machine with input as 0xce the functions returns 0xcececececececece. */ +static __always_inline op_t +repeat_bytes (unsigned char c_in) +{ + return ((op_t)-1 / 0xff) * c_in; +} + +/* Based on mask created by 'create_mask', mask off the high bit of each + byte in the mask. It is used to mask off undesirable bits from an + aligned read from an unaligned pointer, and also taking care to avoid + match possible bytes meant to be matched. For instance, on a 64 bits + machine with a mask created from a pointer with an alignment of 3 + (0x0000000000ffffff) the function returns 0x7f7f7f0000000000 for BE + and 0x00000000007f7f7f for LE. */ +static __always_inline op_t +highbit_mask (op_t m) +{ + return m & repeat_bytes (0x7f); +} + +/* Return the address of the op_t word containing the address P. For + instance on address 0x0011223344556677 and op_t with size of 8, + it returns 0x0011223344556670. */ +static __always_inline op_t * +word_containing (char const *p) +{ + return (op_t *) ((uintptr_t) p & -sizeof (op_t)); +} + +#endif /* _STRING_MASKOFF_H */ From patchwork Wed Jan 11 20:45:45 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Adhemerval Zanella Netto X-Patchwork-Id: 641269 Delivered-To: patch@linaro.org Received: by 2002:a17:522:f3c4:b0:4b4:3859:abed with SMTP id in4csp3458383pvb; Wed, 11 Jan 2023 12:47:57 -0800 (PST) X-Google-Smtp-Source: AMrXdXtKUw+Sm9rQZUENWqOm1HCXiXDDZdaToiinfKuyVfqkomC3h3mY/IPGz6J9chjqLm9HzBx4 X-Received: by 2002:a17:907:6f09:b0:7f7:a985:1849 with SMTP id sy9-20020a1709076f0900b007f7a9851849mr66461097ejc.24.1673470077692; Wed, 11 Jan 2023 12:47:57 -0800 (PST) ARC-Seal: i=1; a=rsa-sha256; t=1673470077; cv=none; d=google.com; s=arc-20160816; b=dgmTnGwLQBkaCUF+UEEt0dZpxXzhy+6p7ghsVW1NU1ADB2UmA7j6Phm2Kylem2uPBJ pFqyiC5ZerxDHZbzNJ2WlaBUN64v4qnUcBRG0aNpYE27fqO7TpABjlEOW/AdIUOOtNgi +9YCx7OB/T6cn/M5viNf2Mgo5iUmNQvAEanbb+SfPQPjZj8tXGgiBSC07SFNR2fa73Cg 
To: libc-alpha@sourceware.org, Noah Goldstein, Richard Henderson
Cc: Adhemerval Zanella Netto
Subject: [PATCH v7 04/17] Add string vectorized find and detection functions
Date: Wed, 11 Jan 2023 17:45:45 -0300
Message-Id: <20230111204558.2402155-5-adhemerval.zanella@linaro.org>
X-Mailer: git-send-email 2.34.1
In-Reply-To: <20230111204558.2402155-1-adhemerval.zanella@linaro.org>
References: <20230111204558.2402155-1-adhemerval.zanella@linaro.org>
MIME-Version: 1.0
From: Adhemerval Zanella Netto
Reply-To: Adhemerval Zanella

This patch adds generic string find and detection functions meant to be
used in generic vectorized string implementations.  The idea is to
decompose the basic string operations so that each architecture can
reimplement them if it provides specialized hardware instructions.

The 'string-fza.h' header provides zero byte detection functions
(find_zero_low, find_zero_all, find_eq_low, find_eq_all,
find_zero_eq_low, find_zero_eq_all, find_zero_ne_low, and
find_zero_ne_all).  They are used by the functions provided by both
'string-fzb.h' and 'string-fzi.h'.

The 'string-fzb.h' header provides boolean zero byte detection with the
functions:

  - has_zero: determine if any byte within a word is zero.
  - has_eq: determine if any byte is equal between two words.
  - has_zero_eq: determine if any byte within a word is zero or is
    equal to the corresponding byte of a second word.

The 'string-fzi.h' header provides zero byte detection along with its
position:

  - index_first_zero: return index of first zero byte within a word.
  - index_first_eq: return index of first byte equal between two words.
  - index_first_zero_eq: return index of first zero byte within a word
    or first byte equal between two words.
  - index_first_zero_ne: return index of first zero byte within a word
    or first byte different between two words.
  - index_last_zero: return index of last zero byte within a word.
  - index_last_eq: return index of last byte equal between two words.
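
As an illustration of how these building blocks fit together, here is a
minimal, self-contained sketch.  It is not the code added by this patch:
it assumes a 64-bit little-endian target and the GCC/Clang
__builtin_ctzll/__builtin_clzll builtins, and it uses a hypothetical
word_t typedef in place of op_t.  The function bodies mirror the generic
versions from the diff.

```c
#include <limits.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumption: a 64-bit little-endian word stands in for glibc's op_t.  */
typedef uint64_t word_t;

/* Replicate C into every byte of the word (0xce -> 0xcececececececece).  */
static inline word_t
repeat_bytes (unsigned char c)
{
  return ((word_t) -1 / 0xff) * c;
}

/* Non-zero iff some byte of X is zero.  Only the 0x80 flag of the least
   significant zero byte is reliable; a borrow can set flags above it.  */
static inline word_t
find_zero_low (word_t x)
{
  word_t lsb = repeat_bytes (0x01);
  word_t msb = repeat_bytes (0x80);
  return (x - lsb) & ~x & msb;
}

/* Exact variant: 0x80 is set in a byte iff that byte of X is zero.  */
static inline word_t
find_zero_all (word_t x)
{
  word_t m = repeat_bytes (0x7f);
  return ~(((x & m) + m) | x | m);
}

/* Pure boolean test, so the cheaper inexact mask is sufficient.  */
static inline bool
has_zero (word_t x)
{
  return find_zero_low (x) != 0;
}

/* Little-endian only: memory-order index of the first zero byte.
   X must contain a zero byte.  */
static inline unsigned int
index_first_zero (word_t x)
{
  return (unsigned int) __builtin_ctzll (find_zero_low (x)) / CHAR_BIT;
}

/* Little-endian only: memory-order index of the last zero byte.
   The exact mask is required here.  X must contain a zero byte.  */
static inline unsigned int
index_last_zero (word_t x)
{
  return (unsigned int) (sizeof (word_t) - 1)
	 - (unsigned int) __builtin_clzll (find_zero_all (x)) / CHAR_BIT;
}
```

For the word 0x0000000000000100, find_zero_low flags byte 1 (value 0x01)
because of the borrow out of byte 0, while find_zero_all leaves it clear.
This is why the sketch, like the generic code for little-endian, uses the
"low" variant only when searching for the first zero byte and the "all"
variant when searching for the last one.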
Co-authored-by: Richard Henderson
---
 sysdeps/generic/string-extbyte.h | 37 +++++++++
 sysdeps/generic/string-fza.h | 116 ++++++++++++++++++++++++++
 sysdeps/generic/string-fzb.h | 49 +++++++++++
 sysdeps/generic/string-fzi.h | 135 +++++++++++++++++++++++++++++++
 4 files changed, 337 insertions(+)
 create mode 100644 sysdeps/generic/string-extbyte.h
 create mode 100644 sysdeps/generic/string-fza.h
 create mode 100644 sysdeps/generic/string-fzb.h
 create mode 100644 sysdeps/generic/string-fzi.h

diff --git a/sysdeps/generic/string-extbyte.h b/sysdeps/generic/string-extbyte.h
new file mode 100644
index 0000000000..38b4674dca
--- /dev/null
+++ b/sysdeps/generic/string-extbyte.h
@@ -0,0 +1,37 @@
+/* Extract byte from memory word.  Generic C version.
+   Copyright (C) 2023 Free Software Foundation, Inc.
+   This file is part of the GNU C Library.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, see
+   .  */
+
+#ifndef _STRING_EXTBYTE_H
+#define _STRING_EXTBYTE_H 1
+
+#include
+#include
+#include
+
+/* Extract the byte at index IDX from word X, with index 0 being the
+   first byte in memory order.
*/
+static __always_inline unsigned char
+extractbyte (op_t x, unsigned int idx)
+{
+  if (__BYTE_ORDER == __LITTLE_ENDIAN)
+    return x >> (idx * CHAR_BIT);
+  else
+    return x >> (sizeof (x) - 1 - idx) * CHAR_BIT;
+}
+
+#endif /* _STRING_EXTBYTE_H */
diff --git a/sysdeps/generic/string-fza.h b/sysdeps/generic/string-fza.h
new file mode 100644
index 0000000000..32bb9f06c2
--- /dev/null
+++ b/sysdeps/generic/string-fza.h
@@ -0,0 +1,116 @@
+/* Basic zero byte detection.  Generic C version.
+   Copyright (C) 2023 Free Software Foundation, Inc.
+   This file is part of the GNU C Library.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, see
+   .  */
+
+#ifndef _STRING_FZA_H
+#define _STRING_FZA_H 1
+
+#include
+#include
+#include
+
+/* The functions return a byte mask.  */
+typedef unsigned long int find_t;
+
+/* Return the mask WORD shifted based on the address value S, to ignore
+   values not present in the aligned word read.  */
+static __always_inline find_t
+shift_find (find_t word, uintptr_t s)
+{
+  if (__BYTE_ORDER == __LITTLE_ENDIAN)
+    return word >> (CHAR_BIT * (s % sizeof (op_t)));
+  else
+    return word << (CHAR_BIT * (s % sizeof (op_t)));
+}
+
+/* This function returns non-zero if any byte in X is zero.
+   More specifically, at least one bit set within the least significant
+   byte that was zero; other bytes within the word are indeterminate.
*/
+static __always_inline find_t
+find_zero_low (op_t x)
+{
+  /* This expression comes from
+       https://graphics.stanford.edu/~seander/bithacks.html#ZeroInWord
+     Subtracting 1 sets 0x80 in a byte that was 0; anding ~x clears
+     0x80 in a byte that was >= 128; anding 0x80 isolates that test bit.  */
+  op_t lsb = repeat_bytes (0x01);
+  op_t msb = repeat_bytes (0x80);
+  return (x - lsb) & ~x & msb;
+}
+
+/* This function returns at least one bit set within every byte of X that
+   is zero.  The result is exact in that, unlike find_zero_low, all bytes
+   are determinate.  This is usually used for finding the index of the
+   most significant byte that was zero.  */
+static __always_inline find_t
+find_zero_all (op_t x)
+{
+  /* For each byte, find not-zero by
+     (0) And 0x7f so that we cannot carry between bytes,
+     (1) Add 0x7f so that non-zero carries into 0x80,
+     (2) Or in the original byte (which might have had 0x80 set).
+     Then invert and mask such that 0x80 is set iff that byte was zero.  */
+  op_t m = repeat_bytes (0x7f);
+  return ~(((x & m) + m) | x | m);
+}
+
+/* With similar caveats, identify bytes that are equal between X1 and X2.  */
+static __always_inline find_t
+find_eq_low (op_t x1, op_t x2)
+{
+  return find_zero_low (x1 ^ x2);
+}
+
+static __always_inline find_t
+find_eq_all (op_t x1, op_t x2)
+{
+  return find_zero_all (x1 ^ x2);
+}
+
+/* With similar caveats, identify zero bytes in X1 and bytes that are
+   equal between X1 and X2.  */
+static __always_inline find_t
+find_zero_eq_low (op_t x1, op_t x2)
+{
+  return find_zero_low (x1) | find_zero_low (x1 ^ x2);
+}
+
+static __always_inline find_t
+find_zero_eq_all (op_t x1, op_t x2)
+{
+  return find_zero_all (x1) | find_zero_all (x1 ^ x2);
+}
+
+/* With similar caveats, identify zero bytes in X1 and bytes that are
+   not equal between X1 and X2.
*/
+static __always_inline find_t
+find_zero_ne_low (op_t x1, op_t x2)
+{
+  return (~find_zero_eq_low (x1, x2)) + 1;
+}
+
+static __always_inline find_t
+find_zero_ne_all (op_t x1, op_t x2)
+{
+  op_t m = repeat_bytes (0x7f);
+  op_t eq = x1 ^ x2;
+  op_t nz1 = ((x1 & m) + m) | x1;
+  op_t ne2 = ((eq & m) + m) | eq;
+  return (ne2 | ~nz1) & ~m;
+}
+
+#endif /* _STRING_FZA_H */
diff --git a/sysdeps/generic/string-fzb.h b/sysdeps/generic/string-fzb.h
new file mode 100644
index 0000000000..42de500d67
--- /dev/null
+++ b/sysdeps/generic/string-fzb.h
@@ -0,0 +1,49 @@
+/* Zero byte detection, boolean.  Generic C version.
+   Copyright (C) 2023 Free Software Foundation, Inc.
+   This file is part of the GNU C Library.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, see
+   .  */
+
+#ifndef _STRING_FZB_H
+#define _STRING_FZB_H 1
+
+#include
+#include
+
+/* Determine if any byte within X is zero.  This is a pure boolean test.  */
+
+static __always_inline _Bool
+has_zero (op_t x)
+{
+  return find_zero_low (x) != 0;
+}
+
+/* Likewise, but for byte equality between X1 and X2.  */
+
+static __always_inline _Bool
+has_eq (op_t x1, op_t x2)
+{
+  return find_eq_low (x1, x2) != 0;
+}
+
+/* Likewise, but for zeros in X1 and equal bytes between X1 and X2.
*/
+
+static __always_inline _Bool
+has_zero_eq (op_t x1, op_t x2)
+{
+  return find_zero_eq_low (x1, x2);
+}
+
+#endif /* _STRING_FZB_H */
diff --git a/sysdeps/generic/string-fzi.h b/sysdeps/generic/string-fzi.h
new file mode 100644
index 0000000000..b1fd4d34b3
--- /dev/null
+++ b/sysdeps/generic/string-fzi.h
@@ -0,0 +1,135 @@
+/* Zero byte detection; indexes.  Generic C version.
+   Copyright (C) 2023 Free Software Foundation, Inc.
+   This file is part of the GNU C Library.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, see
+   .  */
+
+#ifndef _STRING_FZI_H
+#define _STRING_FZI_H 1
+
+#include
+#include
+#include
+
+static __always_inline int
+clz (op_t c)
+{
+  if (sizeof (op_t) == sizeof (unsigned long))
+    return __builtin_clzl (c);
+  else
+    return __builtin_clzll (c);
+}
+
+static __always_inline int
+ctz (op_t c)
+{
+  if (sizeof (op_t) == sizeof (unsigned long))
+    return __builtin_ctzl (c);
+  else
+    return __builtin_ctzll (c);
+}
+
+/* A subroutine for the index_zero functions.  Given a test word C, return
+   the (memory order) index of the first byte (in memory order) that is
+   non-zero.  */
+static __always_inline unsigned int
+index_first (op_t c)
+{
+  int r;
+  if (__BYTE_ORDER == __LITTLE_ENDIAN)
+    r = ctz (c);
+  else
+    r = clz (c);
+  return r / CHAR_BIT;
+}
+
+/* Similarly, but return the (memory order) index of the last byte that is
+   non-zero.
*/
+static __always_inline unsigned int
+index_last (op_t c)
+{
+  int r;
+  if (__BYTE_ORDER == __LITTLE_ENDIAN)
+    r = clz (c);
+  else
+    r = ctz (c);
+  return sizeof (op_t) - 1 - (r / CHAR_BIT);
+}
+
+/* Given a word X that is known to contain a zero byte, return the index of
+   the first such within the word in memory order.  */
+static __always_inline unsigned int
+index_first_zero (op_t x)
+{
+  if (__BYTE_ORDER == __LITTLE_ENDIAN)
+    x = find_zero_low (x);
+  else
+    x = find_zero_all (x);
+  return index_first (x);
+}
+
+/* Similarly, but perform the search for byte equality between X1 and X2.  */
+static __always_inline unsigned int
+index_first_eq (op_t x1, op_t x2)
+{
+  if (__BYTE_ORDER == __LITTLE_ENDIAN)
+    x1 = find_eq_low (x1, x2);
+  else
+    x1 = find_eq_all (x1, x2);
+  return index_first (x1);
+}
+
+/* Similarly, but perform the search for zero within X1 or equality between
+   X1 and X2.  */
+static __always_inline unsigned int
+index_first_zero_eq (op_t x1, op_t x2)
+{
+  if (__BYTE_ORDER == __LITTLE_ENDIAN)
+    x1 = find_zero_eq_low (x1, x2);
+  else
+    x1 = find_zero_eq_all (x1, x2);
+  return index_first (x1);
+}
+
+/* Similarly, but perform the search for zero within X1 or inequality between
+   X1 and X2.  */
+static __always_inline unsigned int
+index_first_zero_ne (op_t x1, op_t x2)
+{
+  if (__BYTE_ORDER == __LITTLE_ENDIAN)
+    x1 = find_zero_ne_low (x1, x2);
+  else
+    x1 = find_zero_ne_all (x1, x2);
+  return index_first (x1);
+}
+
+/* Similarly, but search for the last zero within X.
*/
+static __always_inline unsigned int
+index_last_zero (op_t x)
+{
+  if (__BYTE_ORDER == __LITTLE_ENDIAN)
+    x = find_zero_all (x);
+  else
+    x = find_zero_low (x);
+  return index_last (x);
+}
+
+static __always_inline unsigned int
+index_last_eq (op_t x1, op_t x2)
+{
+  return index_last_zero (x1 ^ x2);
+}
+
+#endif /* _STRING_FZI_H */

From patchwork Wed Jan 11 20:45:46 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Adhemerval Zanella Netto
X-Patchwork-Id: 641264
To: libc-alpha@sourceware.org, Noah Goldstein, Richard Henderson
Cc: Adhemerval Zanella Netto
Subject: [PATCH v7 05/17] string: Improve generic strlen
Date: Wed, 11 Jan 2023 17:45:46 -0300
Message-Id: <20230111204558.2402155-6-adhemerval.zanella@linaro.org>
X-Mailer: git-send-email 2.34.1
In-Reply-To: <20230111204558.2402155-1-adhemerval.zanella@linaro.org>
References: <20230111204558.2402155-1-adhemerval.zanella@linaro.org>
MIME-Version: 1.0
From: Adhemerval Zanella Netto
Reply-To: Adhemerval Zanella

The new algorithm has the following key differences:

  - It reads the first word with an aligned load of the word that
    contains the (possibly unaligned) start of the string and uses the
    string-maskoff functions to remove unwanted data.  This strategy
    follows the arch-specific optimizations used on powerpc, sparc,
    and SH.

  - It uses the parametrized has_zero and index_first_zero functions.

Checked on x86_64-linux-gnu, i686-linux-gnu, powerpc-linux-gnu, and
powerpc64-linux-gnu by removing the arch-specific assembly
implementation and disabling multi-arch (it covers both LE and BE for
64 and 32 bits).
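
The strategy described above can be sketched as the following
standalone program fragment.  This is an illustration under stated
assumptions, not the patched string/strlen.c: word_t is a hypothetical
64-bit stand-in for op_t, WORD_IS_LE relies on the GCC/Clang predefined
byte-order macros, the exact find_zero_all test is used for both
endiannesses for brevity, and sketch_strlen, load_word, and demo_len are
names invented for this example.  Note that reading whole aligned words
can touch bytes outside the string, which ISO C does not permit in
general; the demo helper therefore keeps every read inside one aligned
buffer.

```c
#include <limits.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Assumption: a 64-bit word stands in for glibc's op_t.  */
typedef uint64_t word_t;

/* Assumption: GCC/Clang predefined byte-order macros are available.  */
#define WORD_IS_LE (__BYTE_ORDER__ == __ORDER_LITTLE_ENDIAN__)

static inline word_t
repeat_bytes (unsigned char c)
{
  return ((word_t) -1 / 0xff) * c;
}

/* All-ones in the bytes that precede address I within its aligned word.  */
static inline word_t
create_mask (uintptr_t i)
{
  i %= sizeof (word_t);
  if (WORD_IS_LE)
    return ~((word_t) -1 << (i * CHAR_BIT));
  else
    return ~((word_t) -1 >> (i * CHAR_BIT));
}

/* 0x80 is set in a byte of the result iff that byte of X is zero.  */
static inline word_t
find_zero_all (word_t x)
{
  word_t m = repeat_bytes (0x7f);
  return ~(((x & m) + m) | x | m);
}

/* Memory-order index of the first zero byte; X must contain one.  */
static inline unsigned int
index_first_zero (word_t x)
{
  word_t z = find_zero_all (x);
  int bits = WORD_IS_LE ? __builtin_ctzll (z) : __builtin_clzll (z);
  return (unsigned int) bits / CHAR_BIT;
}

/* Load one word; memcpy sidesteps strict-aliasing concerns in this sketch.  */
static inline word_t
load_word (const char *p)
{
  word_t w;
  memcpy (&w, p, sizeof w);
  return w;
}

static size_t
sketch_strlen (const char *str)
{
  uintptr_t s_int = (uintptr_t) str;
  /* Round down to the aligned word that contains STR.  */
  const char *p = (const char *) (s_int & -(uintptr_t) sizeof (word_t));
  /* Force the bytes before STR to non-zero so they cannot look like NUL.  */
  word_t word = load_word (p) | create_mask (s_int);

  while (find_zero_all (word) == 0)
    {
      p += sizeof (word_t);
      word = load_word (p);
    }
  return (size_t) (p + index_first_zero (word) - str);
}

/* Demo helper: place S at OFFSET inside a zeroed, word-aligned buffer so
   that all aligned reads stay in bounds.  OFFSET + strlen (S) must be
   smaller than 64.  */
static size_t
demo_len (const char *s, size_t offset)
{
  static _Alignas (word_t) char buf[64];
  memset (buf, 0, sizeof buf);
  strcpy (buf + offset, s);
  return sketch_strlen (buf + offset);
}
```

For example, demo_len ("hello, world", 3) returns 12.  The empty string
at offset 3 shows why the mask matters: the three bytes before the
string are zero in the buffer, and without create_mask the first word
would report a terminator in front of the string.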
Co-authored-by: Richard Henderson --- string/strlen.c | 90 +++++++++-------------------------------- sysdeps/s390/strlen-c.c | 10 +++-- 2 files changed, 26 insertions(+), 74 deletions(-) diff --git a/string/strlen.c b/string/strlen.c index ee1aae0fff..a69f3343ef 100644 --- a/string/strlen.c +++ b/string/strlen.c @@ -17,84 +17,34 @@ #include #include - -#undef strlen - -#ifndef STRLEN -# define STRLEN strlen +#include +#include +#include +#include +#include + +#ifdef STRLEN +# define __strlen STRLEN #endif /* Return the length of the null-terminated string STR. Scan for the null terminator quickly by testing four bytes at a time. */ size_t -STRLEN (const char *str) +__strlen (const char *str) { - const char *char_ptr; - const unsigned long int *longword_ptr; - unsigned long int longword, himagic, lomagic; - - /* Handle the first few characters by reading one character at a time. - Do this until CHAR_PTR is aligned on a longword boundary. */ - for (char_ptr = str; ((unsigned long int) char_ptr - & (sizeof (longword) - 1)) != 0; - ++char_ptr) - if (*char_ptr == '\0') - return char_ptr - str; - - /* All these elucidatory comments refer to 4-byte longwords, - but the theory applies equally well to 8-byte longwords. */ + /* Align pointer to sizeof op_t. */ + const uintptr_t s_int = (uintptr_t) str; + const op_t *word_ptr = word_containing (str); - longword_ptr = (unsigned long int *) char_ptr; + /* Read and MASK the first word. */ + op_t word = *word_ptr | create_mask (s_int); - /* Computing (longword - lomagic) sets the high bit of any corresponding - byte that is either zero or greater than 0x80. The latter case can be - filtered out by computing (~longword & himagic). The final result - will always be non-zero if one of the bytes of longword is zero. */ - himagic = 0x80808080L; - lomagic = 0x01010101L; - if (sizeof (longword) > 4) - { - /* 64-bit version of the magic. */ - /* Do the shift in two steps to avoid a warning if long has 32 bits. 
*/ - himagic = ((himagic << 16) << 16) | himagic; - lomagic = ((lomagic << 16) << 16) | lomagic; - } - if (sizeof (longword) > 8) - abort (); + while (! has_zero (word)) + word = *++word_ptr; - /* Instead of the traditional loop which tests each character, - we will test a longword at a time. The tricky part is testing - if *any of the four* bytes in the longword in question are zero. */ - for (;;) - { - longword = *longword_ptr++; - - if (((longword - lomagic) & ~longword & himagic) != 0) - { - /* Which of the bytes was the zero? */ - - const char *cp = (const char *) (longword_ptr - 1); - - if (cp[0] == 0) - return cp - str; - if (cp[1] == 0) - return cp - str + 1; - if (cp[2] == 0) - return cp - str + 2; - if (cp[3] == 0) - return cp - str + 3; - if (sizeof (longword) > 4) - { - if (cp[4] == 0) - return cp - str + 4; - if (cp[5] == 0) - return cp - str + 5; - if (cp[6] == 0) - return cp - str + 6; - if (cp[7] == 0) - return cp - str + 7; - } - } - } + return ((const char *) word_ptr) + index_first_zero (word) - str; } +#ifndef STRLEN +weak_alias (__strlen, strlen) libc_hidden_builtin_def (strlen) +#endif diff --git a/sysdeps/s390/strlen-c.c b/sysdeps/s390/strlen-c.c index b829ef2452..0a33a6f8e5 100644 --- a/sysdeps/s390/strlen-c.c +++ b/sysdeps/s390/strlen-c.c @@ -21,12 +21,14 @@ #if HAVE_STRLEN_C # if HAVE_STRLEN_IFUNC # define STRLEN STRLEN_C +# endif + +# include + +# if HAVE_STRLEN_IFUNC # if defined SHARED && IS_IN (libc) -# undef libc_hidden_builtin_def -# define libc_hidden_builtin_def(name) \ - __hidden_ver1 (__strlen_c, __GI_strlen, __strlen_c); +__hidden_ver1 (__strlen_c, __GI_strlen, __strlen_c); # endif # endif -# include #endif From patchwork Wed Jan 11 20:45:47 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Adhemerval Zanella Netto X-Patchwork-Id: 641263 Delivered-To: patch@linaro.org Received: by 2002:a17:522:f3c4:b0:4b4:3859:abed with SMTP id in4csp3458019pvb; Wed, 11 Jan 
s=20210112; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=lmtxRpsTK2s7EUw/8kxbXxelnWJd50VsJ1oOFlPHVEk=; b=BC5WXRWufTt1cjzeEpujU3DUl431vfMXmt1v7+DurwjISLHhFRWU62p7LI2Z4NBnEI jkIcayXwCjMDn8iut/YD1Nmcur942u0Inn1qj90dwBp2U+Zw6Y4kF0WivlhsrUhNInPo FjK3u2cRHrr0lMuOqZE+zA7PYkXejCPU9TF8BohFDm483PaAdbMrgzz0/Bqi6kPAuuFP bXXC+ku3GhGS66GNSUT4C4Ly57whZE0q3uLRCvIY3/QL4Jpd6rX8mu1FpgklyEipV1sW 7mSaQsumIf3wFqpHvUwiUD6P2UxZIekXkWDp8Bz4wn8+rj/IJF4gaSl5UI1MNwcWBazY 6fRQ== X-Gm-Message-State: AFqh2kodc9to+eiehEmf5wHhMdmqqeGOJCs2JppIg4FZzPvGz2y6SYg0 PkDx4LpdfUwXq+zRKMY7TAsAH5GhRdjFhIArZ8k= X-Received: by 2002:a05:6870:b97:b0:144:efca:99f3 with SMTP id lg23-20020a0568700b9700b00144efca99f3mr1693403oab.3.1673469976910; Wed, 11 Jan 2023 12:46:16 -0800 (PST) Received: from mandiga.. ([2804:1b3:a7c0:a93a:a504:f3f6:dd7b:801]) by smtp.gmail.com with ESMTPSA id kw18-20020a056870ac1200b0014c8b5d54b2sm7990274oab.20.2023.01.11.12.46.15 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 11 Jan 2023 12:46:16 -0800 (PST) To: libc-alpha@sourceware.org, Noah Goldstein , Richard Henderson Cc: Adhemerval Zanella Netto Subject: [PATCH v7 06/17] string: Improve generic strnlen Date: Wed, 11 Jan 2023 17:45:47 -0300 Message-Id: <20230111204558.2402155-7-adhemerval.zanella@linaro.org> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20230111204558.2402155-1-adhemerval.zanella@linaro.org> References: <20230111204558.2402155-1-adhemerval.zanella@linaro.org> MIME-Version: 1.0 X-Spam-Status: No, score=-13.2 required=5.0 tests=BAYES_00, DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, DKIM_VALID_EF, GIT_PATCH_0, KAM_SHORT, RCVD_IN_DNSWL_NONE, SPF_HELO_NONE, SPF_PASS, TXREP autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org X-BeenThere: libc-alpha@sourceware.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: 
Libc-alpha mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-Patchwork-Original-From: Adhemerval Zanella via Libc-alpha From: Adhemerval Zanella Netto Reply-To: Adhemerval Zanella Errors-To: libc-alpha-bounces+patch=linaro.org@sourceware.org Sender: "Libc-alpha" From: Adhemerval Zanella Netto With an optimized memchr, the new strnlen implementation basically calls memchr and adjusts the result pointer value. It also cleans up the multiple inclusion by leaving the ifunc implementation to undef the weak_alias and libc_hidden_def. Co-authored-by: Richard Henderson --- string/strnlen.c | 137 +----------------- sysdeps/i386/i686/multiarch/strnlen-c.c | 14 +- .../power4/multiarch/strnlen-ppc32.c | 14 +- sysdeps/s390/strnlen-c.c | 14 +- 4 files changed, 27 insertions(+), 152 deletions(-) diff --git a/string/strnlen.c b/string/strnlen.c index 6ff294eab1..dc23354ec8 100644 --- a/string/strnlen.c +++ b/string/strnlen.c @@ -1,10 +1,6 @@ /* Find the length of STRING, but scan at most MAXLEN characters. Copyright (C) 1991-2023 Free Software Foundation, Inc. - Based on strlen written by Torbjorn Granlund (tege@sics.se), - with help from Dan Sahlin (dan@sics.se); - commentary by Jim Blandy (jimb@ai.mit.edu). - The GNU C Library is free software; you can redistribute it and/or modify it under the terms of the GNU Lesser General Public License as published by the Free Software Foundation; either version 2.1 of the @@ -20,7 +16,6 @@ not, see . */ #include -#include /* Find the length of S, but scan at most MAXLEN characters. If no '\0' terminator is found in that many characters, return MAXLEN.
*/ @@ -32,134 +27,12 @@ size_t __strnlen (const char *str, size_t maxlen) { - const char *char_ptr, *end_ptr = str + maxlen; - const unsigned long int *longword_ptr; - unsigned long int longword, himagic, lomagic; - - if (maxlen == 0) - return 0; - - if (__glibc_unlikely (end_ptr < str)) - end_ptr = (const char *) ~0UL; - - /* Handle the first few characters by reading one character at a time. - Do this until CHAR_PTR is aligned on a longword boundary. */ - for (char_ptr = str; ((unsigned long int) char_ptr - & (sizeof (longword) - 1)) != 0; - ++char_ptr) - if (*char_ptr == '\0') - { - if (char_ptr > end_ptr) - char_ptr = end_ptr; - return char_ptr - str; - } - - /* All these elucidatory comments refer to 4-byte longwords, - but the theory applies equally well to 8-byte longwords. */ - - longword_ptr = (unsigned long int *) char_ptr; - - /* Bits 31, 24, 16, and 8 of this number are zero. Call these bits - the "holes." Note that there is a hole just to the left of - each byte, with an extra at the end: - - bits: 01111110 11111110 11111110 11111111 - bytes: AAAAAAAA BBBBBBBB CCCCCCCC DDDDDDDD - - The 1-bits make sure that carries propagate to the next 0-bit. - The 0-bits provide holes for carries to fall into. */ - himagic = 0x80808080L; - lomagic = 0x01010101L; - if (sizeof (longword) > 4) - { - /* 64-bit version of the magic. */ - /* Do the shift in two steps to avoid a warning if long has 32 bits. */ - himagic = ((himagic << 16) << 16) | himagic; - lomagic = ((lomagic << 16) << 16) | lomagic; - } - if (sizeof (longword) > 8) - abort (); - - /* Instead of the traditional loop which tests each character, - we will test a longword at a time. The tricky part is testing - if *any of the four* bytes in the longword in question are zero. */ - while (longword_ptr < (unsigned long int *) end_ptr) - { - /* We tentatively exit the loop if adding MAGIC_BITS to - LONGWORD fails to change any of the hole bits of LONGWORD. - - 1) Is this safe? Will it catch all the zero bytes? 
- Suppose there is a byte with all zeros. Any carry bits - propagating from its left will fall into the hole at its - least significant bit and stop. Since there will be no - carry from its most significant bit, the LSB of the - byte to the left will be unchanged, and the zero will be - detected. - - 2) Is this worthwhile? Will it ignore everything except - zero bytes? Suppose every byte of LONGWORD has a bit set - somewhere. There will be a carry into bit 8. If bit 8 - is set, this will carry into bit 16. If bit 8 is clear, - one of bits 9-15 must be set, so there will be a carry - into bit 16. Similarly, there will be a carry into bit - 24. If one of bits 24-30 is set, there will be a carry - into bit 31, so all of the hole bits will be changed. - - The one misfire occurs when bits 24-30 are clear and bit - 31 is set; in this case, the hole at bit 31 is not - changed. If we had access to the processor carry flag, - we could close this loophole by putting the fourth hole - at bit 32! - - So it ignores everything except 128's, when they're aligned - properly. */ - - longword = *longword_ptr++; - - if ((longword - lomagic) & himagic) - { - /* Which of the bytes was the zero? If none of them were, it was - a misfire; continue the search. */ - - const char *cp = (const char *) (longword_ptr - 1); - - char_ptr = cp; - if (cp[0] == 0) - break; - char_ptr = cp + 1; - if (cp[1] == 0) - break; - char_ptr = cp + 2; - if (cp[2] == 0) - break; - char_ptr = cp + 3; - if (cp[3] == 0) - break; - if (sizeof (longword) > 4) - { - char_ptr = cp + 4; - if (cp[4] == 0) - break; - char_ptr = cp + 5; - if (cp[5] == 0) - break; - char_ptr = cp + 6; - if (cp[6] == 0) - break; - char_ptr = cp + 7; - if (cp[7] == 0) - break; - } - } - char_ptr = end_ptr; - } - - if (char_ptr > end_ptr) - char_ptr = end_ptr; - return char_ptr - str; + const char *found = memchr (str, '\0', maxlen); + return found ? 
found - str : maxlen; } + #ifndef STRNLEN -libc_hidden_def (__strnlen) weak_alias (__strnlen, strnlen) -#endif +libc_hidden_def (__strnlen) libc_hidden_def (strnlen) +#endif diff --git a/sysdeps/i386/i686/multiarch/strnlen-c.c b/sysdeps/i386/i686/multiarch/strnlen-c.c index 351e939a93..beb0350d53 100644 --- a/sysdeps/i386/i686/multiarch/strnlen-c.c +++ b/sysdeps/i386/i686/multiarch/strnlen-c.c @@ -1,10 +1,10 @@ #define STRNLEN __strnlen_ia32 +#include + #ifdef SHARED -# undef libc_hidden_def -# define libc_hidden_def(name) \ - __hidden_ver1 (__strnlen_ia32, __GI_strnlen, __strnlen_ia32); \ - strong_alias (__strnlen_ia32, __strnlen_ia32_1); \ - __hidden_ver1 (__strnlen_ia32_1, __GI___strnlen, __strnlen_ia32_1); +/* Alias for internal symbol to avoid PLT generation, it redirects the + libc_hidden_def (__strnlen/strnlen) to default implementation. */ +__hidden_ver1 (__strnlen_ia32, __GI_strnlen, __strnlen_ia32); +strong_alias (__strnlen_ia32, __strnlen_ia32_1); +__hidden_ver1 (__strnlen_ia32_1, __GI___strnlen, __strnlen_ia32_1); #endif - -#include "string/strnlen.c" diff --git a/sysdeps/powerpc/powerpc32/power4/multiarch/strnlen-ppc32.c b/sysdeps/powerpc/powerpc32/power4/multiarch/strnlen-ppc32.c index 957b9b99e8..2ca1cd7181 100644 --- a/sysdeps/powerpc/powerpc32/power4/multiarch/strnlen-ppc32.c +++ b/sysdeps/powerpc/powerpc32/power4/multiarch/strnlen-ppc32.c @@ -17,12 +17,12 @@ . */ #define STRNLEN __strnlen_ppc +#include + #ifdef SHARED -# undef libc_hidden_def -# define libc_hidden_def(name) \ - __hidden_ver1 (__strnlen_ppc, __GI_strnlen, __strnlen_ppc); \ - strong_alias (__strnlen_ppc, __strnlen_ppc_1); \ - __hidden_ver1 (__strnlen_ppc_1, __GI___strnlen, __strnlen_ppc_1); +/* Alias for internal symbol to avoid PLT generation, it redirects the + libc_hidden_def (__strnlen/strnlen) to default implementation.
*/ +__hidden_ver1 (__strnlen_ppc, __GI_strnlen, __strnlen_ppc); \ +strong_alias (__strnlen_ppc, __strnlen_ppc_1); \ +__hidden_ver1 (__strnlen_ppc_1, __GI___strnlen, __strnlen_ppc_1); #endif - -#include diff --git a/sysdeps/s390/strnlen-c.c b/sysdeps/s390/strnlen-c.c index 172fcc7caa..95156a0ff5 100644 --- a/sysdeps/s390/strnlen-c.c +++ b/sysdeps/s390/strnlen-c.c @@ -21,14 +21,16 @@ #if HAVE_STRNLEN_C # if HAVE_STRNLEN_IFUNC # define STRNLEN STRNLEN_C +# endif + +# include + +# if HAVE_STRNLEN_IFUNC # if defined SHARED && IS_IN (libc) -# undef libc_hidden_def -# define libc_hidden_def(name) \ - __hidden_ver1 (__strnlen_c, __GI_strnlen, __strnlen_c); \ - strong_alias (__strnlen_c, __strnlen_c_1); \ - __hidden_ver1 (__strnlen_c_1, __GI___strnlen, __strnlen_c_1); +__hidden_ver1 (__strnlen_c, __GI_strnlen, __strnlen_c); +strong_alias (__strnlen_c, __strnlen_c_1); +__hidden_ver1 (__strnlen_c_1, __GI___strnlen, __strnlen_c_1); # endif # endif -# include #endif From patchwork Wed Jan 11 20:45:48 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Adhemerval Zanella Netto X-Patchwork-Id: 641266 Delivered-To: patch@linaro.org To: libc-alpha@sourceware.org, Noah Goldstein , Richard Henderson Cc: Adhemerval Zanella Netto Subject: [PATCH v7 07/17] string: Improve generic strchr Date: Wed, 11 Jan 2023 17:45:48 -0300 Message-Id: <20230111204558.2402155-8-adhemerval.zanella@linaro.org> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20230111204558.2402155-1-adhemerval.zanella@linaro.org> References: <20230111204558.2402155-1-adhemerval.zanella@linaro.org> MIME-Version: 1.0 X-BeenThere:
libc-alpha@sourceware.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Libc-alpha mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-Patchwork-Original-From: Adhemerval Zanella via Libc-alpha From: Adhemerval Zanella Netto Reply-To: Adhemerval Zanella Errors-To: libc-alpha-bounces+patch=linaro.org@sourceware.org Sender: "Libc-alpha" From: Adhemerval Zanella Netto New algorithm now calls strchrnul. Checked on x86_64-linux-gnu, i686-linux-gnu, powerpc-linux-gnu, and powerpc64-linux-gnu by removing the arch-specific assembly implementation and disabling multi-arch (it covers both LE and BE for 64 and 32 bits). --- string/strchr.c | 159 ++-------------------------------------- sysdeps/s390/strchr-c.c | 11 +-- 2 files changed, 14 insertions(+), 156 deletions(-) diff --git a/string/strchr.c b/string/strchr.c index 1572b8b42e..30c3eb10f2 100644 --- a/string/strchr.c +++ b/string/strchr.c @@ -21,165 +21,22 @@ . */ #include -#include #undef strchr +#undef index -#ifndef STRCHR -# define STRCHR strchr +#ifdef STRCHR +# define strchr STRCHR #endif /* Find the first occurrence of C in S. */ char * -STRCHR (const char *s, int c_in) +strchr (const char *s, int c_in) { - const unsigned char *char_ptr; - const unsigned long int *longword_ptr; - unsigned long int longword, magic_bits, charmask; - unsigned char c; - - c = (unsigned char) c_in; - - /* Handle the first few characters by reading one character at a time. - Do this until CHAR_PTR is aligned on a longword boundary. */ - for (char_ptr = (const unsigned char *) s; - ((unsigned long int) char_ptr & (sizeof (longword) - 1)) != 0; - ++char_ptr) - if (*char_ptr == c) - return (void *) char_ptr; - else if (*char_ptr == '\0') - return NULL; - - /* All these elucidatory comments refer to 4-byte longwords, - but the theory applies equally well to 8-byte longwords. */ - - longword_ptr = (unsigned long int *) char_ptr; - - /* Bits 31, 24, 16, and 8 of this number are zero. 
Call these bits - the "holes." Note that there is a hole just to the left of - each byte, with an extra at the end: - - bits: 01111110 11111110 11111110 11111111 - bytes: AAAAAAAA BBBBBBBB CCCCCCCC DDDDDDDD - - The 1-bits make sure that carries propagate to the next 0-bit. - The 0-bits provide holes for carries to fall into. */ - magic_bits = -1; - magic_bits = magic_bits / 0xff * 0xfe << 1 >> 1 | 1; - - /* Set up a longword, each of whose bytes is C. */ - charmask = c | (c << 8); - charmask |= charmask << 16; - if (sizeof (longword) > 4) - /* Do the shift in two steps to avoid a warning if long has 32 bits. */ - charmask |= (charmask << 16) << 16; - if (sizeof (longword) > 8) - abort (); - - /* Instead of the traditional loop which tests each character, - we will test a longword at a time. The tricky part is testing - if *any of the four* bytes in the longword in question are zero. */ - for (;;) - { - /* We tentatively exit the loop if adding MAGIC_BITS to - LONGWORD fails to change any of the hole bits of LONGWORD. - - 1) Is this safe? Will it catch all the zero bytes? - Suppose there is a byte with all zeros. Any carry bits - propagating from its left will fall into the hole at its - least significant bit and stop. Since there will be no - carry from its most significant bit, the LSB of the - byte to the left will be unchanged, and the zero will be - detected. - - 2) Is this worthwhile? Will it ignore everything except - zero bytes? Suppose every byte of LONGWORD has a bit set - somewhere. There will be a carry into bit 8. If bit 8 - is set, this will carry into bit 16. If bit 8 is clear, - one of bits 9-15 must be set, so there will be a carry - into bit 16. Similarly, there will be a carry into bit - 24. If one of bits 24-30 is set, there will be a carry - into bit 31, so all of the hole bits will be changed. - - The one misfire occurs when bits 24-30 are clear and bit - 31 is set; in this case, the hole at bit 31 is not - changed. 
If we had access to the processor carry flag, - we could close this loophole by putting the fourth hole - at bit 32! - - So it ignores everything except 128's, when they're aligned - properly. - - 3) But wait! Aren't we looking for C as well as zero? - Good point. So what we do is XOR LONGWORD with a longword, - each of whose bytes is C. This turns each byte that is C - into a zero. */ - - longword = *longword_ptr++; - - /* Add MAGIC_BITS to LONGWORD. */ - if ((((longword + magic_bits) - - /* Set those bits that were unchanged by the addition. */ - ^ ~longword) - - /* Look at only the hole bits. If any of the hole bits - are unchanged, most likely one of the bytes was a - zero. */ - & ~magic_bits) != 0 - - /* That caught zeroes. Now test for C. */ - || ((((longword ^ charmask) + magic_bits) ^ ~(longword ^ charmask)) - & ~magic_bits) != 0) - { - /* Which of the bytes was C or zero? - If none of them were, it was a misfire; continue the search. */ - - const unsigned char *cp = (const unsigned char *) (longword_ptr - 1); - - if (*cp == c) - return (char *) cp; - else if (*cp == '\0') - return NULL; - if (*++cp == c) - return (char *) cp; - else if (*cp == '\0') - return NULL; - if (*++cp == c) - return (char *) cp; - else if (*cp == '\0') - return NULL; - if (*++cp == c) - return (char *) cp; - else if (*cp == '\0') - return NULL; - if (sizeof (longword) > 4) - { - if (*++cp == c) - return (char *) cp; - else if (*cp == '\0') - return NULL; - if (*++cp == c) - return (char *) cp; - else if (*cp == '\0') - return NULL; - if (*++cp == c) - return (char *) cp; - else if (*cp == '\0') - return NULL; - if (*++cp == c) - return (char *) cp; - else if (*cp == '\0') - return NULL; - } - } - } - - return NULL; + char *r = __strchrnul (s, c_in); + return (*(unsigned char *)r == (unsigned char)c_in) ? 
r : NULL; } - -#ifdef weak_alias -# undef index +#ifndef STRCHR weak_alias (strchr, index) -#endif libc_hidden_builtin_def (strchr) +#endif diff --git a/sysdeps/s390/strchr-c.c b/sysdeps/s390/strchr-c.c index c00f2cceea..90822ae0f4 100644 --- a/sysdeps/s390/strchr-c.c +++ b/sysdeps/s390/strchr-c.c @@ -21,13 +21,14 @@ #if HAVE_STRCHR_C # if HAVE_STRCHR_IFUNC # define STRCHR STRCHR_C -# undef weak_alias +# endif + +# include + +# if HAVE_STRCHR_IFUNC # if defined SHARED && IS_IN (libc) -# undef libc_hidden_builtin_def -# define libc_hidden_builtin_def(name) \ - __hidden_ver1 (__strchr_c, __GI_strchr, __strchr_c); +__hidden_ver1 (__strchr_c, __GI_strchr, __strchr_c); # endif # endif -# include #endif From patchwork Wed Jan 11 20:45:49 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Adhemerval Zanella Netto X-Patchwork-Id: 641268 Delivered-To: patch@linaro.org To: libc-alpha@sourceware.org, Noah Goldstein , Richard Henderson Cc: Adhemerval Zanella Netto Subject: [PATCH v7 08/17] string: Improve generic strchrnul Date: Wed, 11 Jan 2023 17:45:49 -0300 Message-Id: <20230111204558.2402155-9-adhemerval.zanella@linaro.org> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20230111204558.2402155-1-adhemerval.zanella@linaro.org> References: <20230111204558.2402155-1-adhemerval.zanella@linaro.org> MIME-Version: 1.0 X-BeenThere:
libc-alpha@sourceware.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Libc-alpha mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-Patchwork-Original-From: Adhemerval Zanella via Libc-alpha From: Adhemerval Zanella Netto Reply-To: Adhemerval Zanella Errors-To: libc-alpha-bounces+patch=linaro.org@sourceware.org Sender: "Libc-alpha" From: Adhemerval Zanella Netto The new algorithm has the following key differences: - Reads the first word unaligned and uses the string-maskoff function to remove unwanted data. This strategy follows the arch-specific optimization used on aarch64 and powerpc. - Uses the string-fz{b,i} functions. Checked on x86_64-linux-gnu, i686-linux-gnu, powerpc64-linux-gnu, and powerpc-linux-gnu by removing the arch-specific assembly implementation and disabling multi-arch (it covers both LE and BE for 64 and 32 bits). Co-authored-by: Richard Henderson --- string/strchrnul.c | 153 +++--------------- .../power4/multiarch/strchrnul-ppc32.c | 4 - sysdeps/s390/strchrnul-c.c | 2 - 3 files changed, 22 insertions(+), 137 deletions(-) diff --git a/string/strchrnul.c b/string/strchrnul.c index fa2db4b417..4a82a48314 100644 --- a/string/strchrnul.c +++ b/string/strchrnul.c @@ -1,10 +1,5 @@ /* Copyright (C) 1991-2023 Free Software Foundation, Inc. This file is part of the GNU C Library. - Based on strlen implementation by Torbjorn Granlund (tege@sics.se), - with help from Dan Sahlin (dan@sics.se) and - bug fix and commentary by Jim Blandy (jimb@ai.mit.edu); - adaptation to strchr suggested by Dick Karpinski (dick@cca.ucsf.edu), - and implemented by Roland McGrath (roland@ai.mit.edu). The GNU C Library is free software; you can redistribute it and/or modify it under the terms of the GNU Lesser General Public @@ -21,146 +16,42 @@ .
*/ #include -#include #include +#include +#include +#include +#include +#include #undef __strchrnul #undef strchrnul -#ifndef STRCHRNUL -# define STRCHRNUL __strchrnul +#ifdef STRCHRNUL +# define __strchrnul STRCHRNUL #endif /* Find the first occurrence of C in S or the final NUL byte. */ char * -STRCHRNUL (const char *s, int c_in) +__strchrnul (const char *str, int c_in) { - const unsigned char *char_ptr; - const unsigned long int *longword_ptr; - unsigned long int longword, magic_bits, charmask; - unsigned char c; - - c = (unsigned char) c_in; - - /* Handle the first few characters by reading one character at a time. - Do this until CHAR_PTR is aligned on a longword boundary. */ - for (char_ptr = (const unsigned char *) s; - ((unsigned long int) char_ptr & (sizeof (longword) - 1)) != 0; - ++char_ptr) - if (*char_ptr == c || *char_ptr == '\0') - return (void *) char_ptr; - - /* All these elucidatory comments refer to 4-byte longwords, - but the theory applies equally well to 8-byte longwords. */ - - longword_ptr = (unsigned long int *) char_ptr; - - /* Bits 31, 24, 16, and 8 of this number are zero. Call these bits - the "holes." Note that there is a hole just to the left of - each byte, with an extra at the end: - - bits: 01111110 11111110 11111110 11111111 - bytes: AAAAAAAA BBBBBBBB CCCCCCCC DDDDDDDD - - The 1-bits make sure that carries propagate to the next 0-bit. - The 0-bits provide holes for carries to fall into. */ - magic_bits = -1; - magic_bits = magic_bits / 0xff * 0xfe << 1 >> 1 | 1; - - /* Set up a longword, each of whose bytes is C. */ - charmask = c | (c << 8); - charmask |= charmask << 16; - if (sizeof (longword) > 4) - /* Do the shift in two steps to avoid a warning if long has 32 bits. */ - charmask |= (charmask << 16) << 16; - if (sizeof (longword) > 8) - abort (); - - /* Instead of the traditional loop which tests each character, - we will test a longword at a time. 
The tricky part is testing - if *any of the four* bytes in the longword in question are zero. */ - for (;;) - { - /* We tentatively exit the loop if adding MAGIC_BITS to - LONGWORD fails to change any of the hole bits of LONGWORD. - - 1) Is this safe? Will it catch all the zero bytes? - Suppose there is a byte with all zeros. Any carry bits - propagating from its left will fall into the hole at its - least significant bit and stop. Since there will be no - carry from its most significant bit, the LSB of the - byte to the left will be unchanged, and the zero will be - detected. + op_t repeated_c = repeat_bytes (c_in); - 2) Is this worthwhile? Will it ignore everything except - zero bytes? Suppose every byte of LONGWORD has a bit set - somewhere. There will be a carry into bit 8. If bit 8 - is set, this will carry into bit 16. If bit 8 is clear, - one of bits 9-15 must be set, so there will be a carry - into bit 16. Similarly, there will be a carry into bit - 24. If one of bits 24-30 is set, there will be a carry - into bit 31, so all of the hole bits will be changed. + uintptr_t s_int = (uintptr_t) str; + const op_t *word_ptr = word_containing (str); - The one misfire occurs when bits 24-30 are clear and bit - 31 is set; in this case, the hole at bit 31 is not - changed. If we had access to the processor carry flag, - we could close this loophole by putting the fourth hole - at bit 32! + op_t word = *word_ptr; - So it ignores everything except 128's, when they're aligned - properly. + find_t mask = shift_find (find_zero_eq_all (word, repeated_c), s_int); + if (mask != 0) + return (char *) str + index_first (mask); - 3) But wait! Aren't we looking for C as well as zero? - Good point. So what we do is XOR LONGWORD with a longword, - each of whose bytes is C. This turns each byte that is C - into a zero. */ + do + word = *++word_ptr; + while (! has_zero_eq (word, repeated_c)); - longword = *longword_ptr++; - - /* Add MAGIC_BITS to LONGWORD. 
*/ - if ((((longword + magic_bits) - - /* Set those bits that were unchanged by the addition. */ - ^ ~longword) - - /* Look at only the hole bits. If any of the hole bits - are unchanged, most likely one of the bytes was a - zero. */ - & ~magic_bits) != 0 - - /* That caught zeroes. Now test for C. */ - || ((((longword ^ charmask) + magic_bits) ^ ~(longword ^ charmask)) - & ~magic_bits) != 0) - { - /* Which of the bytes was C or zero? - If none of them were, it was a misfire; continue the search. */ - - const unsigned char *cp = (const unsigned char *) (longword_ptr - 1); - - if (*cp == c || *cp == '\0') - return (char *) cp; - if (*++cp == c || *cp == '\0') - return (char *) cp; - if (*++cp == c || *cp == '\0') - return (char *) cp; - if (*++cp == c || *cp == '\0') - return (char *) cp; - if (sizeof (longword) > 4) - { - if (*++cp == c || *cp == '\0') - return (char *) cp; - if (*++cp == c || *cp == '\0') - return (char *) cp; - if (*++cp == c || *cp == '\0') - return (char *) cp; - if (*++cp == c || *cp == '\0') - return (char *) cp; - } - } - } - - /* This should never happen. 
*/ - return NULL; + op_t found = index_first_zero_eq (word, repeated_c); + return (char *) word_ptr + found; } - +#ifndef STRCHRNUL weak_alias (__strchrnul, strchrnul) +#endif diff --git a/sysdeps/powerpc/powerpc32/power4/multiarch/strchrnul-ppc32.c b/sysdeps/powerpc/powerpc32/power4/multiarch/strchrnul-ppc32.c index 88ce5dfffa..da03ac7c04 100644 --- a/sysdeps/powerpc/powerpc32/power4/multiarch/strchrnul-ppc32.c +++ b/sysdeps/powerpc/powerpc32/power4/multiarch/strchrnul-ppc32.c @@ -19,10 +19,6 @@ #include #define STRCHRNUL __strchrnul_ppc - -#undef weak_alias -#define weak_alias(a,b ) - extern __typeof (strchrnul) __strchrnul_ppc attribute_hidden; #include diff --git a/sysdeps/s390/strchrnul-c.c b/sysdeps/s390/strchrnul-c.c index e1248d1dbf..ff6aa38d4f 100644 --- a/sysdeps/s390/strchrnul-c.c +++ b/sysdeps/s390/strchrnul-c.c @@ -22,8 +22,6 @@ # if HAVE_STRCHRNUL_IFUNC # define STRCHRNUL STRCHRNUL_C # define __strchrnul STRCHRNUL -# undef weak_alias -# define weak_alias(name, alias) # endif # include From patchwork Wed Jan 11 20:45:50 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Adhemerval Zanella Netto X-Patchwork-Id: 641273 Delivered-To: patch@linaro.org Received: by 2002:a17:522:f3c4:b0:4b4:3859:abed with SMTP id in4csp3458662pvb; Wed, 11 Jan 2023 12:48:44 -0800 (PST) X-Google-Smtp-Source: AMrXdXs9rBvDzZvRTMxTNCMuiLM28npspR8N9jQ6/CIOspYRM+ntHUSAmhe7K7hW2W1UCLynFwje X-Received: by 2002:a50:fb02:0:b0:498:5dfd:6826 with SMTP id d2-20020a50fb02000000b004985dfd6826mr14246439edq.42.1673470124347; Wed, 11 Jan 2023 12:48:44 -0800 (PST) ARC-Seal: i=1; a=rsa-sha256; t=1673470124; cv=none; d=google.com; s=arc-20160816; b=boxdjkx6qT+LV7HR0blB/9rcTLs2XaqDS5GuCmLh9GKy2x8ze6Mt6MJKd9eU6s66F1 chT1Ih0TjT9l6kmYXopcA4n0quuNE+ex5JvcUtVo8gZQldu45+HR4qO4YIl2oOZkRTh7 ltp5822VU0QjxwUCb9nPSFhroBwJe2iXmqDhY/9zDUsiZ2mqI/nxWVkbnSVuwU+IBoJw usTcnHF4JPxAa14cuLpOxE0TklYp8HxvOqNVi72D8/TmQz+YQx9NZgs65EP52gGYxbBq 
S08JKePE0QEoApBn3aYq9OBlL8DIhQxG0jw+33q5ZXTIs4fJq/fwWBISxmMXJsnngSk8 rw6w== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=sender:errors-to:reply-to:from:list-subscribe:list-help:list-post :list-archive:list-unsubscribe:list-id:precedence :content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:dmarc-filter:delivered-to :dkim-signature:dkim-filter; bh=kaHoGHUD+J1FsTiUTsC2T3sn6pJ9jHe9pe1Jcc6cGuk=; b=oQXq+AAVgRJ4CaVDCVsiRy76spIjlH8IlqIh8Mvu1zjwJBWUdgNvW74vezqZPLbiDK gOQwUfAqju6JxJsq7UnDZJUceb0LdmiHMTgWDTjYAlauESMkmSxwyaeAbfCb34BZfFJq v4aNEI3/Vd2mzoFABsNVfbRQQ5+vA449WmLlbxvlnFmEUV7SD2fiSsIzlseFj9DxqYV4 acfwmXz31u83eUcj/iutUp83RkUiSNn3CSzZLyaw2e/aTmw7SvltS96NJtAG9qLvhqGK Y0WkfuSBwmQEkE21vBHbnkEAN4SUCEgtY58+vsIsUePtyPZmnPBwoZJH2HgYN2y0kcc6 8oyg== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@sourceware.org header.s=default header.b="iiDPH/zn"; spf=pass (google.com: domain of libc-alpha-bounces+patch=linaro.org@sourceware.org designates 2620:52:3:1:0:246e:9693:128c as permitted sender) smtp.mailfrom="libc-alpha-bounces+patch=linaro.org@sourceware.org"; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=sourceware.org Return-Path: Received: from sourceware.org (server2.sourceware.org. 
[2620:52:3:1:0:246e:9693:128c]) by mx.google.com with ESMTPS id l24-20020aa7cad8000000b0048772a84504si3778644edt.591.2023.01.11.12.48.44 for (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 11 Jan 2023 12:48:44 -0800 (PST) Received-SPF: pass (google.com: domain of libc-alpha-bounces+patch=linaro.org@sourceware.org designates 2620:52:3:1:0:246e:9693:128c as permitted sender) client-ip=2620:52:3:1:0:246e:9693:128c; Authentication-Results: mx.google.com; dkim=pass header.i=@sourceware.org header.s=default header.b="iiDPH/zn"; spf=pass (google.com: domain of libc-alpha-bounces+patch=linaro.org@sourceware.org designates 2620:52:3:1:0:246e:9693:128c as permitted sender) smtp.mailfrom="libc-alpha-bounces+patch=linaro.org@sourceware.org"; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=sourceware.org Received: from server2.sourceware.org (localhost [IPv6:::1]) by sourceware.org (Postfix) with ESMTP id EBF663832346 for ; Wed, 11 Jan 2023 20:48:42 +0000 (GMT) DKIM-Filter: OpenDKIM Filter v2.11.0 sourceware.org EBF663832346 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=sourceware.org; s=default; t=1673470123; bh=kaHoGHUD+J1FsTiUTsC2T3sn6pJ9jHe9pe1Jcc6cGuk=; h=To:Cc:Subject:Date:In-Reply-To:References:List-Id: List-Unsubscribe:List-Archive:List-Post:List-Help:List-Subscribe: From:Reply-To:From; b=iiDPH/zndwKLYLn0va6yocBFdzNdn9CAvoGtN3OI+qgkev1Y0tRNam6WYWe+rgLmX oHOAkk32P5aUSvL0J0fjCloCnf/0BJJahOFiaY19lelhwFdK8/4WbalM/yruqgdqcc CHLUOEHzwhprvcj3x/V/Jnxl6QozwxQggDIvKC/U= X-Original-To: libc-alpha@sourceware.org Delivered-To: libc-alpha@sourceware.org Received: from mail-oa1-x30.google.com (mail-oa1-x30.google.com [IPv6:2001:4860:4864:20::30]) by sourceware.org (Postfix) with ESMTPS id B885C385B517 for ; Wed, 11 Jan 2023 20:46:24 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.2 sourceware.org B885C385B517 Received: by mail-oa1-x30.google.com with SMTP id 586e51a60fabf-1442977d77dso16826831fac.6 for ; Wed, 11 Jan 2023 12:46:24 -0800 (PST) 
X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=kaHoGHUD+J1FsTiUTsC2T3sn6pJ9jHe9pe1Jcc6cGuk=; b=Cs4jPXcrhaXOHCzirHfumHLU7petew3rQQqjWcvbjuFQ/gpEKayGgqFGJDPYjCt+m+ 9iSKt8s6EXYvFSvIwp4CvRRwXtGN+SYxWfnkJ6NXawJgWJgFpgCr/OViz0/y+VvGRCyA s/ilENsB88gV9+ICzTQS44KEs1t/QoGPP+URNFgRtgEm8do3ffm/wEiU9olSTYEITy/p J2lXbWcFyxpO4ejUtTtRo/YsVGzSmOlq6sFuWftd7TSwZZy50KtB/e4HpgYTPn1N5YKP 4Q8PHHHWIDqpaRn2UY3JGj3LA0VsubzXLYRrGYn7bdEZ9MOOyq41gXBHr3rePOncCylB suOg== X-Gm-Message-State: AFqh2kovKyV+Wp8HLS+VNmooSZNOEoPNxXNDAE/hfvzvGD7NIwEubGbV duhewylx5KRjgzXAMEfF6JDY0LD37oLZF0BhUiM= X-Received: by 2002:a05:6871:609:b0:15e:af95:d930 with SMTP id w9-20020a056871060900b0015eaf95d930mr1243110oan.37.1673469983402; Wed, 11 Jan 2023 12:46:23 -0800 (PST) Received: from mandiga.. ([2804:1b3:a7c0:a93a:a504:f3f6:dd7b:801]) by smtp.gmail.com with ESMTPSA id kw18-20020a056870ac1200b0014c8b5d54b2sm7990274oab.20.2023.01.11.12.46.21 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 11 Jan 2023 12:46:22 -0800 (PST) To: libc-alpha@sourceware.org, Noah Goldstein , Richard Henderson Cc: Adhemerval Zanella Netto Subject: [PATCH v7 09/17] string: Improve generic strcmp Date: Wed, 11 Jan 2023 17:45:50 -0300 Message-Id: <20230111204558.2402155-10-adhemerval.zanella@linaro.org> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20230111204558.2402155-1-adhemerval.zanella@linaro.org> References: <20230111204558.2402155-1-adhemerval.zanella@linaro.org> MIME-Version: 1.0 X-Spam-Status: No, score=-13.1 required=5.0 tests=BAYES_00, DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, DKIM_VALID_EF, GIT_PATCH_0, KAM_SHORT, RCVD_IN_DNSWL_NONE, SPF_HELO_NONE, SPF_PASS, TXREP autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org X-BeenThere: 
libc-alpha@sourceware.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Libc-alpha mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-Patchwork-Original-From: Adhemerval Zanella via Libc-alpha From: Adhemerval Zanella Netto Reply-To: Adhemerval Zanella Errors-To: libc-alpha-bounces+patch=linaro.org@sourceware.org Sender: "Libc-alpha"

From: Adhemerval Zanella Netto

The new generic implementation tries to use word operations along with
the new string-fz{b,i} functions even for inputs with different
alignments (it still uses aligned accesses plus a merge operation to
get a correct word-by-word comparison).

Checked on x86_64-linux-gnu, i686-linux-gnu, powerpc64-linux-gnu,
and powerpc-linux-gnu by removing the arch-specific assembly
implementation and disabling multi-arch (it covers both LE and BE
for 64 and 32 bits).

Co-authored-by: Richard Henderson
---
 string/strcmp.c | 119 +++++++++++++++++++++++++++++++++++++++++-------
 1 file changed, 103 insertions(+), 16 deletions(-)

diff --git a/string/strcmp.c b/string/strcmp.c
index 053f5a8d2b..fafd967567 100644
--- a/string/strcmp.c
+++ b/string/strcmp.c
@@ -15,33 +15,120 @@
    License along with the GNU C Library; if not, see
    .  */

+#include
+#include
+#include
+#include
 #include
+#include

-#undef strcmp
-
-#ifndef STRCMP
-# define STRCMP strcmp
+#ifdef STRCMP
+# define strcmp STRCMP
 #endif

+static inline int
+final_cmp (const op_t w1, const op_t w2)
+{
+  /* It can not use index_first_zero_ne because it must not compare past the
+     final '\0' if one is present (and final_cmp is called before the
+     has_zero check).  */
+  for (size_t i = 0; i < sizeof (op_t); i++)
+    {
+      unsigned char c1 = extractbyte (w1, i);
+      unsigned char c2 = extractbyte (w2, i);
+      if (c1 == '\0' || c1 != c2)
+        return c1 - c2;
+    }
+  return 0;
+}
+
+/* Aligned loop: if a difference is found, exit to compare the bytes.  Else
+   if a zero is found we have equal strings.  */
+static inline int
+strcmp_aligned_loop (const op_t *x1, const op_t *x2, op_t w1)
+{
+  op_t w2 = *x2++;
+
+  while (w1 == w2)
+    {
+      if (has_zero (w1))
+        return 0;
+      w1 = *x1++;
+      w2 = *x2++;
+    }
+
+  return final_cmp (w1, w2);
+}
+
+/* Unaligned loop: align the first partial of P2, with 0xff for the rest of
+   the bytes so that we can also apply the has_zero test to see if we have
+   already reached EOS.  If we have, then we can simply fall through to the
+   final comparison.  */
+static inline int
+strcmp_unaligned_loop (const op_t *x1, const op_t *x2, op_t w1, uintptr_t ofs)
+{
+  op_t w2a = *x2++;
+  uintptr_t sh_1 = ofs * CHAR_BIT;
+  uintptr_t sh_2 = sizeof(op_t) * CHAR_BIT - sh_1;
+
+  op_t w2 = MERGE (w2a, sh_1, (op_t)-1, sh_2);
+  if (!has_zero (w2))
+    {
+      op_t w2b;
+
+      /* Unaligned loop.  The invariant is that W2B, which is "ahead" of W1,
+         does not contain end-of-string.  Therefore it is safe (and necessary)
+         to read another word from each while we do not have a difference.  */
+      while (1)
+        {
+          w2b = *x2++;
+          w2 = MERGE (w2a, sh_1, w2b, sh_2);
+          if (w1 != w2)
+            return final_cmp (w1, w2);
+          if (has_zero (w2b))
+            break;
+          w1 = *x1++;
+          w2a = w2b;
+        }
+
+      /* Zero found in the second partial of P2.  If we had EOS in the aligned
+         word, we have equality.  */
+      if (has_zero (w1))
+        return 0;
+
+      /* Load the final word of P1 and align the final partial of P2.  */
+      w1 = *x1++;
+      w2 = MERGE (w2b, sh_1, 0, sh_2);
+    }
+
+  return final_cmp (w1, w2);
+}
+
 /* Compare S1 and S2, returning less than, equal to or
    greater than zero if S1 is lexicographically less than,
    equal to or greater than S2.  */
 int
-STRCMP (const char *p1, const char *p2)
+strcmp (const char *p1, const char *p2)
 {
-  const unsigned char *s1 = (const unsigned char *) p1;
-  const unsigned char *s2 = (const unsigned char *) p2;
-  unsigned char c1, c2;
-
-  do
+  /* Handle the unaligned bytes of p1 first.  */
+  uintptr_t n = -(uintptr_t)p1 % sizeof(op_t);
+  for (int i = 0; i < n; ++i)
     {
-      c1 = (unsigned char) *s1++;
-      c2 = (unsigned char) *s2++;
-      if (c1 == '\0')
-        return c1 - c2;
+      unsigned char c1 = *p1++;
+      unsigned char c2 = *p2++;
+      int diff = c1 - c2;
+      if (c1 == '\0' || diff != 0)
+        return diff;
     }
-  while (c1 == c2);

-  return c1 - c2;
+  /* P1 is now aligned to unsigned long.  P2 may or may not be.  */
+  const op_t *x1 = (const op_t *) p1;
+  op_t w1 = *x1++;
+  uintptr_t ofs = (uintptr_t) p2 % sizeof(op_t);
+  return ofs == 0
+    ? strcmp_aligned_loop (x1, (const op_t *)p2, w1)
+    : strcmp_unaligned_loop (x1, (const op_t *)(p2 - ofs), w1, ofs);
 }
+#ifndef STRCMP
 libc_hidden_builtin_def (strcmp)
+#endif

From patchwork Wed Jan 11 20:45:51 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Adhemerval Zanella Netto X-Patchwork-Id: 641267 Delivered-To: patch@linaro.org Received: by 2002:a17:522:f3c4:b0:4b4:3859:abed with SMTP id in4csp3458263pvb; Wed, 11 Jan 2023 12:47:41 -0800 (PST) X-Google-Smtp-Source: AMrXdXu0RM7cLXG+Htf0sZkmUq2/rK+HUg8ZxvmTpWopNATecH6R4ln1oj/9nm7miOrfqWa3OsFS X-Received: by 2002:a17:907:a603:b0:859:1d78:765 with SMTP id vt3-20020a170907a60300b008591d780765mr6224282ejc.11.1673470060679; Wed, 11 Jan 2023 12:47:40 -0800 (PST) ARC-Seal: i=1; a=rsa-sha256; t=1673470060; cv=none; d=google.com; s=arc-20160816; b=M3PDc0G59lyIoNLizOD5192LKhnxv4wh/8oAwc8DGusNisVRq5LCodV77qBAge1g34 JBCaprZ9it7KMkTa+/TkXCDkuujAtPmTrWqgMg4mNKlC2IB/Fp4ohl2Xg2O25D6iVvvO iiZmeG78bDxETu54kTC/YMtRj7QdPV15fzq5tAAh1h5lgVGmkUCDuwcCG0XbAGOGBZKD n69K1Gujz3yYwgUMEjQH9JI6XLuj7UyeGoRDa6PpfWBToF2Lkxewtw3jctxrz1SoUb+5 rUK50adeQDJquWJj1uP9FQ9goybvrYtJ3uXhy0b6mn+8L3Bop6rYwGiS3cct7KR51J+u 0Bqg== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=sender:errors-to:reply-to:from:list-subscribe:list-help:list-post :list-archive:list-unsubscribe:list-id:precedence
:content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:dmarc-filter:delivered-to :dkim-signature:dkim-filter; bh=gUIw/EMim5UVtbqvtb/jUTtqQzqUqrL8CgehT2lyTU8=; b=gpx6gUnA9XtByKKRTLFvTlWkkDVhKd61U5kTAG3fZ3xAJIQhFNw+6/Vtg8N5SGbLYy dNR9BRBbo/tnJsEuU1QPMxY+3fNwN/Z5Xmo8LnMAVvChpdQnKVRFsdZAl5trI7vzufVG Q2YOB8ofZqNw3RW99bOCFNiF/OjvPoS7YuSkX87wrTQc1vhbgMJGFfevPtj/JOdz7Zhu mSBXEUZc/EoGkPHWpsHvCLiA6n6u7AkM+BsJiT9C0gosOQORaTHI0Md16WmXDky/of48 dZJgghUEJesulF2Kj3LNenKamCxe0jVY6lpEsmO7WKb7ErgWb5UCLBGQUWliiQFZf0OA B3qA== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@sourceware.org header.s=default header.b=aKIfK6Ju; spf=pass (google.com: domain of libc-alpha-bounces+patch=linaro.org@sourceware.org designates 8.43.85.97 as permitted sender) smtp.mailfrom="libc-alpha-bounces+patch=linaro.org@sourceware.org"; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=sourceware.org Return-Path: Received: from sourceware.org (ip-8-43-85-97.sourceware.org. [8.43.85.97]) by mx.google.com with ESMTPS id oz19-20020a1709077d9300b0085738d104fcsi6417743ejc.213.2023.01.11.12.47.40 for (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 11 Jan 2023 12:47:40 -0800 (PST) Received-SPF: pass (google.com: domain of libc-alpha-bounces+patch=linaro.org@sourceware.org designates 8.43.85.97 as permitted sender) client-ip=8.43.85.97; Authentication-Results: mx.google.com; dkim=pass header.i=@sourceware.org header.s=default header.b=aKIfK6Ju; spf=pass (google.com: domain of libc-alpha-bounces+patch=linaro.org@sourceware.org designates 8.43.85.97 as permitted sender) smtp.mailfrom="libc-alpha-bounces+patch=linaro.org@sourceware.org"; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=sourceware.org Received: from server2.sourceware.org (localhost [IPv6:::1]) by sourceware.org (Postfix) with ESMTP id 7654D383234F for ; Wed, 11 Jan 2023 20:47:39 +0000 (GMT) DKIM-Filter: OpenDKIM Filter v2.11.0 sourceware.org 7654D383234F DKIM-Signature: v=1; 
a=rsa-sha256; c=relaxed/relaxed; d=sourceware.org; s=default; t=1673470059; bh=gUIw/EMim5UVtbqvtb/jUTtqQzqUqrL8CgehT2lyTU8=; h=To:Cc:Subject:Date:In-Reply-To:References:List-Id: List-Unsubscribe:List-Archive:List-Post:List-Help:List-Subscribe: From:Reply-To:From; b=aKIfK6JulzQWGHhf40ICsbWD3EERB0enIEaClIZSutUfmTwaUE7QFSDgUzpT/WLxy PYoj71W91JJG8mN7i5km4YuR1Jm3sV0B4eRFBspQdWBZIz7UdRrPT6eZ6jzKwC3kTw F3TtjRy1QLsccXaW2JZMw4PJEiWWLn9Ds7uwgfCI= X-Original-To: libc-alpha@sourceware.org Delivered-To: libc-alpha@sourceware.org Received: from mail-oa1-x2f.google.com (mail-oa1-x2f.google.com [IPv6:2001:4860:4864:20::2f]) by sourceware.org (Postfix) with ESMTPS id 73A4A38582A1 for ; Wed, 11 Jan 2023 20:46:26 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.2 sourceware.org 73A4A38582A1 Received: by mail-oa1-x2f.google.com with SMTP id 586e51a60fabf-15085b8a2f7so16839833fac.2 for ; Wed, 11 Jan 2023 12:46:26 -0800 (PST) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=gUIw/EMim5UVtbqvtb/jUTtqQzqUqrL8CgehT2lyTU8=; b=iktQD+7ix+dcN29dZ33N1S4g+lx2rR91qTn1WHJ8Bhhw6i6/gk8VuKM9FbUPNlN0nc Ij0m/qYw2q/SeE6ayuPLqr2RU6BL57rK1SIOlm+4AFcVLgxwB75eqDaQ+GORFCPV51e6 ny1EafmgV4cfDc1QYrUj796Y0SbiO2Emp2XWs+rCQLUV41nQW1a+RNz5FGrmfvM7HP6I dnqAZc9mTewW2ZDwAuehV3BiNS99FTwfH2nH3tx6xN7cpCx8Wv3sv4Bs4bUJPAc/6kEU R0nvu+rwU9PxsgZfALhTVYQV+0j4ACUP0U4kB0QQ9hNVlmVKiU0mhsDTNgYQ8QfsKt/s lolw== X-Gm-Message-State: AFqh2koXcDNnkm6o8O0bwvA3ipe/Jdg4mX6S8X07DAaVAPSunVkGdYuu U1K/LKmpP8OZKLYTOAvkAak8+JPs/iI8UNaS4qQ= X-Received: by 2002:a05:6870:e886:b0:144:ae9f:1b68 with SMTP id q6-20020a056870e88600b00144ae9f1b68mr41959566oan.38.1673469985515; Wed, 11 Jan 2023 12:46:25 -0800 (PST) Received: from mandiga.. 
([2804:1b3:a7c0:a93a:a504:f3f6:dd7b:801]) by smtp.gmail.com with ESMTPSA id kw18-20020a056870ac1200b0014c8b5d54b2sm7990274oab.20.2023.01.11.12.46.23 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 11 Jan 2023 12:46:24 -0800 (PST) To: libc-alpha@sourceware.org, Noah Goldstein , Richard Henderson Cc: Adhemerval Zanella Netto Subject: [PATCH v7 10/17] string: Improve generic memchr Date: Wed, 11 Jan 2023 17:45:51 -0300 Message-Id: <20230111204558.2402155-11-adhemerval.zanella@linaro.org> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20230111204558.2402155-1-adhemerval.zanella@linaro.org> References: <20230111204558.2402155-1-adhemerval.zanella@linaro.org> MIME-Version: 1.0 X-Spam-Status: No, score=-13.0 required=5.0 tests=BAYES_00, DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, DKIM_VALID_EF, GIT_PATCH_0, KAM_SHORT, RCVD_IN_DNSWL_NONE, SPF_HELO_NONE, SPF_PASS, TXREP autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org X-BeenThere: libc-alpha@sourceware.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Libc-alpha mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-Patchwork-Original-From: Adhemerval Zanella via Libc-alpha From: Adhemerval Zanella Netto Reply-To: Adhemerval Zanella Errors-To: libc-alpha-bounces+patch=linaro.org@sourceware.org Sender: "Libc-alpha"

From: Adhemerval Zanella Netto

The new algorithm has the following key differences:

  - Reads the first word unaligned and uses the string-maskoff function
    to remove unwanted data.  This strategy follows the arch-specific
    optimization used on aarch64 and powerpc.

  - Uses the string-fz{b,i} and string-opthr functions.

Checked on x86_64-linux-gnu, i686-linux-gnu, powerpc-linux-gnu,
and powerpc64-linux-gnu by removing the arch-specific assembly
implementation and disabling multi-arch (it covers both LE and BE
for 64 and 32 bits).

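For readers unfamiliar with the word-at-a-time idea, here is a minimal
standalone sketch.  It is not the code from this patch: the names
word_t, sketch_repeat_byte, sketch_has_zero and sketch_memchr are
hypothetical, the word is loaded with memcpy instead of the aligned
first-word read plus masking described above, and the match position
is found with a plain byte loop so the sketch behaves the same on LE
and BE.  It only illustrates testing a whole word for a byte equal to C.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical word type for the sketch; the patch uses glibc's op_t.  */
typedef uint64_t word_t;

/* Build a word whose every byte is C (0x0101...01 * C).  */
static word_t
sketch_repeat_byte (unsigned char c)
{
  return (word_t) c * (((word_t) -1) / 0xff);
}

/* Return nonzero iff at least one byte of X is zero.  The test is
   exact for "is there any zero byte", which is all the loop needs.  */
static int
sketch_has_zero (word_t x)
{
  const word_t ones = ((word_t) -1) / 0xff;   /* 0x01 in every byte.  */
  const word_t highs = ones << 7;             /* 0x80 in every byte.  */
  return ((x - ones) & ~x & highs) != 0;
}

/* Search no more than N bytes of S for C, one word at a time.  */
void *
sketch_memchr (const void *s, int c_in, size_t n)
{
  const unsigned char *p = s;
  unsigned char c = (unsigned char) c_in;
  word_t repeated_c = sketch_repeat_byte (c);

  while (n >= sizeof (word_t))
    {
      word_t w;
      memcpy (&w, p, sizeof w);        /* Alignment-safe load.  */
      /* XOR turns every byte equal to C into zero.  */
      if (sketch_has_zero (w ^ repeated_c))
        break;                         /* A match is inside this word.  */
      p += sizeof w;
      n -= sizeof w;
    }

  /* Locate the exact byte, or handle the short tail.  */
  for (; n > 0; --n, ++p)
    if (*p == c)
      return (void *) p;
  return NULL;
}
```

The sketch never touches memory outside [S, S + N), at the cost of a
byte loop once a candidate word is found; the patch instead computes
the index of the match directly from the word.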
Co-authored-by: Richard Henderson
---
 string/memchr.c                               | 178 ++++++------------
 .../powerpc32/power4/multiarch/memchr-ppc32.c |  14 +-
 .../powerpc64/multiarch/memchr-ppc64.c        |   9 +-
 3 files changed, 58 insertions(+), 143 deletions(-)

diff --git a/string/memchr.c b/string/memchr.c
index f800d47dce..63ae7b932d 100644
--- a/string/memchr.c
+++ b/string/memchr.c
@@ -1,10 +1,6 @@
-/* Copyright (C) 1991-2023 Free Software Foundation, Inc.
+/* Scan memory for a character.  Generic version
+   Copyright (C) 1991-2023 Free Software Foundation, Inc.
    This file is part of the GNU C Library.
-   Based on strlen implementation by Torbjorn Granlund (tege@sics.se),
-   with help from Dan Sahlin (dan@sics.se) and
-   commentary by Jim Blandy (jimb@ai.mit.edu);
-   adaptation to memchr suggested by Dick Karpinski (dick@cca.ucsf.edu),
-   and implemented by Roland McGrath (roland@ai.mit.edu).

    The GNU C Library is free software; you can redistribute it and/or
    modify it under the terms of the GNU Lesser General Public
@@ -20,143 +16,75 @@
    License along with the GNU C Library; if not, see
    .  */

-#ifndef _LIBC
-# include
-#endif
-
+#include
+#include
+#include
+#include
+#include
+#include
 #include
-#include
+#undef memchr

-#include
-
-#undef __memchr
-#ifdef _LIBC
-# undef memchr
+#ifdef MEMCHR
+# define __memchr MEMCHR
 #endif

-#ifndef weak_alias
-# define __memchr memchr
-#endif
-
-#ifndef MEMCHR
-# define MEMCHR __memchr
-#endif
+static inline const char *
+sadd (uintptr_t x, uintptr_t y)
+{
+  uintptr_t ret = INT_ADD_OVERFLOW (x, y) ? (uintptr_t)-1 : x + y;
+  return (const char *)ret;
+}

 /* Search no more than N bytes of S for C.  */
 void *
-MEMCHR (void const *s, int c_in, size_t n)
+__memchr (void const *s, int c_in, size_t n)
 {
-  /* On 32-bit hardware, choosing longword to be a 32-bit unsigned
-     long instead of a 64-bit uintmax_t tends to give better
-     performance.  On 64-bit hardware, unsigned long is generally 64
-     bits already.  Change this typedef to experiment with
-     performance.  */
-  typedef unsigned long int longword;
-
-  const unsigned char *char_ptr;
-  const longword *longword_ptr;
-  longword repeated_one;
-  longword repeated_c;
-  unsigned char c;
-
-  c = (unsigned char) c_in;
-
-  /* Handle the first few bytes by reading one byte at a time.
-     Do this until CHAR_PTR is aligned on a longword boundary.  */
-  for (char_ptr = (const unsigned char *) s;
-       n > 0 && (size_t) char_ptr % sizeof (longword) != 0;
-       --n, ++char_ptr)
-    if (*char_ptr == c)
-      return (void *) char_ptr;
-
-  longword_ptr = (const longword *) char_ptr;
-
-  /* All these elucidatory comments refer to 4-byte longwords,
-     but the theory applies equally well to any size longwords.  */
-
-  /* Compute auxiliary longword values:
-     repeated_one is a value which has a 1 in every byte.
-     repeated_c has c in every byte.  */
-  repeated_one = 0x01010101;
-  repeated_c = c | (c << 8);
-  repeated_c |= repeated_c << 16;
-  if (0xffffffffU < (longword) -1)
+  if (__glibc_unlikely (n == 0))
+    return NULL;
+
+  /* Read the first word, but munge it so that bytes before the array
+     will not match goal.  */
+  const op_t *word_ptr = word_containing (s);
+  uintptr_t s_int = (uintptr_t) s;
+
+  op_t word = *word_ptr;
+  op_t repeated_c = repeat_bytes (c_in);
+  /* Compute the address of the last byte taking in consideration possible
+     overflow.  */
+  const char *lbyte = sadd (s_int, n - 1);
+  /* And also the address of the word containing the last byte.  */
+  const op_t *lword = word_containing (lbyte);
+
+  find_t mask = shift_find (find_eq_all (word, repeated_c), s_int);
+  if (mask != 0)
     {
-      repeated_one |= repeated_one << 31 << 1;
-      repeated_c |= repeated_c << 31 << 1;
-      if (8 < sizeof (longword))
-        {
-          size_t i;
-
-          for (i = 64; i < sizeof (longword) * 8; i *= 2)
-            {
-              repeated_one |= repeated_one << i;
-              repeated_c |= repeated_c << i;
-            }
-        }
+      char *ret = (char *) s + index_first (mask);
+      return (ret <= lbyte) ? ret : NULL;
     }
+  if (word_ptr == lword)
+    return NULL;

-  /* Instead of the traditional loop which tests each byte, we will test a
-     longword at a time.  The tricky part is testing if *any of the four*
-     bytes in the longword in question are equal to c.  We first use an xor
-     with repeated_c.  This reduces the task to testing whether *any of the
-     four* bytes in longword1 is zero.
-
-     We compute tmp =
-       ((longword1 - repeated_one) & ~longword1) & (repeated_one << 7).
-     That is, we perform the following operations:
-       1. Subtract repeated_one.
-       2. & ~longword1.
-       3. & a mask consisting of 0x80 in every byte.
-     Consider what happens in each byte:
-       - If a byte of longword1 is zero, step 1 and 2 transform it into 0xff,
-         and step 3 transforms it into 0x80.  A carry can also be propagated
-         to more significant bytes.
-       - If a byte of longword1 is nonzero, let its lowest 1 bit be at
-         position k (0 <= k <= 7); so the lowest k bits are 0.  After step 1,
-         the byte ends in a single bit of value 0 and k bits of value 1.
-         After step 2, the result is just k bits of value 1: 2^k - 1.  After
-         step 3, the result is 0.  And no carry is produced.
-     So, if longword1 has only non-zero bytes, tmp is zero.
-     Whereas if longword1 has a zero byte, call j the position of the least
-     significant zero byte.  Then the result has a zero at positions 0, ...,
-     j-1 and a 0x80 at position j.  We cannot predict the result at the more
-     significant bytes (positions j+1..3), but it does not matter since we
-     already have a non-zero bit at position 8*j+7.
-
-     So, the test whether any byte in longword1 is zero is equivalent to
-     testing whether tmp is nonzero.  */
-
-  while (n >= sizeof (longword))
+  word = *++word_ptr;
+  while (word_ptr != lword)
     {
-      longword longword1 = *longword_ptr ^ repeated_c;
-
-      if ((((longword1 - repeated_one) & ~longword1)
-           & (repeated_one << 7)) != 0)
-        break;
-      longword_ptr++;
-      n -= sizeof (longword);
+      if (has_eq (word, repeated_c))
+        return (char *) word_ptr + index_first_eq (word, repeated_c);
+      word = *++word_ptr;
     }

-  char_ptr = (const unsigned char *) longword_ptr;
-
-  /* At this point, we know that either n < sizeof (longword), or one of the
-     sizeof (longword) bytes starting at char_ptr is == c.  On little-endian
-     machines, we could determine the first such byte without any further
-     memory accesses, just by looking at the tmp result from the last loop
-     iteration.  But this does not work on big-endian machines.  Choose code
-     that works in both cases.  */
-
-  for (; n > 0; --n, ++char_ptr)
+  if (has_eq (word, repeated_c))
     {
-      if (*char_ptr == c)
-        return (void *) char_ptr;
+      /* We found a match, but it might be in a byte past the end of the
+         array.  */
+      char *ret = (char *) word_ptr + index_first_eq (word, repeated_c);
+      if (ret <= lbyte)
+        return ret;
     }
-
   return NULL;
 }
-#ifdef weak_alias
+#ifndef MEMCHR
 weak_alias (__memchr, memchr)
-#endif
 libc_hidden_builtin_def (memchr)
+#endif
diff --git a/sysdeps/powerpc/powerpc32/power4/multiarch/memchr-ppc32.c b/sysdeps/powerpc/powerpc32/power4/multiarch/memchr-ppc32.c
index 39ff84f3f3..a78585650f 100644
--- a/sysdeps/powerpc/powerpc32/power4/multiarch/memchr-ppc32.c
+++ b/sysdeps/powerpc/powerpc32/power4/multiarch/memchr-ppc32.c
@@ -18,17 +18,11 @@

 #include

-#define MEMCHR __memchr_ppc
+extern __typeof (memchr) __memchr_ppc attribute_hidden;

-#undef weak_alias
-#define weak_alias(a, b)
+#define MEMCHR __memchr_ppc
+#include

 #ifdef SHARED
-# undef libc_hidden_builtin_def
-# define libc_hidden_builtin_def(name) \
-  __hidden_ver1(__memchr_ppc, __GI_memchr, __memchr_ppc);
+__hidden_ver1(__memchr_ppc, __GI_memchr, __memchr_ppc);
 #endif
-
-extern __typeof (memchr) __memchr_ppc attribute_hidden;
-
-#include
diff --git a/sysdeps/powerpc/powerpc64/multiarch/memchr-ppc64.c b/sysdeps/powerpc/powerpc64/multiarch/memchr-ppc64.c
index 8097df709c..49ba5521fe 100644
--- a/sysdeps/powerpc/powerpc64/multiarch/memchr-ppc64.c
+++ b/sysdeps/powerpc/powerpc64/multiarch/memchr-ppc64.c
@@ -18,14 +18,7 @@

 #include

-#define MEMCHR __memchr_ppc
-
-#undef weak_alias
-#define weak_alias(a, b)
-
-# undef libc_hidden_builtin_def
-# define libc_hidden_builtin_def(name)
-
 extern __typeof (memchr) __memchr_ppc attribute_hidden;

+#define MEMCHR __memchr_ppc
 #include

From patchwork Wed Jan 11 20:45:52 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Adhemerval Zanella Netto X-Patchwork-Id: 641272 Delivered-To: patch@linaro.org Received: by 2002:a17:522:f3c4:b0:4b4:3859:abed with SMTP id in4csp3458624pvb; Wed, 11 Jan 2023 12:48:37 -0800 (PST) X-Google-Smtp-Source:
AMrXdXvWE+tkmX0Dqb0lt9T47Y3Lgf4Yc9lLcK0F5zFsHLBKqdAMP6xnrBGo79hL7GisbHHlda4/ X-Received: by 2002:a05:6402:528f:b0:47e:eaae:9a69 with SMTP id en15-20020a056402528f00b0047eeaae9a69mr65396608edb.41.1673470116987; Wed, 11 Jan 2023 12:48:36 -0800 (PST) ARC-Seal: i=1; a=rsa-sha256; t=1673470116; cv=none; d=google.com; s=arc-20160816; b=rDYGWyaRbPRN/EdPMBMr9a/QNHd5uTiSitrqOthfxyckIbrOPFTCIb1D8b6MGmLMMn oP8KM1XMdPD1+ybcqB2fDxix5IYZ6ulisNXYmcZuq6XHuZv1KSlTZrYk8rG5Wqg8NDBl T4CU8rBWe1DgKPWsWznvAEzLbBg/Z6NE0j7XHPK5zy8ZKwKyO2247xJcp/njVhZdkqlO Is4L25RM0kCkjMcpgeDpuytCTgFF9Lz3um7uVcB/cSpy2KFFuDLurUKDm+jnv8r/Iy3z Mmj+3jDs9hiYj7qjQsjR8ge5b8y8EfeYXvyEmJKrAI5nX9tOBvKpB5wL3A8kq7JolZmV k5Sg== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=sender:errors-to:reply-to:from:list-subscribe:list-help:list-post :list-archive:list-unsubscribe:list-id:precedence :content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:dmarc-filter:delivered-to :dkim-signature:dkim-filter; bh=Dr/IK8aLL1j5e/18nce0vz2jgh/9rUL7YWKtVjzXE14=; b=vpV3FLuVYAdWY8uwY7Xs+bUJhddJwCrMOKC7bksiG3hDIeK1RPQncSCwFq++CX5epV dGtE+J5ofxni7tBD+SUtV2orIZ/SQEgaXjvwgN3bfu/tcpAhqEefwaC6G35N421MiSng kM2zKUf3BxSJ4QL03HniWGJzfMbmPee47sXjAXe53n+Z5BV5/Tp/DVUexTjcelMP5rv4 /MDw2lOfkGet07qOMpIspMv/1NZ0C8vPIvvazsZj3cj9avvrGS+y2I2IBTkYTXwv9zbX H5Th6lZjpqJESvW1hlHl8sFZIT016SQActkKuIO+aOnLHuF/Y3TVe//FsrsNT9aXSw+Y HcKQ== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@sourceware.org header.s=default header.b=DNuVlGyS; spf=pass (google.com: domain of libc-alpha-bounces+patch=linaro.org@sourceware.org designates 2620:52:3:1:0:246e:9693:128c as permitted sender) smtp.mailfrom="libc-alpha-bounces+patch=linaro.org@sourceware.org"; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=sourceware.org Return-Path: Received: from sourceware.org (server2.sourceware.org. 
[2620:52:3:1:0:246e:9693:128c]) by mx.google.com with ESMTPS id i2-20020a0564020f0200b00499bebe0ff8si7359785eda.622.2023.01.11.12.48.36 for (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 11 Jan 2023 12:48:36 -0800 (PST) Received-SPF: pass (google.com: domain of libc-alpha-bounces+patch=linaro.org@sourceware.org designates 2620:52:3:1:0:246e:9693:128c as permitted sender) client-ip=2620:52:3:1:0:246e:9693:128c; Authentication-Results: mx.google.com; dkim=pass header.i=@sourceware.org header.s=default header.b=DNuVlGyS; spf=pass (google.com: domain of libc-alpha-bounces+patch=linaro.org@sourceware.org designates 2620:52:3:1:0:246e:9693:128c as permitted sender) smtp.mailfrom="libc-alpha-bounces+patch=linaro.org@sourceware.org"; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=sourceware.org Received: from server2.sourceware.org (localhost [IPv6:::1]) by sourceware.org (Postfix) with ESMTP id A4E9B3898386 for ; Wed, 11 Jan 2023 20:48:35 +0000 (GMT) DKIM-Filter: OpenDKIM Filter v2.11.0 sourceware.org A4E9B3898386 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=sourceware.org; s=default; t=1673470115; bh=Dr/IK8aLL1j5e/18nce0vz2jgh/9rUL7YWKtVjzXE14=; h=To:Cc:Subject:Date:In-Reply-To:References:List-Id: List-Unsubscribe:List-Archive:List-Post:List-Help:List-Subscribe: From:Reply-To:From; b=DNuVlGySskQlhvXas2kukomcVWjQ8H53tXJFQVxepLzsGttToNr2UHwMv0AVgz7XO Tj2xaw7b+eJE3qX5XzliAte1j2FdiJQNJMC7vZpD2q1pbo6gGwfGKZGCUQntZLKceo vZVIwIqwjP8eMy5s/TsCZWvTLgS89Eu0Ww51X60A= X-Original-To: libc-alpha@sourceware.org Delivered-To: libc-alpha@sourceware.org Received: from mail-oa1-x2d.google.com (mail-oa1-x2d.google.com [IPv6:2001:4860:4864:20::2d]) by sourceware.org (Postfix) with ESMTPS id EA8D4385483F for ; Wed, 11 Jan 2023 20:46:28 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.2 sourceware.org EA8D4385483F Received: by mail-oa1-x2d.google.com with SMTP id 586e51a60fabf-142b72a728fso16802300fac.9 for ; Wed, 11 Jan 2023 12:46:28 -0800 (PST) 
X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=Dr/IK8aLL1j5e/18nce0vz2jgh/9rUL7YWKtVjzXE14=; b=qChNG8skEuSsxUQjpnvL9SpLfXmzrbqmobY10/DndzLy/s5zsYrFp9OD83urdo+daI Azkw6xI80v89DTFuUPDZxqtFK8i+2DT1ZCzsFmW26RQh+DSHl8OrsTDN8cJo9rzjavMl luh/LgMuTjcPzeiraeBPL3EiYrvbF1xCocB7NDdH4Wy7xbRe7yas+s6mDUXP2ZbNik5T XsB6Wc+eFF3bdf4Lc3X2LRfEUXIPZ9X7O1ll4SAUUFT0bfIywQAGzVoiV1FvdoQIiCEj Nc6DHLHwAmPJnWXgkPPOy4E94iClmqG0UV6jeswE5Mapfzk2aD5an6SQpVdZlYMUokj6 1izA== X-Gm-Message-State: AFqh2kr6lfLcOpkg4f0LK295rdsf/ki+okO0DXUxGXQPWp/BJFlCQMh4 nDDOgx8s4FiCTAAc7Q/GhCigAF7I32VAyhMnU+A= X-Received: by 2002:a05:6870:d1c9:b0:158:a50:d7c4 with SMTP id b9-20020a056870d1c900b001580a50d7c4mr9679105oac.57.1673469987573; Wed, 11 Jan 2023 12:46:27 -0800 (PST) Received: from mandiga.. ([2804:1b3:a7c0:a93a:a504:f3f6:dd7b:801]) by smtp.gmail.com with ESMTPSA id kw18-20020a056870ac1200b0014c8b5d54b2sm7990274oab.20.2023.01.11.12.46.25 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 11 Jan 2023 12:46:26 -0800 (PST) To: libc-alpha@sourceware.org, Noah Goldstein , Richard Henderson Cc: Adhemerval Zanella Netto Subject: [PATCH v7 11/17] string: Improve generic memrchr Date: Wed, 11 Jan 2023 17:45:52 -0300 Message-Id: <20230111204558.2402155-12-adhemerval.zanella@linaro.org> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20230111204558.2402155-1-adhemerval.zanella@linaro.org> References: <20230111204558.2402155-1-adhemerval.zanella@linaro.org> MIME-Version: 1.0 X-Spam-Status: No, score=-13.0 required=5.0 tests=BAYES_00, DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, DKIM_VALID_EF, GIT_PATCH_0, KAM_SHORT, RCVD_IN_DNSWL_NONE, SPF_HELO_NONE, SPF_PASS, TXREP autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org X-BeenThere: 
libc-alpha@sourceware.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Libc-alpha mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-Patchwork-Original-From: Adhemerval Zanella via Libc-alpha From: Adhemerval Zanella Netto Reply-To: Adhemerval Zanella Errors-To: libc-alpha-bounces+patch=linaro.org@sourceware.org Sender: "Libc-alpha" From: Adhemerval Zanella Netto The new algorithm has the following key difference: - Use string-fz{b,i} functions. Checked on x86_64-linux-gnu, i686-linux-gnu, powerpc-linux-gnu, and powerpc64-linux-gnu by removing the arch-specific assembly implementation and disabling multi-arch (it covers both LE and BE for 64 and 32 bits). Co-authored-by: Richard Henderson --- string/memrchr.c | 190 ++++++++--------------------------------------- 1 file changed, 33 insertions(+), 157 deletions(-) diff --git a/string/memrchr.c b/string/memrchr.c index 18b20ff76a..54647d47c2 100644 --- a/string/memrchr.c +++ b/string/memrchr.c @@ -1,11 +1,6 @@ /* memrchr -- find the last occurrence of a byte in a memory block Copyright (C) 1991-2023 Free Software Foundation, Inc. This file is part of the GNU C Library. - Based on strlen implementation by Torbjorn Granlund (tege@sics.se), - with help from Dan Sahlin (dan@sics.se) and - commentary by Jim Blandy (jimb@ai.mit.edu); - adaptation to memchr suggested by Dick Karpinski (dick@cca.ucsf.edu), - and implemented by Roland McGrath (roland@ai.mit.edu). The GNU C Library is free software; you can redistribute it and/or modify it under the terms of the GNU Lesser General Public @@ -21,177 +16,58 @@ License along with the GNU C Library; if not, see .
*/ -#include - -#ifdef HAVE_CONFIG_H -# include -#endif - -#if defined _LIBC -# include -# include -#endif - -#if defined HAVE_LIMITS_H || defined _LIBC -# include -#endif - -#define LONG_MAX_32_BITS 2147483647 - -#ifndef LONG_MAX -# define LONG_MAX LONG_MAX_32_BITS -#endif - -#include +#include +#include +#include +#include +#include #undef __memrchr #undef memrchr -#ifndef weak_alias -# define __memrchr memrchr +#ifdef MEMRCHR +# define __memrchr MEMRCHR #endif -/* Search no more than N bytes of S for C. */ void * -#ifndef MEMRCHR -__memrchr -#else -MEMRCHR -#endif - (const void *s, int c_in, size_t n) +__memrchr (const void *s, int c_in, size_t n) { - const unsigned char *char_ptr; - const unsigned long int *longword_ptr; - unsigned long int longword, magic_bits, charmask; - unsigned char c; - - c = (unsigned char) c_in; - /* Handle the last few characters by reading one character at a time. - Do this until CHAR_PTR is aligned on a longword boundary. */ - for (char_ptr = (const unsigned char *) s + n; - n > 0 && ((unsigned long int) char_ptr - & (sizeof (longword) - 1)) != 0; - --n) - if (*--char_ptr == c) + Do this until CHAR_PTR is aligned on a word boundary, or + for the entirety of small inputs. */ + const unsigned char *char_ptr = (const unsigned char *) (s + n); + size_t align = (uintptr_t) char_ptr % sizeof (op_t); + if (n < OP_T_THRES || align > n) + align = n; + + for (size_t i = 0; i < align; ++i) + if (*--char_ptr == (unsigned char) c_in) return (void *) char_ptr; - /* All these elucidatory comments refer to 4-byte longwords, - but the theory applies equally well to 8-byte longwords. */ + const op_t *word_ptr = (const op_t *) char_ptr; + n -= align; + if (__glibc_unlikely (n == 0)) + return NULL; - longword_ptr = (const unsigned long int *) char_ptr; + /* Compute the address of the word containing the initial byte. */ + const op_t *lword = word_containing (s); - /* Bits 31, 24, 16, and 8 of this number are zero. Call these bits - the "holes."
Note that there is a hole just to the left of - each byte, with an extra at the end: - - bits: 01111110 11111110 11111110 11111111 - bytes: AAAAAAAA BBBBBBBB CCCCCCCC DDDDDDDD - - The 1-bits make sure that carries propagate to the next 0-bit. - The 0-bits provide holes for carries to fall into. */ - magic_bits = -1; - magic_bits = magic_bits / 0xff * 0xfe << 1 >> 1 | 1; - - /* Set up a longword, each of whose bytes is C. */ - charmask = c | (c << 8); - charmask |= charmask << 16; -#if LONG_MAX > LONG_MAX_32_BITS - charmask |= charmask << 32; -#endif + /* Set up a word, each of whose bytes is C. */ + op_t repeated_c = repeat_bytes (c_in); - /* Instead of the traditional loop which tests each character, - we will test a longword at a time. The tricky part is testing - if *any of the four* bytes in the longword in question are zero. */ - while (n >= sizeof (longword)) + while (word_ptr != lword) { - /* We tentatively exit the loop if adding MAGIC_BITS to - LONGWORD fails to change any of the hole bits of LONGWORD. - - 1) Is this safe? Will it catch all the zero bytes? - Suppose there is a byte with all zeros. Any carry bits - propagating from its left will fall into the hole at its - least significant bit and stop. Since there will be no - carry from its most significant bit, the LSB of the - byte to the left will be unchanged, and the zero will be - detected. - - 2) Is this worthwhile? Will it ignore everything except - zero bytes? Suppose every byte of LONGWORD has a bit set - somewhere. There will be a carry into bit 8. If bit 8 - is set, this will carry into bit 16. If bit 8 is clear, - one of bits 9-15 must be set, so there will be a carry - into bit 16. Similarly, there will be a carry into bit - 24. If one of bits 24-30 is set, there will be a carry - into bit 31, so all of the hole bits will be changed. - - The one misfire occurs when bits 24-30 are clear and bit - 31 is set; in this case, the hole at bit 31 is not - changed. 
If we had access to the processor carry flag, - we could close this loophole by putting the fourth hole - at bit 32! - - So it ignores everything except 128's, when they're aligned - properly. - - 3) But wait! Aren't we looking for C, not zero? - Good point. So what we do is XOR LONGWORD with a longword, - each of whose bytes is C. This turns each byte that is C - into a zero. */ - - longword = *--longword_ptr ^ charmask; - - /* Add MAGIC_BITS to LONGWORD. */ - if ((((longword + magic_bits) - - /* Set those bits that were unchanged by the addition. */ - ^ ~longword) - - /* Look at only the hole bits. If any of the hole bits - are unchanged, most likely one of the bytes was a - zero. */ - & ~magic_bits) != 0) + op_t word = *--word_ptr; + if (has_eq (word, repeated_c)) { - /* Which of the bytes was C? If none of them were, it was - a misfire; continue the search. */ - - const unsigned char *cp = (const unsigned char *) longword_ptr; - -#if LONG_MAX > 2147483647 - if (cp[7] == c) - return (void *) &cp[7]; - if (cp[6] == c) - return (void *) &cp[6]; - if (cp[5] == c) - return (void *) &cp[5]; - if (cp[4] == c) - return (void *) &cp[4]; -#endif - if (cp[3] == c) - return (void *) &cp[3]; - if (cp[2] == c) - return (void *) &cp[2]; - if (cp[1] == c) - return (void *) &cp[1]; - if (cp[0] == c) - return (void *) cp; + /* We found a match, but it might be in a byte past the start + of the array. */ + char *ret = (char *) word_ptr + index_last_eq (word, repeated_c); + return ret >= (char *) s ? 
ret : NULL; } - - n -= sizeof (longword); } - - char_ptr = (const unsigned char *) longword_ptr; - - while (n-- > 0) - { - if (*--char_ptr == c) - return (void *) char_ptr; - } - - return 0; + return NULL; } #ifndef MEMRCHR -# ifdef weak_alias weak_alias (__memrchr, memrchr) -# endif #endif From patchwork Wed Jan 11 20:45:53 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Adhemerval Zanella Netto X-Patchwork-Id: 641276 Delivered-To: patch@linaro.org Received: by 2002:a17:522:f3c4:b0:4b4:3859:abed with SMTP id in4csp3458894pvb; Wed, 11 Jan 2023 12:49:30 -0800 (PST) X-Google-Smtp-Source: AMrXdXvafN25at6tdw644MoYGBgHhmH21zOZzD/77gkYtJvg9D1ZB5tG+YKqXZLkAwXUlTNfmbCz X-Received: by 2002:a05:6402:401c:b0:48e:94ec:b7ac with SMTP id d28-20020a056402401c00b0048e94ecb7acmr33018460eda.7.1673470169959; Wed, 11 Jan 2023 12:49:29 -0800 (PST) ARC-Seal: i=1; a=rsa-sha256; t=1673470169; cv=none; d=google.com; s=arc-20160816; b=hqej5UQT5XBr5juAMNNc8ickxV7akfuIjOo2uD4G1LFBedY3G39iIYJ/xtDWB7CElo 1qcOZ/DZbX3I0RdSZh76MLuTiSvM37Ro0EiHIWyG5WFbS8h9KX6xYT+m8nL6edhHnXEt uEH3wrGcfZYBEfNuOBbub4mLV5IO9BdagKc/MEpRKT6z9ZhwQtfuZfcS6Se7GtUVKKLP F8XejR4PAXvP8y6qssSTejpNxEDw67DvcnCzMf2Iztx9TI3s4pFBPXMz6Vcr21qTE4G8 ZeKj2LmuSyCNcP8NjK8OPlXhKFw4AMjDB0X3PEZQWx5bY/TE+Q41pGISPBhw5LrLFp+c Zrww== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=sender:errors-to:reply-to:from:list-subscribe:list-help:list-post :list-archive:list-unsubscribe:list-id:precedence :content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:to:dmarc-filter:delivered-to:dkim-signature :dkim-filter; bh=uR+tmviJAnIQt3Uy9C/8G52NzH6k6qWZDh0/RH+zyro=; b=ntgFb2f4r/RvGxDDajgZiku44lG1xuRP7fSSl+VgLA0zAMNTLLk5nORyvhPxt+YHC6 0rLA1iSA0Lcue9C0JcT14rRmdZKcrLc+iwWfytsSkNRd48mZpDIuI8iejXMdG6JSa+KD 9y2fsItZxaQpqbi10Xwl025fIJe0vRkRSGEk3PMOlQecLJNPdEx1PlRe48gEY65o607/ 
oBGcM/PuOZq0kvb83RbvUUq3yo6Pi9LUlvSwBh1/bB2a67kHTfvWu7qLp50jFgTJ/Q+y OCrJ6Se41YpP+ilA1aHB6eFsqdlxXmYWJewwCPUT1+OrhMeZaVnKlEBWE53S4eFGuMcS s9eg== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@sourceware.org header.s=default header.b=JmAg7ZGb; spf=pass (google.com: domain of libc-alpha-bounces+patch=linaro.org@sourceware.org designates 2620:52:3:1:0:246e:9693:128c as permitted sender) smtp.mailfrom="libc-alpha-bounces+patch=linaro.org@sourceware.org"; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=sourceware.org Return-Path: Received: from sourceware.org (server2.sourceware.org. [2620:52:3:1:0:246e:9693:128c]) by mx.google.com with ESMTPS id x21-20020a05640226d500b0046de8e02697si19166790edd.239.2023.01.11.12.49.29 for (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 11 Jan 2023 12:49:29 -0800 (PST) Received-SPF: pass (google.com: domain of libc-alpha-bounces+patch=linaro.org@sourceware.org designates 2620:52:3:1:0:246e:9693:128c as permitted sender) client-ip=2620:52:3:1:0:246e:9693:128c; Authentication-Results: mx.google.com; dkim=pass header.i=@sourceware.org header.s=default header.b=JmAg7ZGb; spf=pass (google.com: domain of libc-alpha-bounces+patch=linaro.org@sourceware.org designates 2620:52:3:1:0:246e:9693:128c as permitted sender) smtp.mailfrom="libc-alpha-bounces+patch=linaro.org@sourceware.org"; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=sourceware.org Received: from server2.sourceware.org (localhost [IPv6:::1]) by sourceware.org (Postfix) with ESMTP id 94BAE382FCB4 for ; Wed, 11 Jan 2023 20:49:28 +0000 (GMT) DKIM-Filter: OpenDKIM Filter v2.11.0 sourceware.org 94BAE382FCB4 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=sourceware.org; s=default; t=1673470168; bh=uR+tmviJAnIQt3Uy9C/8G52NzH6k6qWZDh0/RH+zyro=; h=To:Subject:Date:In-Reply-To:References:List-Id:List-Unsubscribe: List-Archive:List-Post:List-Help:List-Subscribe:From:Reply-To: From; 
b=JmAg7ZGb78Jn8tT7Ad+Xpfkvzn0HMQY+9VDUEeG4pqbSz3s8WCAQLy4XkF+SBnfi0 o0W3Elbi8pIQzNPerhUknzgrjCG2PrVzd+cXmNMjcG0fqUsgbpQEZsvI3hcflfhfbW jBTTeWrycXzy9Hrm5sVZ6k0Xub+AEWorEE0a4E9k= X-Original-To: libc-alpha@sourceware.org Delivered-To: libc-alpha@sourceware.org Received: from mail-oa1-x36.google.com (mail-oa1-x36.google.com [IPv6:2001:4860:4864:20::36]) by sourceware.org (Postfix) with ESMTPS id 87B3A38555B6 for ; Wed, 11 Jan 2023 20:46:30 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.2 sourceware.org 87B3A38555B6 Received: by mail-oa1-x36.google.com with SMTP id 586e51a60fabf-15eaa587226so2148441fac.8 for ; Wed, 11 Jan 2023 12:46:30 -0800 (PST) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=uR+tmviJAnIQt3Uy9C/8G52NzH6k6qWZDh0/RH+zyro=; b=FLx9aghkcNrCgtTJTl2Wnoa2BoD0Cl2U7dsUDy4suiXoo4BfELSpdUez2Y/Rj3sjVC uK2dY1sZWKEKTgzN4Mll856NYmFIisjtv00xv9UtzkAkm4/IHHOmZO31glyW/HtPs0qP jggluONKrQTFAZqE+JzZ1Gxly5yrW48jlfNcgGHqfT9lIdRZiRQDNr63aMZrlPdZNb5i Kv6WhRRzDd1pRJGpvh+BZq0L1Rde7Ll3RMwAvvHqiS9MghvzBcECTphERNOOzonEQaYp Lr5RVt4/+V/nnxbj977XAsnAexagC9aDI/elce7y2U23HXrF/uVcdKDVsJkihzj4xqKB fKig== X-Gm-Message-State: AFqh2kqHGc2vuoUA4M8Y/PMWo4YqzeAnpo4HtH0/NK8PfxGgBpYitk8O h3OTS97MHOyEOPQolD4b7JYuLqSkbqFdGYV0xJM= X-Received: by 2002:a05:6871:b08:b0:15e:b2f4:adb7 with SMTP id fq8-20020a0568710b0800b0015eb2f4adb7mr729918oab.9.1673469989396; Wed, 11 Jan 2023 12:46:29 -0800 (PST) Received: from mandiga.. 
([2804:1b3:a7c0:a93a:a504:f3f6:dd7b:801]) by smtp.gmail.com with ESMTPSA id kw18-20020a056870ac1200b0014c8b5d54b2sm7990274oab.20.2023.01.11.12.46.27 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 11 Jan 2023 12:46:28 -0800 (PST) To: libc-alpha@sourceware.org, Noah Goldstein , Richard Henderson Subject: [PATCH v7 12/17] hppa: Add memcopy.h Date: Wed, 11 Jan 2023 17:45:53 -0300 Message-Id: <20230111204558.2402155-13-adhemerval.zanella@linaro.org> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20230111204558.2402155-1-adhemerval.zanella@linaro.org> References: <20230111204558.2402155-1-adhemerval.zanella@linaro.org> MIME-Version: 1.0 X-Spam-Status: No, score=-13.0 required=5.0 tests=BAYES_00, DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, DKIM_VALID_EF, GIT_PATCH_0, KAM_SHORT, RCVD_IN_DNSWL_NONE, SPF_HELO_NONE, SPF_PASS, TXREP autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org X-BeenThere: libc-alpha@sourceware.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Libc-alpha mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-Patchwork-Original-From: Adhemerval Zanella via Libc-alpha From: Adhemerval Zanella Netto Reply-To: Adhemerval Zanella Errors-To: libc-alpha-bounces+patch=linaro.org@sourceware.org Sender: "Libc-alpha" From: Richard Henderson GCC's combine pass cannot merge (x >> c | y << (32 - c)) into a double-word shift unless (1) the subtract is in the same basic block and (2) the result of the subtract is used exactly once. Neither condition is true for any use of MERGE. By forcing the use of a double-word shift, we not only reduce contention on SAR, but also allow the setting of SAR to be hoisted outside of a loop. Checked on hppa-linux-gnu. 
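For illustration only (not part of the patch): a portable C sketch of the two forms contrasted above, assuming 32-bit words and the big-endian MERGE definition, (w0 << shl) | (w1 >> shr) with shl + shr == 32. The helper names merge_two_shifts, merge_double_word and check_merge_forms_agree are hypothetical. The second helper only models what a double-word right shift is assumed to compute here: the low word of the 64-bit pair w0:w1 shifted right by shr.

```c
#include <assert.h>
#include <stdint.h>

/* Generic big-endian form: two single-word shifts joined by an or.
   Requires 0 < shl < 32 and shl + shr == 32.  */
static uint32_t
merge_two_shifts (uint32_t w0, int shl, uint32_t w1, int shr)
{
  return (w0 << shl) | (w1 >> shr);
}

/* Model of a double-word shift: concatenate w0:w1, shift the pair
   right by shr, and keep the low 32 bits.  Only shr is needed.  */
static uint32_t
merge_double_word (uint32_t w0, uint32_t w1, int shr)
{
  uint64_t pair = ((uint64_t) w0 << 32) | w1;
  return (uint32_t) (pair >> shr);
}

/* Check that both forms agree for the byte-granular shift counts
   an unaligned word copy would use (8, 16 and 24 bits).  */
static void
check_merge_forms_agree (void)
{
  const uint32_t w0 = 0x11223344u, w1 = 0x55667788u;
  for (int shr = 8; shr < 32; shr += 8)
    assert (merge_two_shifts (w0, 32 - shr, w1, shr)
            == merge_double_word (w0, w1, shr));
}
```

Because the double-word form depends on shr alone, the shift amount can be set once before a copy loop, which is the hoisting opportunity described above.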
--- sysdeps/hppa/memcopy.h | 42 ++++++++++++++++++++++++++++++++++++++++++ 1 file changed, 42 insertions(+) create mode 100644 sysdeps/hppa/memcopy.h diff --git a/sysdeps/hppa/memcopy.h b/sysdeps/hppa/memcopy.h new file mode 100644 index 0000000000..0d4b4ac435 --- /dev/null +++ b/sysdeps/hppa/memcopy.h @@ -0,0 +1,42 @@ +/* Definitions for memory copy functions, PA-RISC version. + Copyright (C) 2023 Free Software Foundation, Inc. + This file is part of the GNU C Library. + + The GNU C Library is free software; you can redistribute it and/or + modify it under the terms of the GNU Lesser General Public + License as published by the Free Software Foundation; either + version 2.1 of the License, or (at your option) any later version. + + The GNU C Library is distributed in the hope that it will be useful, + but WITHOUT ANY WARRANTY; without even the implied warranty of + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + Lesser General Public License for more details. + + You should have received a copy of the GNU Lesser General Public + License along with the GNU C Library. If not, see + . */ + +#include + +/* Use a single double-word shift instead of two shifts and an ior. + If the uses of MERGE were close to the computation of shl/shr, + the compiler might have been able to create this itself. + But instead that computation is well separated. + + Using an inline function instead of a macro is the easiest way + to ensure that the types are correct. 
*/ + +#undef MERGE + +static __always_inline op_t +MERGE (op_t w0, int shl, op_t w1, int shr) +{ + _Static_assert (OPSIZ == 4 || OPSIZ == 8, "Invalid OPSIZE"); + + op_t res; + if (OPSIZ == 4) + asm ("shrpw %1,%2,%%sar,%0" : "=r"(res) : "r"(w0), "r"(w1), "q"(shr)); + else if (OPSIZ == 8) + asm ("shrpd %1,%2,%%sar,%0" : "=r"(res) : "r"(w0), "r"(w1), "q"(shr)); + return res; +} From patchwork Wed Jan 11 20:45:54 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Adhemerval Zanella Netto X-Patchwork-Id: 641271 Delivered-To: patch@linaro.org Received: by 2002:a17:522:f3c4:b0:4b4:3859:abed with SMTP id in4csp3458544pvb; Wed, 11 Jan 2023 12:48:24 -0800 (PST) X-Google-Smtp-Source: AMrXdXtQJWxeZIWh8rL1gQ/VylItLhNkItUpz2o4J2KA/sRSAIXhFJIyaZi++VjPV9QZqv52GYNt X-Received: by 2002:a17:906:4d4f:b0:84d:45d9:6bcf with SMTP id b15-20020a1709064d4f00b0084d45d96bcfmr10966332ejv.42.1673470102323; Wed, 11 Jan 2023 12:48:22 -0800 (PST) ARC-Seal: i=1; a=rsa-sha256; t=1673470102; cv=none; d=google.com; s=arc-20160816; b=hGfZIfV7+Qi9BVFucPm115CTmFbkZZFM3W9DRdDVl9ojLu8spDVUwFNt8tApz2BHf4 52R0yL8yuHuPUn76kcjkkXrlg+4+5kP1N9P0xUlDt66Y1rbM+REekQV8TTWn0xdJdqIl Wf20U63Xb5ZOwl8RTWkc0AlqP7yXH3g++JAiUKEEUXGchvaYVFiapRWlLN3dtW9Ir0Mm t3GGRRIEtOvOOdmrGrCsrXL4dEt01bMuefMgTSzqM8g78iD6DHfB4OGVUZpxegheHfAO y9uTmW4IljWW9GK9lzR7r0yxy3qJ3M/LQwUJjNUfvKBECAc6tA4CSD4v4aV45M8DLDrr FWMw== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=sender:errors-to:reply-to:from:list-subscribe:list-help:list-post :list-archive:list-unsubscribe:list-id:precedence :content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:to:dmarc-filter:delivered-to:dkim-signature :dkim-filter; bh=0iDVrj/oyXP0zRDZohsc718rfToVIizNlyuJXb6jdzs=; b=t1sEZ4X4NmI+bHDflPHoeNibufjAgqnqcmiWF2BuBX8OEv3ctlYo+JdL/99lNzieG8 eu0TWXqCDUaqRWwi1cw0AD6IDPfehS8eu6Gcf1SVySwa+NFMKQEKi2nYPp2cSu0oPcWb 
6HKfSS39P58bLfuh24Fh+U9ziXIsk/u6nC61FYT8uh3GJ/tC70vgkwX1uGq3xNreq5Fc nfPMQU5oX78ljlLbtnMQ8LovcmYbkET6w59XTCpXxNQWOCfZF9bqIO/DNcOwuQYYl/mW F+NdNuEvXNkKW4gYubMHxnN1gN0v/no0x1h9Q2Mnhoa5Fhw0CqYtB9vHXCN+AnHnoRpr DtgA== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@sourceware.org header.s=default header.b=CSaXYj4C; spf=pass (google.com: domain of libc-alpha-bounces+patch=linaro.org@sourceware.org designates 2620:52:3:1:0:246e:9693:128c as permitted sender) smtp.mailfrom="libc-alpha-bounces+patch=linaro.org@sourceware.org"; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=sourceware.org Return-Path: Received: from sourceware.org (server2.sourceware.org. [2620:52:3:1:0:246e:9693:128c]) by mx.google.com with ESMTPS id ht20-20020a170907609400b007aeef4dd9cfsi17910266ejc.908.2023.01.11.12.48.22 for (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 11 Jan 2023 12:48:22 -0800 (PST) Received-SPF: pass (google.com: domain of libc-alpha-bounces+patch=linaro.org@sourceware.org designates 2620:52:3:1:0:246e:9693:128c as permitted sender) client-ip=2620:52:3:1:0:246e:9693:128c; Authentication-Results: mx.google.com; dkim=pass header.i=@sourceware.org header.s=default header.b=CSaXYj4C; spf=pass (google.com: domain of libc-alpha-bounces+patch=linaro.org@sourceware.org designates 2620:52:3:1:0:246e:9693:128c as permitted sender) smtp.mailfrom="libc-alpha-bounces+patch=linaro.org@sourceware.org"; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=sourceware.org Received: from server2.sourceware.org (localhost [IPv6:::1]) by sourceware.org (Postfix) with ESMTP id 0089E382FACC for ; Wed, 11 Jan 2023 20:48:21 +0000 (GMT) DKIM-Filter: OpenDKIM Filter v2.11.0 sourceware.org 0089E382FACC DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=sourceware.org; s=default; t=1673470101; bh=0iDVrj/oyXP0zRDZohsc718rfToVIizNlyuJXb6jdzs=; h=To:Subject:Date:In-Reply-To:References:List-Id:List-Unsubscribe: List-Archive:List-Post:List-Help:List-Subscribe:From:Reply-To: 
From; b=CSaXYj4CNvgWvqBhF2tx/ZQ2Dc2PwgaeH8MovZuXom0fgmCV2+GT7T7fs2MmXA2OP 3sIJZSI2Fi/l3olzYpfaLsRz/8W78ugRFpYXpGjY4NCVLvZ4SeoFizhibIwq9P94zK Wv/QEmYewAVngUTuJD1XW49RUUUlXFrcs3q6PEq0= X-Original-To: libc-alpha@sourceware.org Delivered-To: libc-alpha@sourceware.org Received: from mail-oa1-x36.google.com (mail-oa1-x36.google.com [IPv6:2001:4860:4864:20::36]) by sourceware.org (Postfix) with ESMTPS id 5CD3C38493E2 for ; Wed, 11 Jan 2023 20:46:32 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.2 sourceware.org 5CD3C38493E2 Received: by mail-oa1-x36.google.com with SMTP id 586e51a60fabf-1322d768ba7so16822778fac.5 for ; Wed, 11 Jan 2023 12:46:32 -0800 (PST) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=0iDVrj/oyXP0zRDZohsc718rfToVIizNlyuJXb6jdzs=; b=rJ+XFsq76H2ya9I/TpboPafWayQzdStDL14nzI1+w+gSP1LAEJrOiSGMnyQ5Tjugh2 cASS7d41aPCYncgUTgCwtW32gjwHlfgxNDQYwlrzydo+1qj+WQnw34iaI5kGRPEDxDo5 ZpG1UU4p7qMDB0lQttG8w/7HXejyuIv/17SBBD6VCurLnHhtGnSFY7QOJlxYNZBMfIyB jUWSEDvgvZmS820hfwjxD6EKKjpqAXoP/T6PnzJkOGHVS25tbiPAgcb5zJiNRJy5fj4H fSAEp9lnezUqpwzDL8pAA0cVZ0BtLCX3qsL/SBNtVAWNdEgsJdPMEhFLc42p3TiO/7hu KD2A== X-Gm-Message-State: AFqh2kqGBBGo0IBsNLRNjgLtkpO9bVQGI7FfppqMNOE+ioc24vuEWxT2 hGzzTvii+/jZ/njBcJubx1ok7DzXb33dZwcofn4= X-Received: by 2002:a05:6871:88c:b0:15b:9290:3905 with SMTP id r12-20020a056871088c00b0015b92903905mr6091560oaq.42.1673469991240; Wed, 11 Jan 2023 12:46:31 -0800 (PST) Received: from mandiga.. 
([2804:1b3:a7c0:a93a:a504:f3f6:dd7b:801]) by smtp.gmail.com with ESMTPSA id kw18-20020a056870ac1200b0014c8b5d54b2sm7990274oab.20.2023.01.11.12.46.29 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 11 Jan 2023 12:46:30 -0800 (PST) To: libc-alpha@sourceware.org, Noah Goldstein , Richard Henderson Subject: [PATCH v7 13/17] hppa: Add string-fzb.h and string-fzi.h Date: Wed, 11 Jan 2023 17:45:54 -0300 Message-Id: <20230111204558.2402155-14-adhemerval.zanella@linaro.org> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20230111204558.2402155-1-adhemerval.zanella@linaro.org> References: <20230111204558.2402155-1-adhemerval.zanella@linaro.org> MIME-Version: 1.0 X-Spam-Status: No, score=-13.0 required=5.0 tests=BAYES_00, DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, DKIM_VALID_EF, GIT_PATCH_0, KAM_SHORT, RCVD_IN_DNSWL_NONE, SPF_HELO_NONE, SPF_PASS, TXREP autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org X-BeenThere: libc-alpha@sourceware.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Libc-alpha mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-Patchwork-Original-From: Adhemerval Zanella via Libc-alpha From: Adhemerval Zanella Netto Reply-To: Adhemerval Zanella Errors-To: libc-alpha-bounces+patch=linaro.org@sourceware.org Sender: "Libc-alpha" From: Richard Henderson Use UXOR,SBZ to test for a zero byte within a word. While we can get semi-decent code out of asm-goto, we would do slightly better with a compiler builtin. For index_zero et al, sequential testing of bytes is less expensive than any tricks that involve a count-leading-zeros insn that we don't have. Checked on hppa-linux-gnu. 
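For illustration only (not part of the patch): a portable C sketch of the two ideas above, assuming a 32-bit big-endian word so that the most significant byte is index 0 in memory order. The names has_zero_byte, index_first_zero_be, index_last_zero_be and index_first_eq_be are hypothetical. The boolean test uses the common subtract-and-mask idiom in place of UXOR,SBZ, and the index helpers test bytes one at a time rather than counting leading or trailing zeros.

```c
#include <assert.h>
#include <stdint.h>

/* Nonzero if any byte of X is zero (pure boolean test).  */
static int
has_zero_byte (uint32_t x)
{
  return ((x - 0x01010101u) & ~x & 0x80808080u) != 0;
}

/* Index of the first zero byte of X in big-endian memory order.
   X must be known to contain a zero byte.  */
static unsigned int
index_first_zero_be (uint32_t x)
{
  if ((x & 0xff000000u) == 0)
    return 0;
  if ((x & 0x00ff0000u) == 0)
    return 1;
  if ((x & 0x0000ff00u) == 0)
    return 2;
  return 3;
}

/* Index of the last zero byte of X in big-endian memory order.
   X must be known to contain a zero byte.  */
static unsigned int
index_last_zero_be (uint32_t x)
{
  if ((x & 0x000000ffu) == 0)
    return 3;
  if ((x & 0x0000ff00u) == 0)
    return 2;
  if ((x & 0x00ff0000u) == 0)
    return 1;
  return 0;
}

/* Byte equality reduces to zero detection on X1 ^ X2.  */
static unsigned int
index_first_eq_be (uint32_t x1, uint32_t x2)
{
  return index_first_zero_be (x1 ^ x2);
}
```

The sequential tests need no wide constants beyond the byte masks, which is the trade-off the message describes for a target without a count-leading-zeros instruction.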
--- sysdeps/hppa/string-fzb.h | 70 +++++++++++++++++++ sysdeps/hppa/string-fzi.h | 140 ++++++++++++++++++++++++++++++++++++++ 2 files changed, 210 insertions(+) create mode 100644 sysdeps/hppa/string-fzb.h create mode 100644 sysdeps/hppa/string-fzi.h diff --git a/sysdeps/hppa/string-fzb.h b/sysdeps/hppa/string-fzb.h new file mode 100644 index 0000000000..865e548492 --- /dev/null +++ b/sysdeps/hppa/string-fzb.h @@ -0,0 +1,70 @@ +/* Zero byte detection, boolean. HPPA version. + Copyright (C) 2023 Free Software Foundation, Inc. + This file is part of the GNU C Library. + + The GNU C Library is free software; you can redistribute it and/or + modify it under the terms of the GNU Lesser General Public + License as published by the Free Software Foundation; either + version 2.1 of the License, or (at your option) any later version. + + The GNU C Library is distributed in the hope that it will be useful, + but WITHOUT ANY WARRANTY; without even the implied warranty of + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + Lesser General Public License for more details. + + You should have received a copy of the GNU Lesser General Public + License along with the GNU C Library; if not, see + . */ + +#ifndef _STRING_FZB_H +#define _STRING_FZB_H 1 + +#include +#include + +/* Determine if any byte within X is zero. This is a pure boolean test. */ + +static __always_inline _Bool +has_zero (op_t x) +{ + _Static_assert (sizeof (op_t) == 4, "64-bit not supported"); + + /* It's more useful to expose a control transfer to the compiler + than to expose a proper boolean result. */ + asm goto ("uxor,sbz %%r0,%0,%%r0\n\t" + "b,n %l1" : : "r"(x) : : nbz); + return 1; + nbz: + return 0; +} + +/* Likewise, but for byte equality between X1 and X2. 
*/ + +static __always_inline _Bool +has_eq (op_t x1, op_t x2) +{ + _Static_assert (sizeof (op_t) == 4, "64-bit not supported"); + + asm goto ("uxor,sbz %0,%1,%%r0\n\t" + "b,n %l2" : : "r"(x1), "r"(x2) : : nbz); + return 1; + nbz: + return 0; +} + +/* Likewise, but for zeros in X1 and equal bytes between X1 and X2. */ + +static __always_inline _Bool +has_zero_eq (op_t x1, op_t x2) +{ + _Static_assert (sizeof (op_t) == 4, "64-bit not supported"); + + asm goto ("uxor,sbz %%r0,%0,%%r0\n\t" + "uxor,nbz %0,%1,%%r0\n\t" + "b,n %l2" : : "r"(x1), "r"(x2) : : sbz); + return 0; + sbz: + return 1; +} + +#endif /* _STRING_FZB_H */ diff --git a/sysdeps/hppa/string-fzi.h b/sysdeps/hppa/string-fzi.h new file mode 100644 index 0000000000..3cedfe4e36 --- /dev/null +++ b/sysdeps/hppa/string-fzi.h @@ -0,0 +1,140 @@ +/* string-fzi.h -- zero byte detection; indexes. HPPA version. + Copyright (C) 2022 Free Software Foundation, Inc. + This file is part of the GNU C Library. + + The GNU C Library is free software; you can redistribute it and/or + modify it under the terms of the GNU Lesser General Public + License as published by the Free Software Foundation; either + version 2.1 of the License, or (at your option) any later version. + + The GNU C Library is distributed in the hope that it will be useful, + but WITHOUT ANY WARRANTY; without even the implied warranty of + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + Lesser General Public License for more details. + + You should have received a copy of the GNU Lesser General Public + License along with the GNU C Library; if not, see + . */ + +#ifndef _STRING_FZI_H +#define _STRING_FZI_H 1 + +#include +#include +#include + +_Static_assert (sizeof (op_t) == 4, "64-bit not supported"); + +static __always_inline unsigned int +index_first (find_t c) +{ + /* Since we have no clz insn, direct tests of the bytes are faster + than loading up the constants to do the masking.
*/ + if (c & 0xff000000) + return 0; + if (c & 0x00ff0000) + return 1; + if (c & 0x0000ff00) + return 2; + return 3; +} + +/* Given a word X that is known to contain a zero byte, return the + index of the first such within the long in memory order. */ +static __always_inline unsigned int +index_first_zero (op_t x) +{ + unsigned int ret; + + /* Since we have no clz insn, direct tests of the bytes are faster + than loading up the constants to do the masking. */ + asm ("extrw,u,<> %1,23,8,%%r0\n\t" + "ldi 2,%0\n\t" + "extrw,u,<> %1,15,8,%%r0\n\t" + "ldi 1,%0\n\t" + "extrw,u,<> %1,7,8,%%r0\n\t" + "ldi 0,%0" + : "=r"(ret) : "r"(x), "0"(3)); + + return ret; +} + +/* Similarly, but perform the search for byte equality between X1 and X2. */ +static __always_inline unsigned int +index_first_eq (op_t x1, op_t x2) +{ + return index_first_zero (x1 ^ x2); +} + +/* Similarly, but perform the search for zero within X1 or + equality between X1 and X2. */ +static __always_inline unsigned int +index_first_zero_eq (op_t x1, op_t x2) +{ + unsigned int ret; + + /* Since we have no clz insn, direct tests of the bytes are faster + than loading up the constants to do the masking. */ + asm ("extrw,u,= %1,23,8,%%r0\n\t" + "extrw,u,<> %2,23,8,%%r0\n\t" + "ldi 2,%0\n\t" + "extrw,u,= %1,15,8,%%r0\n\t" + "extrw,u,<> %2,15,8,%%r0\n\t" + "ldi 1,%0\n\t" + "extrw,u,= %1,7,8,%%r0\n\t" + "extrw,u,<> %2,7,8,%%r0\n\t" + "ldi 0,%0" + : "=r"(ret) : "r"(x1), "r"(x1 ^ x2), "0"(3)); + + return ret; +} + +/* Similarly, but perform the search for zero within X1 or + inequality between X1 and X2. */ +static __always_inline unsigned int +index_first_zero_ne (op_t x1, op_t x2) +{ + unsigned int ret; + + /* Since we have no clz insn, direct tests of the bytes are faster + than loading up the constants to do the masking.
*/ + asm ("extrw,u,<> %2,23,8,%%r0\n\t" + "extrw,u,<> %1,23,8,%%r0\n\t" + "ldi 2,%0\n\t" + "extrw,u,<> %2,15,8,%%r0\n\t" + "extrw,u,<> %1,15,8,%%r0\n\t" + "ldi 1,%0\n\t" + "extrw,u,<> %2,7,8,%%r0\n\t" + "extrw,u,<> %1,7,8,%%r0\n\t" + "ldi 0,%0" + : "=r"(ret) : "r"(x1), "r"(x1 ^ x2), "0"(3)); + + return ret; +} + +/* Similarly, but search for the last zero within X. */ +static __always_inline unsigned int +index_last_zero (op_t x) +{ + unsigned int ret; + + /* Since we have no ctz insn, direct tests of the bytes is faster + than loading up the constants to do the masking. */ + asm ("extrw,u,<> %1,15,8,%%r0\n\t" + "ldi 1,%0\n\t" + "extrw,u,<> %1,23,8,%%r0\n\t" + "ldi 2,%0\n\t" + "extrw,u,<> %1,31,8,%%r0\n\t" + "ldi 3,%0" + : "=r"(ret) : "r"(x), "0"(0)); + + return ret; +} + +static __always_inline unsigned int +index_last_eq (op_t x1, op_t x2) +{ + return index_last_zero (x1 ^ x2); +} + +#endif /* _STRING_FZI_H */ From patchwork Wed Jan 11 20:45:55 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Adhemerval Zanella Netto X-Patchwork-Id: 641274 Delivered-To: patch@linaro.org Received: by 2002:a17:522:f3c4:b0:4b4:3859:abed with SMTP id in4csp3458771pvb; Wed, 11 Jan 2023 12:49:03 -0800 (PST) X-Google-Smtp-Source: AMrXdXvdQLQu/eqe9sIHmmuDeO6hJF9SfHthbtd9YY9rFDSIeApoF8bF2y60/a7Kwre15h2DaD11 X-Received: by 2002:a17:906:d0d0:b0:7c1:23f2:5b51 with SMTP id bq16-20020a170906d0d000b007c123f25b51mr69574758ejb.60.1673470143420; Wed, 11 Jan 2023 12:49:03 -0800 (PST) ARC-Seal: i=1; a=rsa-sha256; t=1673470143; cv=none; d=google.com; s=arc-20160816; b=yVGoD4AYT2S6aXX51Nrh+G7Z+75vKRmCo4u1exnm9H9ES+AVsFEUnZdzrzeGvouWFw PHmDEW8k81XYxaruJ8ty/k2DPg/OArVcvOqSY2Lw2wEVEn9R/G++4G1JyxDEBo0Ea01K x127+sxxhPugE6x/kSc4KOuhD85LLGYHAi1oiDl28FJYeuc1OWA10ttyGiHiq0+TlK7M /NHgsNZvY5x3HMGSb3w6Bt18pwubn2FMeww1nvd9wVLh8FgVc5Q6u9ZHka7uyZcER24T 07cIlpS7EoQ2nZpEppke2ij8PIsSD9X+ZjIwSdmryaUoHCAm82r4bJltHsS2/J6obd+L PvLw== 
To: libc-alpha@sourceware.org, Noah Goldstein, Richard Henderson
Subject: [PATCH v7 14/17] alpha: Add string-fzb.h and string-fzi.h
Date: Wed, 11 Jan 2023 17:45:55 -0300
Message-Id: <20230111204558.2402155-15-adhemerval.zanella@linaro.org>
X-Mailer: git-send-email 2.34.1
In-Reply-To: <20230111204558.2402155-1-adhemerval.zanella@linaro.org>
References: <20230111204558.2402155-1-adhemerval.zanella@linaro.org>
MIME-Version: 1.0
List-Id: Libc-alpha mailing list
X-Patchwork-Original-From: Adhemerval Zanella via Libc-alpha
From: Adhemerval Zanella Netto
Reply-To: Adhemerval Zanella

From: Richard Henderson

While alpha has the more important string functions in assembly,
there are still a few for which the generic routines are used.

Use the CMPBGE insn, via the builtin, for testing of zeros. Use a
simplified expansion of __builtin_ctz when the insn isn't available.

Checked on alpha-linux-gnu.

Co-authored-by: Adhemerval Zanella
---
 sysdeps/alpha/string-fza.h   |  55 +++++++++++++++++
 sysdeps/alpha/string-fzb.h   |  52 ++++++++++++++++
 sysdeps/alpha/string-fzi.h   | 113 +++++++++++++++++++++++++++++++++++
 sysdeps/generic/string-fzi.h |   2 +-
 4 files changed, 221 insertions(+), 1 deletion(-)
 create mode 100644 sysdeps/alpha/string-fza.h
 create mode 100644 sysdeps/alpha/string-fzb.h
 create mode 100644 sysdeps/alpha/string-fzi.h

diff --git a/sysdeps/alpha/string-fza.h b/sysdeps/alpha/string-fza.h
new file mode 100644
index 0000000000..e7706d62c0
--- /dev/null
+++ b/sysdeps/alpha/string-fza.h
@@ -0,0 +1,55 @@
+/* Basic zero byte detection. Alpha version.
+   Copyright (C) 2023 Free Software Foundation, Inc.
+   This file is part of the GNU C Library.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, see
+   . */
+
+#ifndef _STRING_FZA_H
+#define _STRING_FZA_H 1
+
+#include
+#include
+#include
+
+/* The CMPBGE creates a bit mask rather than a byte mask. */
+typedef int find_t;
+
+static __always_inline find_t
+find_zero_all (op_t x)
+{
+  return __builtin_alpha_cmpbge (0, x);
+}
+
+static __always_inline find_t
+find_eq_all (op_t x1, op_t x2)
+{
+  return find_zero_all (x1 ^ x2);
+}
+
+static __always_inline find_t
+find_zero_eq_all (op_t x1, op_t x2)
+{
+  return find_zero_all (x1) | find_zero_all (x1 ^ x2);
+}
+
+/* Return the mask WORD shifted based on S_INT address value, to ignore
+   values not present in the aligned word read. */
+static __always_inline find_t
+shift_find (find_t word, uintptr_t s)
+{
+  return word >> (s % sizeof (op_t));
+}
+
+#endif /* _STRING_FZA_H */
diff --git a/sysdeps/alpha/string-fzb.h b/sysdeps/alpha/string-fzb.h
new file mode 100644
index 0000000000..e3934ba413
--- /dev/null
+++ b/sysdeps/alpha/string-fzb.h
@@ -0,0 +1,52 @@
+/* Zero byte detection; boolean. Alpha version.
+   Copyright (C) 2023 Free Software Foundation, Inc.
+   This file is part of the GNU C Library.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, see
+   . */
+
+#ifndef _STRING_FZB_H
+#define _STRING_FZB_H 1
+
+#include
+#include
+
+/* Note that since CMPBGE creates a bit mask rather than a byte mask,
+   we cannot simply provide a target-specific string-fza.h. */
+
+/* Determine if any byte within X is zero. This is a pure boolean test. */
+
+static __always_inline _Bool
+has_zero (op_t x)
+{
+  return __builtin_alpha_cmpbge (0, x) != 0;
+}
+
+/* Likewise, but for byte equality between X1 and X2. */
+
+static __always_inline _Bool
+has_eq (op_t x1, op_t x2)
+{
+  return has_zero (x1 ^ x2);
+}
+
+/* Likewise, but for zeros in X1 and equal bytes between X1 and X2. */
+
+static __always_inline _Bool
+has_zero_eq (op_t x1, op_t x2)
+{
+  return has_zero (x1) | has_eq (x1, x2);
+}
+
+#endif /* _STRING_FZB_H */
diff --git a/sysdeps/alpha/string-fzi.h b/sysdeps/alpha/string-fzi.h
new file mode 100644
index 0000000000..3f8a32a472
--- /dev/null
+++ b/sysdeps/alpha/string-fzi.h
@@ -0,0 +1,113 @@
+/* string-fzi.h -- zero byte detection; indices. Alpha version.
+   Copyright (C) 2022 Free Software Foundation, Inc.
+   This file is part of the GNU C Library.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, see
+   . */
+
+#ifndef _STRING_FZI_H
+#define _STRING_FZI_H
+
+#include
+#include
+#include
+
+/* Note that since CMPBGE creates a bit mask rather than a byte mask,
+   we cannot simply provide a target-specific string-fza.h. */
+
+/* A subroutine for the index_zero functions. Given a bitmask C,
+   return the index of the first bit set in memory order. */
+static __always_inline unsigned int
+index_first (find_t c)
+{
+#ifdef __alpha_cix__
+  return __builtin_ctzl (c);
+#else
+  c = c & -c;
+  return (c & 0xf0 ? 4 : 0) + (c & 0xcc ? 2 : 0) + (c & 0xaa ? 1 : 0);
+#endif
+}
+
+/* Similarly, but return the (memory order) index of the last bit
+   that is non-zero. Note that only the least 8 bits may be nonzero. */
+
+static __always_inline unsigned int
+index_last_ (unsigned long int x)
+{
+#ifdef __alpha_cix__
+  return __builtin_clzl (x) ^ 63;
+#else
+  unsigned r = 0;
+  if (x & 0xf0)
+    r += 4;
+  if (x & (0xc << r))
+    r += 2;
+  if (x & (0x2 << r))
+    r += 1;
+  return r;
+#endif
+}
+
+/* Given a word X that is known to contain a zero byte, return the
+   index of the first such within the word in memory order. */
+
+static __always_inline unsigned int
+index_first_zero (op_t x)
+{
+  return index_first (__builtin_alpha_cmpbge (0, x));
+}
+
+/* Similarly, but perform the test for byte equality between X1 and X2. */
+
+static __always_inline unsigned int
+index_first_eq (op_t x1, op_t x2)
+{
+  return index_first_zero (x1 ^ x2);
+}
+
+/* Similarly, but perform the search for zero within X1 or
+   equality between X1 and X2. */
+
+static __always_inline unsigned int
+index_first_zero_eq (op_t x1, op_t x2)
+{
+  return index_first (__builtin_alpha_cmpbge (0, x1)
+                      | __builtin_alpha_cmpbge (0, x1 ^ x2));
+}
+
+/* Similarly, but perform the search for zero within X1 or
+   inequality between X1 and X2. */
+
+static __always_inline unsigned int
+index_first_zero_ne (op_t x1, op_t x2)
+{
+  return index_first (__builtin_alpha_cmpbge (0, x1)
+                      | (__builtin_alpha_cmpbge (0, x1 ^ x2) ^ 0xFF));
+}
+
+/* Similarly, but search for the last zero within X. */
+
+static __always_inline unsigned int
+index_last_zero (op_t x)
+{
+  return index_last_ (__builtin_alpha_cmpbge (0, x));
+}
+
+static __always_inline unsigned int
+index_last_eq (op_t x1, op_t x2)
+{
+  return index_last_zero (x1 ^ x2);
+}
+
+#endif /* _STRING_FZI_H */
diff --git a/sysdeps/generic/string-fzi.h b/sysdeps/generic/string-fzi.h
index b1fd4d34b3..d1d0e96802 100644
--- a/sysdeps/generic/string-fzi.h
+++ b/sysdeps/generic/string-fzi.h
@@ -45,7 +45,7 @@ ctz (op_t c)
    the (memory order) index of the first byte (in memory order) that is
    non-zero. */
 static __always_inline unsigned int
-index_first (op_t c)
+index_first (find_t c)
 {
   int r;
   if (__BYTE_ORDER == __LITTLE_ENDIAN)

From patchwork Wed Jan 11 20:45:56 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Adhemerval Zanella Netto
X-Patchwork-Id: 641275
To: libc-alpha@sourceware.org, Noah Goldstein, Richard Henderson
Subject: [PATCH v7 15/17] arm: Add string-fza.h
Date: Wed, 11 Jan 2023 17:45:56 -0300
Message-Id: <20230111204558.2402155-16-adhemerval.zanella@linaro.org>
X-Mailer: git-send-email 2.34.1
In-Reply-To: <20230111204558.2402155-1-adhemerval.zanella@linaro.org>
References: <20230111204558.2402155-1-adhemerval.zanella@linaro.org>
MIME-Version: 1.0
List-Id: Libc-alpha mailing list
X-Patchwork-Original-From: Adhemerval Zanella via Libc-alpha
From: Adhemerval Zanella Netto
Reply-To: Adhemerval Zanella

From: Richard Henderson

While arm has the more important string functions in assembly,
there are still a few generic routines used.

Use the UQSUB8 insn for testing of zeros.

Checked on armv7-linux-gnueabihf
---
 sysdeps/arm/armv6t2/string-fza.h | 70 ++++++++++++++++++++++++++++++++
 1 file changed, 70 insertions(+)
 create mode 100644 sysdeps/arm/armv6t2/string-fza.h

diff --git a/sysdeps/arm/armv6t2/string-fza.h b/sysdeps/arm/armv6t2/string-fza.h
new file mode 100644
index 0000000000..7aa0843325
--- /dev/null
+++ b/sysdeps/arm/armv6t2/string-fza.h
@@ -0,0 +1,70 @@
+/* Zero byte detection; basics. ARM version.
+   Copyright (C) 2023 Free Software Foundation, Inc.
+   This file is part of the GNU C Library.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, see
+   . */
+
+#ifndef _STRING_FZA_H
+#define _STRING_FZA_H 1
+
+#include
+#include
+
+/* This function returns at least one bit set within every byte
+   of X that is zero. */
+
+static __always_inline op_t
+find_zero_all (op_t x)
+{
+  /* Use unsigned saturated subtraction from 1 in each byte.
+     That leaves 1 for every byte that was zero. */
+  op_t ret, ones = repeat_bytes (0x01);
+  asm ("uqsub8 %0,%1,%2" : "=r"(ret) : "r"(ones), "r"(x));
+  return ret;
+}
+
+/* Identify bytes that are equal between X1 and X2. */
+
+static __always_inline op_t
+find_eq_all (op_t x1, op_t x2)
+{
+  return find_zero_all (x1 ^ x2);
+}
+
+/* Identify zero bytes in X1 or equality between X1 and X2. */
+
+static __always_inline op_t
+find_zero_eq_all (op_t x1, op_t x2)
+{
+  return find_zero_all (x1) | find_zero_all (x1 ^ x2);
+}
+
+/* Identify zero bytes in X1 or inequality between X1 and X2. */
+
+static __always_inline op_t
+find_zero_ne_all (op_t x1, op_t x2)
+{
+  /* Make use of the fact that we'll already have ONES in a register. */
+  op_t ones = repeat_bytes (0x01);
+  return find_zero_all (x1) | (find_zero_all (x1 ^ x2) ^ ones);
+}
+
+/* Define the "inexact" versions in terms of the exact versions. */
+#define find_zero_low find_zero_all
+#define find_eq_low find_eq_all
+#define find_zero_eq_low find_zero_eq_all
+#define find_zero_ne_low find_zero_ne_all
+
+#endif /* _STRING_FZA_H */

From patchwork Wed Jan 11 20:45:57 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Adhemerval Zanella Netto
X-Patchwork-Id: 641277
To: libc-alpha@sourceware.org, Noah Goldstein, Richard Henderson
Subject: [PATCH v7 16/17] powerpc: Add string-fza.h
Date: Wed, 11 Jan 2023 17:45:57 -0300
Message-Id: <20230111204558.2402155-17-adhemerval.zanella@linaro.org>
X-Mailer: git-send-email 2.34.1
In-Reply-To: <20230111204558.2402155-1-adhemerval.zanella@linaro.org>
References: <20230111204558.2402155-1-adhemerval.zanella@linaro.org>
MIME-Version: 1.0
List-Id: Libc-alpha mailing list
X-Patchwork-Original-From: Adhemerval Zanella via Libc-alpha
From: Adhemerval Zanella Netto
Reply-To: Adhemerval Zanella

From: Richard Henderson

While ppc has the more important string functions in assembly,
there are still a few generic routines used.

Use the Power 6 CMPB insn for testing of zeros.

Checked on powerpc64le-linux-gnu.
---
 sysdeps/powerpc/string-fza.h | 70 ++++++++++++++++++++++++++++++++++++
 1 file changed, 70 insertions(+)
 create mode 100644 sysdeps/powerpc/string-fza.h

diff --git a/sysdeps/powerpc/string-fza.h b/sysdeps/powerpc/string-fza.h
new file mode 100644
index 0000000000..5496c9db4b
--- /dev/null
+++ b/sysdeps/powerpc/string-fza.h
@@ -0,0 +1,70 @@
+/* Zero byte detection; basics. PowerPC version.
+ Copyright (C) 2023 Free Software Foundation, Inc. + This file is part of the GNU C Library. + + The GNU C Library is free software; you can redistribute it and/or + modify it under the terms of the GNU Lesser General Public + License as published by the Free Software Foundation; either + version 2.1 of the License, or (at your option) any later version. + + The GNU C Library is distributed in the hope that it will be useful, + but WITHOUT ANY WARRANTY; without even the implied warranty of + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + Lesser General Public License for more details. + + You should have received a copy of the GNU Lesser General Public + License along with the GNU C Library; if not, see + . */ + +#ifndef _POWERPC_STRING_FZA_H +#define _POWERPC_STRING_FZA_H 1 + +/* PowerISA 2.05 (POWER6) provides cmpb instruction. */ +#ifdef _ARCH_PWR6 +# include + +/* This function returns 0xff for each byte that is + equal between X1 and X2. */ + +static __always_inline op_t +find_eq_all (op_t x1, op_t x2) +{ + op_t ret; + asm ("cmpb %0,%1,%2" : "=r"(ret) : "r"(x1), "r"(x2)); + return ret; +} + +/* This function returns 0xff for each byte that is zero in X. */ + +static __always_inline op_t +find_zero_all (op_t x) +{ + return find_eq_all (x, 0); +} + +/* Identify zero bytes in X1 or equality between X1 and X2. */ + +static __always_inline op_t +find_zero_eq_all (op_t x1, op_t x2) +{ + return find_zero_all (x1) | find_eq_all (x1, x2); +} + +/* Identify zero bytes in X1 or inequality between X1 and X2. */ + +static __always_inline op_t +find_zero_ne_all (op_t x1, op_t x2) +{ + return find_zero_all (x1) | ~find_eq_all (x1, x2); +} + +/* Define the "inexact" versions in terms of the exact versions. 
*/ +# define find_zero_low find_zero_all +# define find_eq_low find_eq_all +# define find_zero_eq_low find_zero_eq_all +# define find_zero_ne_low find_zero_ne_all +#else +# include +#endif /* _ARCH_PWR6 */ + +#endif /* _POWERPC_STRING_FZA_H */ From patchwork Wed Jan 11 20:45:58 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Adhemerval Zanella Netto X-Patchwork-Id: 641270 Delivered-To: patch@linaro.org Received: by 2002:a17:522:f3c4:b0:4b4:3859:abed with SMTP id in4csp3458421pvb; Wed, 11 Jan 2023 12:48:03 -0800 (PST) X-Google-Smtp-Source: AMrXdXtrNQ2nR5PItScDJiiMEOrJV6i0AI/OoHcP5iD4OmT3HZsgkSkCE2fMTvC2xI/kKHk/3fCR X-Received: by 2002:aa7:d689:0:b0:461:7ae:c244 with SMTP id d9-20020aa7d689000000b0046107aec244mr64029583edr.35.1673470082889; Wed, 11 Jan 2023 12:48:02 -0800 (PST) ARC-Seal: i=1; a=rsa-sha256; t=1673470082; cv=none; d=google.com; s=arc-20160816; b=paS9Bo19I8EQcVKcLZgBjKl60o6AVicAxMWoeW1Lwhex7zTA70YZ4BjzWO8cqHApTJ H0vtjT655LyHIhUnmpeKdsawW1KpvBAygOmegggluZgmhrRn5WWY68xPVSRCIBH7hV9s wUGi2vvbXk6PpCc1t7dzw3kGW6YFAvRMopHnN8xewdEnKeTqXpuHE0yCWFEFRYEx7W8d ImRgnu9wWpG50I1I2HjrMsB99zuc/Xm7XK3STrg5nMAX3Wgx/g/8MlBQssptFH0ev3UN 9SxwBYr/naiAqvYex2pkzpJ6p76oSns2syD/A8DMroQF+o3hSqrZLsQA/sbCtbaRwFwL o84g== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=sender:errors-to:reply-to:from:list-subscribe:list-help:list-post :list-archive:list-unsubscribe:list-id:precedence :content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:dmarc-filter:delivered-to :dkim-signature:dkim-filter; bh=yGPovr38ZU02FWiH7KoXJ0+CqtlDx/oj85V6st1cjgM=; b=kkoraRQyt1zog/pcUsFvJBIgMEh7dgWbaSkNDpMh2aBmNX7Y9f1IhoNv9JexEc+MNr eggpVgLWmONyBVpNiYDv9YxS/rqJzzjWCAZtV7LyvXww/HlpihkjxGnpRxHG/LaYhNSJ tpVJyJYj7UOW62HOhGOHof8HaOcvWHoPVD/5Nsw2YLGypLlUmAMEnc8AZNg/YQGv3n03 0kAt3+syz5HVsDWPKjblxKlu4GEASoavrOG5/UoHp2+b5HIYJUDE9l3zUuzq31RayYM+ 
To: libc-alpha@sourceware.org, Noah Goldstein, Richard Henderson
Cc: Adhemerval Zanella Netto
Subject: [PATCH v7 17/17] sh: Add string-fzb.h
Date: Wed, 11 Jan 2023 17:45:58 -0300
Message-Id: <20230111204558.2402155-18-adhemerval.zanella@linaro.org>
X-Mailer: git-send-email 2.34.1
In-Reply-To: <20230111204558.2402155-1-adhemerval.zanella@linaro.org>
References: <20230111204558.2402155-1-adhemerval.zanella@linaro.org>
MIME-Version: 1.0
X-Patchwork-Original-From: Adhemerval Zanella via Libc-alpha
From: Adhemerval Zanella Netto
Reply-To: Adhemerval Zanella

From: Adhemerval Zanella Netto

Use the SH cmp/str on has_{zero,eq,zero_eq}.

Checked on sh4-linux-gnu.
---
 sysdeps/sh/string-fzb.h | 54 +++++++++++++++++++++++++++++++++++++++++
 1 file changed, 54 insertions(+)
 create mode 100644 sysdeps/sh/string-fzb.h

diff --git a/sysdeps/sh/string-fzb.h b/sysdeps/sh/string-fzb.h
new file mode 100644
index 0000000000..0ad19b58c9
--- /dev/null
+++ b/sysdeps/sh/string-fzb.h
@@ -0,0 +1,54 @@
+/* Zero byte detection; boolean. SH4 version.
+   Copyright (C) 2023 Free Software Foundation, Inc.
+   This file is part of the GNU C Library.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, see
+   . */
+
+#ifndef STRING_FZB_H
+#define STRING_FZB_H 1
+
+#include
+#include
+
+/* Determine if any byte within X is zero. This is a pure boolean test. */
+
+static __always_inline _Bool
+has_zero (op_t x)
+{
+  op_t zero = 0x0, ret;
+  asm volatile ("cmp/str %1,%2\n"
+                "movt %0\n"
+                : "=r" (ret)
+                : "r" (zero), "r" (x));
+  return ret;
+}
+
+/* Likewise, but for byte equality between X1 and X2. */
+
+static __always_inline _Bool
+has_eq (op_t x1, op_t x2)
+{
+  return has_zero (x1 ^ x2);
+}
+
+/* Likewise, but for zeros in X1 or equal bytes between X1 and X2. */
+
+static __always_inline _Bool
+has_zero_eq (op_t x1, op_t x2)
+{
+  return has_zero (x1) | has_eq (x1, x2);
+}
+
+#endif /* STRING_FZB_H */
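
For readers without the hardware at hand, the semantics that these two patches rely on can be modelled in portable C. This is only a sketch and not part of the series: the type `word_t` and every `model_*` name are hypothetical stand-ins (glibc uses `op_t`), and a 64-bit word is assumed for illustration even though the word on 32-bit SH would be narrower. The loop mirrors what cmpb computes per byte, and `model_has_zero` mirrors the T-bit result that cmp/str against zero followed by movt would yield.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for glibc's op_t; 64 bits assumed for illustration.  */
typedef uint64_t word_t;

/* Portable model of cmpb: 0xff in every byte position where A and B
   hold the same byte, 0x00 in all other positions.  */
static word_t
model_find_eq_all (word_t a, word_t b)
{
  word_t ret = 0;
  for (size_t i = 0; i < sizeof (word_t); i++)
    {
      unsigned int shift = 8 * (unsigned int) i;
      if (((a >> shift) & 0xff) == ((b >> shift) & 0xff))
        ret |= (word_t) 0xff << shift;
    }
  return ret;
}

/* 0xff for each byte of X that is zero.  */
static word_t
model_find_zero_all (word_t x)
{
  return model_find_eq_all (x, 0);
}

/* Zero bytes in A, or positions where A and B differ.  */
static word_t
model_find_zero_ne_all (word_t a, word_t b)
{
  return model_find_zero_all (a) | ~model_find_eq_all (a, b);
}

/* Boolean model of "cmp/str against zero, then movt":
   true if any byte of X is zero.  */
static _Bool
model_has_zero (word_t x)
{
  return model_find_zero_all (x) != 0;
}

/* True if A and B share an equal byte at the same position:
   equal bytes become zero bytes after the XOR.  */
static _Bool
model_has_eq (word_t a, word_t b)
{
  return model_has_zero (a ^ b);
}
```

A model of this kind could serve as a reference when checking an architecture-specific override against the generic behaviour; it makes no claim about how the glibc tests themselves are written.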