From patchwork Fri Mar 15 17:23:44 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Adhemerval Zanella Netto X-Patchwork-Id: 780526 Delivered-To: patch@linaro.org Received: by 2002:a5d:46c1:0:b0:33e:7753:30bd with SMTP id g1csp319175wrs; Fri, 15 Mar 2024 10:24:03 -0700 (PDT) X-Forwarded-Encrypted: i=3; AJvYcCXrlvMj7IxprwlBM4kpld7GZZ5cXv5gtFqBvql5ZF1ajgSZBY0fpjrNSyvsm7eZj46csv7H0A6z/jl6qXV3dvVs X-Google-Smtp-Source: AGHT+IE3uICAvtY1Zv/FuAHTIMl4sGOnTcuEXkq3KCLVblLb4s39cy3ZIaBmblp5NRsfJMJWHgS3 X-Received: by 2002:a05:6870:238c:b0:221:9442:4d58 with SMTP id e12-20020a056870238c00b0022194424d58mr5958971oap.28.1710523443242; Fri, 15 Mar 2024 10:24:03 -0700 (PDT) ARC-Seal: i=2; a=rsa-sha256; t=1710523443; cv=pass; d=google.com; s=arc-20160816; b=VRtXBhD2OIceFWTcOWc5W1Q0XcztKIMR52iS8UE5XVCfUI+bS+YYpQsMeBMvmb2wdN E+/s0q6sMJAIbeqqkYpA4zaMlR84d++AVRKoTM34BlY/3Eul7RxKBWEFwnsB8KfnvzjA WIOH1urWZCyWCSiJQR+9nYPOkv+hQkXqoeti61HXWuZQnYqS7OAYqRAqSvYOS2DeMPF5 C5fFs5Mgda3FBwc6gEjvE0dALOpaWceWHYS/LCAnVp+pp2TvX6fEUh4wlvXwDMVSpB9U jEz8DLW6ZbGczanB0y6IpnZhtEhjYZv1VURSX0tZejdjU6oDKxcKI2ELFI7R4tTAuFgU QKiQ== ARC-Message-Signature: i=2; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=errors-to:list-subscribe:list-help:list-post:list-archive :list-unsubscribe:list-id:precedence:content-transfer-encoding :mime-version:message-id:date:subject:cc:to:from:dkim-signature :arc-filter:dmarc-filter:delivered-to; bh=cNzuAmySalVRknsjNs+1rDsmNUyf/PskQEf5zyLK47Q=; fh=eu1Zp32uVuMWPuN8grPjilAvDlGbvX7rAW6asBgdRfs=; b=C+Zlk3gE2mk1IulrilNClCUhVOLEjPxUGcnsdHsVb/SyQUwIeckzG0Htg/dY78AJAk 8fEfkDk20L2dkTqFtcQ40inf+NFLhOBDrUPtNn0/XuXP1o8M1dLF56h+rfRaXBHzISMq C4VH4OxUXkky268acXqYVjmqXRAYL6erJDUB0fDzCJALhfp30Aq48UvWP6U4FB77/o7E E8Klh3HYKiyc78WsGYkylbvfS2X1YTbHHO2IdzPfsq0hHruiZZdih3VvPAmbVt/0ZX4p XIWKFPj5G1PMj/8Oc3LsWlKicMIc4YkJGyjlwIn342Qgu9KyEVMbMrb7QTeyUgzzNaKb gilg==; dara=google.com ARC-Authentication-Results: i=2; mx.google.com; dkim=pass header.i=@linaro.org header.s=google header.b=Hh8whlQ9; arc=pass (i=1); spf=pass (google.com: domain of libc-alpha-bounces+patch=linaro.org@sourceware.org designates 2620:52:3:1:0:246e:9693:128c as permitted sender) smtp.mailfrom="libc-alpha-bounces+patch=linaro.org@sourceware.org"; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=linaro.org Return-Path: Received: from server2.sourceware.org (server2.sourceware.org. [2620:52:3:1:0:246e:9693:128c]) by mx.google.com with ESMTPS id jk5-20020ad45d45000000b00690dc9e6e5fsi3582249qvb.514.2024.03.15.10.24.02 for (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Fri, 15 Mar 2024 10:24:03 -0700 (PDT) Received-SPF: pass (google.com: domain of libc-alpha-bounces+patch=linaro.org@sourceware.org designates 2620:52:3:1:0:246e:9693:128c as permitted sender) client-ip=2620:52:3:1:0:246e:9693:128c; Authentication-Results: mx.google.com; dkim=pass header.i=@linaro.org header.s=google header.b=Hh8whlQ9; arc=pass (i=1); spf=pass (google.com: domain of libc-alpha-bounces+patch=linaro.org@sourceware.org designates 2620:52:3:1:0:246e:9693:128c as permitted sender) smtp.mailfrom="libc-alpha-bounces+patch=linaro.org@sourceware.org"; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=linaro.org Received: from server2.sourceware.org (localhost [IPv6:::1]) by sourceware.org (Postfix) with ESMTP id B05263858416 for ; Fri, 15 Mar 2024 17:24:02 +0000 (GMT) X-Original-To: libc-alpha@sourceware.org Delivered-To: libc-alpha@sourceware.org Received: from mail-pl1-x636.google.com (mail-pl1-x636.google.com [IPv6:2607:f8b0:4864:20::636]) by sourceware.org (Postfix) with ESMTPS id 6DE353858C41 for ; Fri, 15 Mar 2024 17:23:52 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.2 sourceware.org 6DE353858C41 Authentication-Results: sourceware.org; dmarc=pass (p=none dis=none) header.from=linaro.org Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=linaro.org ARC-Filter: OpenARC Filter v1.0.0 sourceware.org 6DE353858C41 Authentication-Results: server2.sourceware.org; arc=none smtp.remote-ip=2607:f8b0:4864:20::636 ARC-Seal: i=1; a=rsa-sha256; d=sourceware.org; s=key; t=1710523435; cv=none; b=WrH4NDvPrPUKzMLyuF+PVxEs3mbgxnUn6ds/UXyNIkQ0C2oH1E7Uym7XFSloRG7TlwuxwCGLpXgpX0gKO6EFHnCZrmICA959bPX16nuEuFExQ97qsorzafhn2L2eWGmAR4ThQXXzXygrMz6vG9X7qOBeybbGv0qj0p5Edw8QP5c= ARC-Message-Signature: i=1; a=rsa-sha256; d=sourceware.org; s=key; t=1710523435; c=relaxed/simple; bh=oJ+OHBWnQyos+tJ1A88fVQ1852DkqRG3maRgDUpric8=; h=DKIM-Signature:From:To:Subject:Date:Message-Id:MIME-Version; b=ww9/xCvqKAhnC11UMnQ/mQ23XLSsP6nxRm/lFUeQ9P1lk4gKEaQ/iyUiz8A0wOdrohOV0oMUu52qvIX+DhBESnD92VsBLEVj7WJXKQqIsLdKREWfuua68ppYFaRZ+Z9KuCVuVLwASsX4o2kTqowOJwZADqfJXy/x8SVj7+YXlYs= ARC-Authentication-Results: i=1; server2.sourceware.org Received: by mail-pl1-x636.google.com with SMTP id d9443c01a7336-1dddad37712so21941055ad.3 for ; Fri, 15 Mar 2024 10:23:52 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; t=1710523431; x=1711128231; darn=sourceware.org; h=content-transfer-encoding:mime-version:message-id:date:subject:cc :to:from:from:to:cc:subject:date:message-id:reply-to; bh=cNzuAmySalVRknsjNs+1rDsmNUyf/PskQEf5zyLK47Q=; b=Hh8whlQ9DSuAlSdtiJVmFTqXf6n1OxEhDA7+512RXeLJbHxT2++1v0lW/izEZf8CFe obbPRuRS1EjmoxdrrQZJKHIjnZuVtU13nenDNjZW19nAdI8Vlc0RGJDuSbIB83cdsLUQ wCTZ96vv0z9jRgtBDOq+7hA8GlM/Wj39UL5AiZY7j5dhZgb75XKLfkvHSS3D0EuY1k/9 JAINx8th7e20lLmH0XvGeNeuorzZcEFNGb34u2hdXERlwjPywo9BKPfHpUSnz1Ntbrmi b8TKZ6ePuqBxZwtDy62GXz4PUqgGWb20+uFdO+GLvNJHjrj0shyYbvM8DPW3fZciaDoE flXg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1710523431; x=1711128231; h=content-transfer-encoding:mime-version:message-id:date:subject:cc :to:from:x-gm-message-state:from:to:cc:subject:date:message-id :reply-to; bh=cNzuAmySalVRknsjNs+1rDsmNUyf/PskQEf5zyLK47Q=; b=vCmXr5UVklJgWQsfilqnPoKkVXeMOiZyrs8F3UUcdFTaeoVTaASP4x159ghXtXugPj fN/ANnoIaT5qgY1qIql1mauFzxlzyBWgpOSv9INIZpkmzevFJXQxq5gqazG6rN7e8nC4 31mbMn71E4pCcnqbiS2Zy7JjsbehcUJmQLSa3uRaPm38I8w5b1af/S756HXzfpjVYWWU OiZd+NOIo48oHl7BWRFpeMcf5yTTjPVbOF+s6KRvyu7V/KBR8LNNZylJsmjo9o6S4dsp Q2/eNKfUz6qD8WwY6QCwSQ2sbxXJ+w8VjBlWCczpQIxM0exmCi1yZPWQmpdiPgJcKVVR 8h3g== X-Gm-Message-State: AOJu0YxkNQGPRUk3W46lLGbTOAV9FQ25Lq12kSr+uS/us3giteYh23/U tACnseJrXdclhQUD4BoNG9ngMy/OiTOH+HkZthtbeVFI2mZSnlaeQr1Wwk9aFjp8TiCRNCNEHBB 5 X-Received: by 2002:a17:902:f544:b0:1de:fbc2:99f0 with SMTP id h4-20020a170902f54400b001defbc299f0mr1569980plf.2.1710523430734; Fri, 15 Mar 2024 10:23:50 -0700 (PDT) Received: from mandiga.. ([2804:1b3:a7c2:8dfd:df12:64e1:1306:3ebd]) by smtp.gmail.com with ESMTPSA id d12-20020a170902cecc00b001d91d515dffsm4089518plg.156.2024.03.15.10.23.48 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Fri, 15 Mar 2024 10:23:49 -0700 (PDT) From: Adhemerval Zanella To: libc-alpha@sourceware.org Cc: goldstein.w.n@gmail.com, amonakov@ispras.ru, "H . J . Lu" Subject: [PATCH v3 0/2] Improve wcsstr Date: Fri, 15 Mar 2024 14:23:44 -0300 Message-Id: <20240315172346.2484542-1-adhemerval.zanella@linaro.org> X-Mailer: git-send-email 2.34.1 MIME-Version: 1.0 X-Spam-Status: No, score=-6.1 required=5.0 tests=BAYES_00, DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, DKIM_VALID_EF, RCVD_IN_DNSWL_NONE, SPF_HELO_NONE, SPF_PASS, TXREP, T_SCC_BODY_TEXT_LINE autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org X-BeenThere: libc-alpha@sourceware.org X-Mailman-Version: 2.1.30 Precedence: list List-Id: Libc-alpha mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: libc-alpha-bounces+patch=linaro.org@sourceware.org Different than strstr, wcsstr still uses an O(m*n) algorithm that might be considered a security issue (although BZ 23865 was marked security- since there is no actual application impact). The gnulib recently added a wrapper to fix it [1] and it is used as the base de str-two-way.h implementation. This patch adds a similar implementation, and different than strstr, neither the "shift table" optimization nor the self-adapting filtering check is used because it would result in a too-large shift table (and it also simplifies the implementation bit). The patchset also added a proper tests for wcsstr, based on strstr one. With this fix, and with the removal of the powerpc strcasestr optimization [2], it seems that only x86_64 still provides a non O(m*n) implementation [3]. Noah already gave a +1, so it would be good to have some confirmation that this implementation can really show some quadradic behaviour before propose a removal. [1] https://git.savannah.gnu.org/gitweb/?p=gnulib.git;a=commit;h=9411c5e467cf60f6295b9fed306029f341a0f24f [2] https://sourceware.org/git/?p=glibc.git;a=commit;h=4a76fb1da8b7e7fa472741921f49ef32f81bc0a0 [3] https://sourceware.org/git/?p=glibc.git;a=blob;f=sysdeps/x86_64/multiarch/strstr-avx512.c;h=3ac53accbdde0b400dfd19a2070fbb579aff4177;hb=4a76fb1da8b7e7fa472741921f49ef32f81bc0a0 Changes from v2: * Remove the test repetition. Changes from v1: * Add more tests from gnulib. * Removed unused macros from wcsstr. Adhemerval Zanella (2): wcsmbs: Add test-wcsstr wcsmbs: Ensure wcstr worst-case linear execution time (BZ 23865) string/test-strstr.c | 314 +++++++++++++++++++++++++++++++++++-------- wcsmbs/Makefile | 1 + wcsmbs/test-wcsstr.c | 20 +++ wcsmbs/wcs-two-way.h | 312 ++++++++++++++++++++++++++++++++++++++++++ wcsmbs/wcsstr.c | 103 +++++--------- 5 files changed, 624 insertions(+), 126 deletions(-) create mode 100644 wcsmbs/test-wcsstr.c create mode 100644 wcsmbs/wcs-two-way.h