From patchwork Tue Oct 31 20:09:22 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Adhemerval Zanella Netto
X-Patchwork-Id: 739596
From: Adhemerval Zanella
To: libc-alpha@sourceware.org, Noah Goldstein, "H . J . Lu", Bruce Merry
Subject: [PATCH 1/4] elf: Add a way to check if tunable is set (BZ 27069)
Date: Tue, 31 Oct 2023 17:09:22 -0300
Message-Id: <20231031200925.3297456-2-adhemerval.zanella@linaro.org>
In-Reply-To: <20231031200925.3297456-1-adhemerval.zanella@linaro.org>
References: <20231031200925.3297456-1-adhemerval.zanella@linaro.org>

The tunable already keeps a field that records whether it is
initialized. To query the default value, it is easier to add a new
constant field.
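
As a rough illustration of the idea (this is not the glibc code), the
sketch below models a tunable that carries a constant default next to
its current value and an initialized flag. The struct and the demo_*
helpers are hypothetical names made up for this example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified model of a tunable; not the real tunable_t.  */
struct demo_tunable
{
  uint64_t def;      /* Default value; never changed after setup.  */
  uint64_t val;      /* Current value.  */
  bool initialized;  /* True once the environment supplied a value.  */
};

/* Create a tunable whose current value starts at its default.  */
static struct demo_tunable
demo_make (uint64_t def)
{
  struct demo_tunable t = { def, def, false };
  return t;
}

/* Record a value coming from the environment.  */
static void
demo_set_from_env (struct demo_tunable *t, uint64_t v)
{
  assert (t != NULL);
  t->val = v;
  t->initialized = true;
}

/* Report whether the environment set the tunable.  */
static bool
demo_is_initialized (const struct demo_tunable *t)
{
  return t->initialized;
}

/* Return the default, regardless of the current value.  */
static uint64_t
demo_get_default (const struct demo_tunable *t)
{
  return t->def;
}
```

A tunable created with demo_make (2048) reports false from
demo_is_initialized until demo_set_from_env is called, and
demo_get_default keeps returning 2048 afterwards.
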
The patch adds two new macros, TUNABLE_GET_DEFAULT and
TUNABLE_IS_INITIALIZED, where the former gets the default value with a
signature similar to TUNABLE_GET while the latter returns whether the
tunable was set by the environment.

Checked on x86_64-linux-gnu.
---
 elf/dl-tunable-types.h   |  1 +
 elf/dl-tunables.c        | 40 ++++++++++++++++++++++++++++++++++++++++
 elf/dl-tunables.h        | 28 ++++++++++++++++++++++++++++
 elf/dl-tunables.list     |  1 +
 scripts/gen-tunables.awk |  4 ++--
 5 files changed, 72 insertions(+), 2 deletions(-)

diff --git a/elf/dl-tunable-types.h b/elf/dl-tunable-types.h
index c88332657e..c41a3b3bdb 100644
--- a/elf/dl-tunable-types.h
+++ b/elf/dl-tunable-types.h
@@ -61,6 +61,7 @@ struct _tunable
 {
   const char name[TUNABLE_NAME_MAX];    /* Internal name of the tunable. */
   tunable_type_t type;                  /* Data type of the tunable. */
+  const tunable_val_t def;              /* The default value. */
   tunable_val_t val;                    /* The value. */
   bool initialized;                     /* Flag to indicate that the tunable is
                                            initialized. */
diff --git a/elf/dl-tunables.c b/elf/dl-tunables.c
index cae67efa0a..79b4d542a3 100644
--- a/elf/dl-tunables.c
+++ b/elf/dl-tunables.c
@@ -145,6 +145,13 @@ tunable_initialize (tunable_t *cur, const char *strval)
   do_tunable_update_val (cur, &val, NULL, NULL);
 }
 
+bool
+__tunable_is_initialized (tunable_id_t id)
+{
+  return tunable_list[id].initialized;
+}
+rtld_hidden_def (__tunable_is_initialized)
+
 void
 __tunable_set_val (tunable_id_t id, tunable_val_t *valp, tunable_num_t *minp,
                    tunable_num_t *maxp)
@@ -388,6 +395,39 @@ __tunables_print (void)
     }
 }
 
+void
+__tunable_get_default (tunable_id_t id, void *valp)
+{
+  tunable_t *cur = &tunable_list[id];
+
+  switch (cur->type.type_code)
+    {
+    case TUNABLE_TYPE_UINT_64:
+      {
+        *((uint64_t *) valp) = (uint64_t) cur->def.numval;
+        break;
+      }
+    case TUNABLE_TYPE_INT_32:
+      {
+        *((int32_t *) valp) = (int32_t) cur->def.numval;
+        break;
+      }
+    case TUNABLE_TYPE_SIZE_T:
+      {
+        *((size_t *) valp) = (size_t) cur->def.numval;
+        break;
+      }
+    case TUNABLE_TYPE_STRING:
+      {
+        *((const char **)valp) = cur->def.strval;
+        break;
+      }
+    default:
+      __builtin_unreachable ();
+    }
+}
+rtld_hidden_def (__tunable_get_default)
+
 /* Set the tunable value. This is called by the module that the tunable exists
    in. */
 void
diff --git a/elf/dl-tunables.h b/elf/dl-tunables.h
index 45c191e021..0df4dde24e 100644
--- a/elf/dl-tunables.h
+++ b/elf/dl-tunables.h
@@ -45,18 +45,26 @@ typedef void (*tunable_callback_t) (tunable_val_t *);
 
 extern void __tunables_init (char **);
 extern void __tunables_print (void);
+extern bool __tunable_is_initialized (tunable_id_t);
 extern void __tunable_get_val (tunable_id_t, void *, tunable_callback_t);
 extern void __tunable_set_val (tunable_id_t, tunable_val_t *, tunable_num_t *,
                                tunable_num_t *);
+extern void __tunable_get_default (tunable_id_t id, void *valp);
 rtld_hidden_proto (__tunables_init)
 rtld_hidden_proto (__tunables_print)
+rtld_hidden_proto (__tunable_is_initialized)
 rtld_hidden_proto (__tunable_get_val)
 rtld_hidden_proto (__tunable_set_val)
+rtld_hidden_proto (__tunable_get_default)
 
 /* Define TUNABLE_GET and TUNABLE_SET in short form if TOP_NAMESPACE and
    TUNABLE_NAMESPACE are defined. This is useful shorthand to get and set
    tunables within a module. */
 #if defined TOP_NAMESPACE && defined TUNABLE_NAMESPACE
+# define TUNABLE_IS_INITIALIZED(__id) \
+  TUNABLE_IS_INITIALIZED_FULL(TOP_NAMESPACE, TUNABLE_NAMESPACE, __id)
+# define TUNABLE_GET_DEFAULT(__id, __type) \
+  TUNABLE_GET_DEFAULT_FULL(TOP_NAMESPACE, TUNABLE_NAMESPACE,__id, __type)
 # define TUNABLE_GET(__id, __type, __cb) \
   TUNABLE_GET_FULL (TOP_NAMESPACE, TUNABLE_NAMESPACE, __id, __type, __cb)
 # define TUNABLE_SET(__id, __val) \
@@ -65,6 +73,10 @@ rtld_hidden_proto (__tunable_set_val)
   TUNABLE_SET_WITH_BOUNDS_FULL (TOP_NAMESPACE, TUNABLE_NAMESPACE, __id, \
                                 __val, __min, __max)
 #else
+# define TUNABLE_IS_INITIALIZED(__top, __ns, __id) \
+  TUNABLE_IS_INITIALIZED_FULL(__top, __ns, __id)
+# define TUNABLE_GET_DEFAULT(__top, __ns, __id, __type) \
+  TUNABLE_GET_DEFAULT_FULL(__top, __ns, __id, __type)
 # define TUNABLE_GET(__top, __ns, __id, __type, __cb) \
   TUNABLE_GET_FULL (__top, __ns, __id, __type, __cb)
 # define TUNABLE_SET(__top, __ns, __id, __val) \
@@ -73,6 +85,22 @@ rtld_hidden_proto (__tunable_set_val)
   TUNABLE_SET_WITH_BOUNDS_FULL (__top, __ns, __id, __val, __min, __max)
 #endif
 
+/* Return whether the tunable was initialized by the environment variable. */
+#define TUNABLE_IS_INITIALIZED_FULL(__top, __ns, __id) \
+({ \
+  tunable_id_t id = TUNABLE_ENUM_NAME (__top, __ns, __id); \
+  __tunable_is_initialized (id); \
+})
+
+/* Return the default value of the tunable. */
+#define TUNABLE_GET_DEFAULT_FULL(__top, __ns, __id, __type) \
+({ \
+  tunable_id_t id = TUNABLE_ENUM_NAME (__top, __ns, __id); \
+  __type __ret; \
+  __tunable_get_default (id, &__ret); \
+  __ret; \
+})
+
 /* Get and return a tunable value. If the tunable was set externally and __CB
    is defined then call __CB before returning the value. */
 #define TUNABLE_GET_FULL(__top, __ns, __id, __type, __cb) \
diff --git a/elf/dl-tunables.list b/elf/dl-tunables.list
index 695ba7192e..5bb858b1d8 100644
--- a/elf/dl-tunables.list
+++ b/elf/dl-tunables.list
@@ -20,6 +20,7 @@
 # type: Defaults to STRING
 # minval: Optional minimum acceptable value
 # maxval: Optional maximum acceptable value
+# default: Optional default value (if not specified it will be 0 or "")
 # env_alias: An alias environment variable
 # security_level: Specify security level of the tunable for AT_SECURE binaries.
 # Valid values are:
diff --git a/scripts/gen-tunables.awk b/scripts/gen-tunables.awk
index d6de100df0..9726b05217 100644
--- a/scripts/gen-tunables.awk
+++ b/scripts/gen-tunables.awk
@@ -177,8 +177,8 @@ END {
     n = indices[2];
     m = indices[3];
     printf (" {TUNABLE_NAME_S(%s, %s, %s)", t, n, m)
-    printf (", {TUNABLE_TYPE_%s, %s, %s}, {%s}, false, TUNABLE_SECLEVEL_%s, %s},\n",
-            types[t,n,m], minvals[t,n,m], maxvals[t,n,m],
+    printf (", {TUNABLE_TYPE_%s, %s, %s}, {%s}, {%s}, false, TUNABLE_SECLEVEL_%s, %s},\n",
+            types[t,n,m], minvals[t,n,m], maxvals[t,n,m], default_val[t,n,m],
             default_val[t,n,m], security_level[t,n,m], env_alias[t,n,m]);
   }
   print "};"

From patchwork Tue Oct 31 20:09:23 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Adhemerval Zanella Netto
X-Patchwork-Id: 739597
From: Adhemerval Zanella
To: libc-alpha@sourceware.org, Noah Goldstein, "H . J . Lu", Bruce Merry
Subject: [PATCH 2/4] x86: Fix Zen3/Zen4 ERMS selection (BZ 30994)
Date: Tue, 31 Oct 2023 17:09:23 -0300
Message-Id: <20231031200925.3297456-3-adhemerval.zanella@linaro.org>
In-Reply-To: <20231031200925.3297456-1-adhemerval.zanella@linaro.org>
References: <20231031200925.3297456-1-adhemerval.zanella@linaro.org>

The REP MOVSB usage on memcpy/memmove does not show any performance
gain on Zen3/Zen4 cores compared to the vectorized loops. Also, as
reported in BZ 30994, if the source is aligned and the destination is
not, the performance can be as much as 20x slower.

The performance difference is really noticeable with small buffer
sizes, closer to the lower bound where memcpy/memmove starts to use
ERMS. The performance of REP MOVSB is similar to the vectorized
instructions at the size limit (the L2 cache). Also, there is no
drawback from multiple cores sharing the cache.
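
The size window described above can be sketched as follows. This is a
minimal illustration, assuming 'rep movsb' is picked only for copy
sizes at or above a lower bound and below an upper bound;
demo_use_rep_movsb and its parameters are hypothetical names, and the
exact comparisons in glibc's assembly may differ.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper, not glibc code: return true when a copy of SIZE
   bytes falls in the half-open window [LOWER, UPPER) in which the ERMS
   path ('rep movsb') would be taken instead of the vectorized loop.  */
static bool
demo_use_rep_movsb (size_t size, size_t lower, size_t upper)
{
  return size >= lower && size < upper;
}
```

With an upper bound of 0 the window is empty, so the ERMS path is never
selected, whatever the lower bound is.
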
A new tunable, glibc.cpu.x86_rep_movsb_stop_threshold, allows to setup the higher bound size to use 'rep movsb'. Checked on x86_64-linux-gnu on Zen3. --- manual/tunables.texi | 9 ++++++ sysdeps/x86/dl-cacheinfo.h | 58 +++++++++++++++++++++++------------- sysdeps/x86/dl-tunables.list | 10 +++++++ 3 files changed, 56 insertions(+), 21 deletions(-) diff --git a/manual/tunables.texi b/manual/tunables.texi index 776fd93fd9..5d3263bc2e 100644 --- a/manual/tunables.texi +++ b/manual/tunables.texi @@ -570,6 +570,15 @@ greater than zero, and currently defaults to 2048 bytes. This tunable is specific to i386 and x86-64. @end deftp +@deftp Tunable glibc.cpu.x86_rep_movsb_stop_threshold +The @code{glibc.cpu.x86_rep_movsb_threshold} tunable allows the user to +set threshold in bytes to stop using "rep movsb". The value must be +greater than zero, and currently defaults depends of the CPU and the +cache size. + +This tunable is specific to i386 and x86-64. +@end deftp + @deftp Tunable glibc.cpu.x86_rep_stosb_threshold The @code{glibc.cpu.x86_rep_stosb_threshold} tunable allows the user to set threshold in bytes to start using "rep stosb". 
The value must be diff --git a/sysdeps/x86/dl-cacheinfo.h b/sysdeps/x86/dl-cacheinfo.h index 87486054f9..51e5ba200f 100644 --- a/sysdeps/x86/dl-cacheinfo.h +++ b/sysdeps/x86/dl-cacheinfo.h @@ -784,6 +784,14 @@ get_common_cache_info (long int *shared_ptr, long int * shared_per_thread_ptr, u *threads_ptr = threads; } +static inline bool +is_rep_movsb_stop_threshold_valid (unsigned long int v) +{ + unsigned long int rep_movsb_threshold + = TUNABLE_GET (x86_rep_movsb_threshold, long int, NULL); + return v > rep_movsb_threshold; +} + static void dl_init_cacheinfo (struct cpu_features *cpu_features) { @@ -791,7 +799,6 @@ dl_init_cacheinfo (struct cpu_features *cpu_features) long int data = -1; long int shared = -1; long int shared_per_thread = -1; - long int core = -1; unsigned int threads = 0; unsigned long int level1_icache_size = -1; unsigned long int level1_icache_linesize = -1; @@ -809,7 +816,6 @@ dl_init_cacheinfo (struct cpu_features *cpu_features) if (cpu_features->basic.kind == arch_kind_intel) { data = handle_intel (_SC_LEVEL1_DCACHE_SIZE, cpu_features); - core = handle_intel (_SC_LEVEL2_CACHE_SIZE, cpu_features); shared = handle_intel (_SC_LEVEL3_CACHE_SIZE, cpu_features); shared_per_thread = shared; @@ -822,7 +828,8 @@ dl_init_cacheinfo (struct cpu_features *cpu_features) = handle_intel (_SC_LEVEL1_DCACHE_ASSOC, cpu_features); level1_dcache_linesize = handle_intel (_SC_LEVEL1_DCACHE_LINESIZE, cpu_features); - level2_cache_size = core; + level2_cache_size + = handle_intel (_SC_LEVEL2_CACHE_SIZE, cpu_features); level2_cache_assoc = handle_intel (_SC_LEVEL2_CACHE_ASSOC, cpu_features); level2_cache_linesize @@ -835,12 +842,12 @@ dl_init_cacheinfo (struct cpu_features *cpu_features) level4_cache_size = handle_intel (_SC_LEVEL4_CACHE_SIZE, cpu_features); - get_common_cache_info (&shared, &shared_per_thread, &threads, core); + get_common_cache_info (&shared, &shared_per_thread, &threads, + level2_cache_size); } else if (cpu_features->basic.kind == arch_kind_zhaoxin) 
{ data = handle_zhaoxin (_SC_LEVEL1_DCACHE_SIZE); - core = handle_zhaoxin (_SC_LEVEL2_CACHE_SIZE); shared = handle_zhaoxin (_SC_LEVEL3_CACHE_SIZE); shared_per_thread = shared; @@ -849,19 +856,19 @@ dl_init_cacheinfo (struct cpu_features *cpu_features) level1_dcache_size = data; level1_dcache_assoc = handle_zhaoxin (_SC_LEVEL1_DCACHE_ASSOC); level1_dcache_linesize = handle_zhaoxin (_SC_LEVEL1_DCACHE_LINESIZE); - level2_cache_size = core; + level2_cache_size = handle_zhaoxin (_SC_LEVEL2_CACHE_SIZE); level2_cache_assoc = handle_zhaoxin (_SC_LEVEL2_CACHE_ASSOC); level2_cache_linesize = handle_zhaoxin (_SC_LEVEL2_CACHE_LINESIZE); level3_cache_size = shared; level3_cache_assoc = handle_zhaoxin (_SC_LEVEL3_CACHE_ASSOC); level3_cache_linesize = handle_zhaoxin (_SC_LEVEL3_CACHE_LINESIZE); - get_common_cache_info (&shared, &shared_per_thread, &threads, core); + get_common_cache_info (&shared, &shared_per_thread, &threads, + level2_cache_size); } else if (cpu_features->basic.kind == arch_kind_amd) { data = handle_amd (_SC_LEVEL1_DCACHE_SIZE); - core = handle_amd (_SC_LEVEL2_CACHE_SIZE); shared = handle_amd (_SC_LEVEL3_CACHE_SIZE); level1_icache_size = handle_amd (_SC_LEVEL1_ICACHE_SIZE); @@ -869,7 +876,7 @@ dl_init_cacheinfo (struct cpu_features *cpu_features) level1_dcache_size = data; level1_dcache_assoc = handle_amd (_SC_LEVEL1_DCACHE_ASSOC); level1_dcache_linesize = handle_amd (_SC_LEVEL1_DCACHE_LINESIZE); - level2_cache_size = core; + level2_cache_size = handle_amd (_SC_LEVEL2_CACHE_SIZE);; level2_cache_assoc = handle_amd (_SC_LEVEL2_CACHE_ASSOC); level2_cache_linesize = handle_amd (_SC_LEVEL2_CACHE_LINESIZE); level3_cache_size = shared; @@ -880,12 +887,12 @@ dl_init_cacheinfo (struct cpu_features *cpu_features) if (shared <= 0) { /* No shared L3 cache. All we have is the L2 cache. */ - shared = core; + shared = level2_cache_size; } else if (cpu_features->basic.family < 0x17) { /* Account for exclusive L2 and L3 caches. 
*/ - shared += core; + shared += level2_cache_size; } shared_per_thread = shared; @@ -1028,16 +1035,25 @@ dl_init_cacheinfo (struct cpu_features *cpu_features) SIZE_MAX); unsigned long int rep_movsb_stop_threshold; - /* ERMS feature is implemented from AMD Zen3 architecture and it is - performing poorly for data above L2 cache size. Henceforth, adding - an upper bound threshold parameter to limit the usage of Enhanced - REP MOVSB operations and setting its value to L2 cache size. */ - if (cpu_features->basic.kind == arch_kind_amd) - rep_movsb_stop_threshold = core; - /* Setting the upper bound of ERMS to the computed value of - non-temporal threshold for architectures other than AMD. */ - else - rep_movsb_stop_threshold = non_temporal_threshold; + /* If the tunable is not set or if the value is not larger than + x86_rep_stosb_threshold, use the default values. */ + rep_movsb_stop_threshold = TUNABLE_GET (x86_rep_movsb_stop_threshold, + long int, NULL); + if (!TUNABLE_IS_INITIALIZED (x86_rep_movsb_stop_threshold) + || !is_rep_movsb_stop_threshold_valid (rep_movsb_stop_threshold)) + { + /* For AMD cpus that support ERMS (Zen3+), REP MOVSB is in a lot case + slower than the vectorized path (and for some alignments it is really + slow, check BZ #30994). */ + if (cpu_features->basic.kind == arch_kind_amd) + rep_movsb_stop_threshold = 0; + else + /* Setting the upper bound of ERMS to the computed value of + non-temporal threshold for architectures other than AMD. */ + rep_movsb_stop_threshold = non_temporal_threshold; + } + TUNABLE_SET_WITH_BOUNDS (x86_rep_stosb_threshold, rep_stosb_threshold, 1, + SIZE_MAX); cpu_features->data_cache_size = data; cpu_features->shared_cache_size = shared; diff --git a/sysdeps/x86/dl-tunables.list b/sysdeps/x86/dl-tunables.list index feb7004036..5e9831b610 100644 --- a/sysdeps/x86/dl-tunables.list +++ b/sysdeps/x86/dl-tunables.list @@ -49,6 +49,16 @@ glibc { # if the tunable value is set by user or not [BZ #27069]. 
minval: 1 } + x86_rep_movsb_stop_threshold { + # For AMD cpus that support ERMS (Zen3+), REP MOVSB is not faster + # than the vectorized path (and for some destination alignment it + # is really slow, check BZ #30994). On Intel cpus, the size limit + # to use ERMS is is [1/8, 1/2] of size of the chip's cache, check + # the dl-cacheinfo.h). + # This tunable allows the caller to setup the limit where to use + # REP MOVB on memcpy/memmove. + type: SIZE_T + } x86_rep_stosb_threshold { type: SIZE_T # Since there is overhead to set up REP STOSB operation, REP STOSB From patchwork Tue Oct 31 20:09:24 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Adhemerval Zanella Netto X-Patchwork-Id: 739599 Delivered-To: patch@linaro.org Received: by 2002:a5d:4c47:0:b0:32d:baff:b0ca with SMTP id n7csp1841829wrt; Tue, 31 Oct 2023 13:10:08 -0700 (PDT) X-Google-Smtp-Source: AGHT+IFL6uze7OiPeTNMF+8lZXbu4hU4QdCPcKoXiv8aSMMw5hnUfKsSeMHUzxSe2yWIxUWY24Fu X-Received: by 2002:a05:6830:7191:b0:6cd:9bc:b994 with SMTP id el17-20020a056830719100b006cd09bcb994mr17282172otb.1.1698783008423; Tue, 31 Oct 2023 13:10:08 -0700 (PDT) ARC-Seal: i=2; a=rsa-sha256; t=1698783008; cv=pass; d=google.com; s=arc-20160816; b=jKlB+0UbBrx5nv58sJgKRx8665Csw7l3NSGuMbAjKCGbUy09jYiB/mDQ5CelbhoLPS rsr/J+7ryLWnRFx8LMOLz26IuCTcsN0pBEf0rfGD30lBSkFcOjR2SfiJjZ6x7e/n2DUE A0dBUSc/QKP4zgqkAwEMjbQszlkmQZHexy6XDdmAQHPdkz8hQLzs07G2I+xYyf5l375G ERx0REaqDl6H85DVLY2KZJ/Qf5trPMGKt7tsIQZh98owYoOX/ymG5Ik7EFjUjIgncNZm hV7t5kv55m98PWEphLldAVfgV5XhsVEs7krcUBVMX9WAadEnW7rdL0BNo6e/s/YogkaV 0x6g== ARC-Message-Signature: i=2; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=errors-to:list-subscribe:list-help:list-post:list-archive :list-unsubscribe:list-id:precedence:content-transfer-encoding :mime-version:references:in-reply-to:message-id:date:subject:to:from :dkim-signature:arc-filter:dmarc-filter:delivered-to; 
From: Adhemerval Zanella
To: libc-alpha@sourceware.org, Noah Goldstein , "H . J .
Lu" , Bruce Merry
Subject: [PATCH 3/4] x86: Do not prefer ERMS for memset on Zen3+
Date: Tue, 31 Oct 2023 17:09:24 -0300
Message-Id: <20231031200925.3297456-4-adhemerval.zanella@linaro.org>
X-Mailer: git-send-email 2.34.1
In-Reply-To: <20231031200925.3297456-1-adhemerval.zanella@linaro.org>
References: <20231031200925.3297456-1-adhemerval.zanella@linaro.org>
MIME-Version: 1.0

The REP STOSB usage on memset does not show any performance gain on
Zen3/Zen4 cores compared to the vectorized loops.

Checked on x86_64-linux-gnu.
---
 sysdeps/x86/dl-cacheinfo.h | 16 +++++++++++-----
 1 file changed, 11 insertions(+), 5 deletions(-)

diff --git a/sysdeps/x86/dl-cacheinfo.h b/sysdeps/x86/dl-cacheinfo.h
index 51e5ba200f..99ba0f776a 100644
--- a/sysdeps/x86/dl-cacheinfo.h
+++ b/sysdeps/x86/dl-cacheinfo.h
@@ -1018,11 +1018,17 @@ dl_init_cacheinfo (struct cpu_features *cpu_features)
   if (tunable_size > minimum_rep_movsb_threshold)
     rep_movsb_threshold = tunable_size;
 
-  /* NB: The default value of the x86_rep_stosb_threshold tunable is the
-     same as the default value of __x86_rep_stosb_threshold and the
-     minimum value is fixed.  */
-  rep_stosb_threshold = TUNABLE_GET (x86_rep_stosb_threshold,
-                                     long int, NULL);
+  /* For AMD Zen3+ architecture, the performance of vectorized loop is
+     slightly better than ERMS.  */
+  if (cpu_features->basic.kind == arch_kind_amd)
+    rep_stosb_threshold = SIZE_MAX;
+
+  if (TUNABLE_IS_INITIALIZED (x86_rep_stosb_threshold))
+    /* NB: The default value of the x86_rep_stosb_threshold tunable is the
+       same as the default value of __x86_rep_stosb_threshold and the
+       minimum value is fixed.  */
+    rep_stosb_threshold = TUNABLE_GET (x86_rep_stosb_threshold,
+                                       long int, NULL);
 
   TUNABLE_SET_WITH_BOUNDS (x86_data_cache_size, data, 0, SIZE_MAX);
   TUNABLE_SET_WITH_BOUNDS (x86_shared_cache_size, shared, 0, SIZE_MAX);

From patchwork Tue Oct 31 20:09:25 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Adhemerval Zanella Netto
X-Patchwork-Id: 739598
From: Adhemerval Zanella
To: libc-alpha@sourceware.org, Noah Goldstein , "H . J .
Lu" , Bruce Merry
Subject: [PATCH 4/4] x86: Expand the comment on when REP STOSB is used on memset
Date: Tue, 31 Oct 2023 17:09:25 -0300
Message-Id: <20231031200925.3297456-5-adhemerval.zanella@linaro.org>
X-Mailer: git-send-email 2.34.1
In-Reply-To: <20231031200925.3297456-1-adhemerval.zanella@linaro.org>
References: <20231031200925.3297456-1-adhemerval.zanella@linaro.org>
MIME-Version: 1.0

---
 sysdeps/x86_64/multiarch/memset-vec-unaligned-erms.S | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/sysdeps/x86_64/multiarch/memset-vec-unaligned-erms.S b/sysdeps/x86_64/multiarch/memset-vec-unaligned-erms.S
index 3d9ad49cb9..0821b32997 100644
--- a/sysdeps/x86_64/multiarch/memset-vec-unaligned-erms.S
+++ b/sysdeps/x86_64/multiarch/memset-vec-unaligned-erms.S
@@ -21,7 +21,9 @@
    2. If size is less than VEC, use integer register stores.
    3. If size is from VEC_SIZE to 2 * VEC_SIZE, use 2 VEC stores.
    4. If size is from 2 * VEC_SIZE to 4 * VEC_SIZE, use 4 VEC stores.
-   5. If size is more to 4 * VEC_SIZE, align to 4 * VEC_SIZE with
+   5. On machines with the ERMS feature, if size is greater than or
+      equal to __x86_rep_stosb_threshold, then REP STOSB will be used.
+   6. If size is more than 4 * VEC_SIZE, align to 4 * VEC_SIZE with
       4 VEC stores and store 4 * VEC at a time until done.  */
 
 #include
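
Reading aid for patches 3/4 and 4/4 (not part of the series): the following is a minimal C sketch of how the threshold chosen in dl_init_cacheinfo could interact with the size-based strategy list in the memset comment. Every name below (cpu_kind, pick_rep_stosb_threshold, memset_path, memset_strategy) is hypothetical and only models the logic described in the diffs. The inclusive range boundaries and the point at which the threshold is checked follow the order of the comment literally and are assumptions; the real assembly may test the threshold at a different point.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for cpu_features->basic.kind.  */
enum cpu_kind { CPU_INTEL, CPU_AMD, CPU_OTHER };

/* Models the precedence in patch 3/4: start from the built-in default,
   disable REP STOSB on AMD by raising the threshold to SIZE_MAX, and
   let an explicitly set tunable override either value.  */
size_t
pick_rep_stosb_threshold (enum cpu_kind kind, size_t default_threshold,
                          bool tunable_set, size_t tunable_value)
{
  size_t threshold = default_threshold;

  if (kind == CPU_AMD)
    threshold = SIZE_MAX;

  if (tunable_set)
    threshold = tunable_value;

  return threshold;
}

/* Hypothetical labels for items 2 to 6 of the memset comment.  */
enum memset_path
{
  PATH_INT_STORES,   /* 2. size < VEC_SIZE */
  PATH_2_VEC,        /* 3. VEC_SIZE .. 2 * VEC_SIZE */
  PATH_4_VEC,        /* 4. 2 * VEC_SIZE .. 4 * VEC_SIZE */
  PATH_REP_STOSB,    /* 5. ERMS and size >= threshold */
  PATH_ALIGNED_LOOP  /* 6. everything else above 4 * VEC_SIZE */
};

/* Classifies a memset size following the list order literally.
   Upper bounds are treated as inclusive, which is an assumption.  */
enum memset_path
memset_strategy (size_t size, size_t vec_size, bool has_erms,
                 size_t rep_stosb_threshold)
{
  assert (vec_size > 0);

  if (size < vec_size)
    return PATH_INT_STORES;
  if (size <= 2 * vec_size)
    return PATH_2_VEC;
  if (size <= 4 * vec_size)
    return PATH_4_VEC;
  if (has_erms && size >= rep_stosb_threshold)
    return PATH_REP_STOSB;
  return PATH_ALIGNED_LOOP;
}
```

With a threshold of SIZE_MAX, as patch 3/4 sets by default on AMD, the REP STOSB branch is never taken for any practical size, so large memsets fall through to the aligned vector loop unless the x86_rep_stosb_threshold tunable is set explicitly.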