From patchwork Thu Mar 31 13:08:12 2016
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Bill Fischofer
X-Patchwork-Id: 64780
From: Bill Fischofer
To: lng-odp@lists.linaro.org
Date: Thu, 31 Mar 2016 08:08:12 -0500
Message-Id: <1459429694-6832-4-git-send-email-bill.fischofer@linaro.org>
In-Reply-To: <1459429694-6832-1-git-send-email-bill.fischofer@linaro.org>
References: <1459429694-6832-1-git-send-email-bill.fischofer@linaro.org>
Subject: [lng-odp] [API-NEXT PATCHv7 3/5] linux-generic: Make cpu detection work with NO_HZ_FULL
List-Id: "The OpenDataPlane \(ODP\) List"

From: "Gary S. Robertson"

sched_getaffinity() and pthread_getaffinity_np() do not return
an accurate mask of all CPUs in the machine when the kernel
is compiled with NO_HZ_FULL support.
See Linaro BUG 2027 for details.
https://bugs.linaro.org/show_bug.cgi?id=2027

This code replaces the 'getaffinity' based CPU discovery logic
-and- removes any exclusivity between default control and worker
cpumasks, based on an assumption that external cpumask specifications
will segregate CPUs if needed.

The results of these changes which address BUG 2027 are:
(1) all CPUs known to the kernel at boot time are considered
    for use by ODP regardless of the default CPU affinity masks
    set by the kernel scheduler,
(2) the default control and worker cpumasks are given reasonable
    values based on the set of installed CPUs

Also - this code:
(a) adds control and worker cpumasks to the linux-generic global data
(b) adds logic to odp_cpumask.c to initialize these masks
(c) calls the new cpumask initialization logic from odp_init_global()
(d) reduces odp_cpumask_default_control() and
    odp_cpumask_default_worker() to use the content of these new
    cpumasks without modification.

These changes provide prerequisite infrastructure for pending changes
which will allow ODP to accept cpumasks passed in
from external entities such as a provisioning service.

Signed-off-by: Gary S. Robertson
Signed-off-by: Bill Fischofer
---
 platform/linux-generic/include/odp_internal.h |  34 ++++---
 platform/linux-generic/odp_cpumask.c          | 126 ++++++++++++++++++++++++++
 platform/linux-generic/odp_cpumask_task.c     |  70 +++++++++-----
 platform/linux-generic/odp_init.c             |  17 +++-
 platform/linux-generic/odp_system_info.c      |  13 +--
 5 files changed, 209 insertions(+), 51 deletions(-)

diff --git a/platform/linux-generic/include/odp_internal.h b/platform/linux-generic/include/odp_internal.h
index 679d9d2..de9dc6a 100644
--- a/platform/linux-generic/include/odp_internal.h
+++ b/platform/linux-generic/include/odp_internal.h
@@ -18,6 +18,7 @@ extern "C" {
 #endif
 
 #include
+#include
 #include
 #include
 
@@ -40,23 +41,27 @@ struct odp_global_data_s {
 	odp_log_func_t log_fn;
 	odp_abort_func_t abort_fn;
 	odp_system_info_t system_info;
+	odp_cpumask_t control_cpus;
+	odp_cpumask_t worker_cpus;
+	int num_cpus_installed;
 };
 
 enum init_stage {
 	NO_INIT = 0,    /* No init stages completed */
-	TIME_INIT = 1,
-	SYSINFO_INIT = 2,
-	SHM_INIT = 3,
-	THREAD_INIT = 4,
-	POOL_INIT = 5,
-	QUEUE_INIT = 6,
-	SCHED_INIT = 7,
-	PKTIO_INIT = 8,
-	TIMER_INIT = 9,
-	CRYPTO_INIT = 10,
-	CLASSIFICATION_INIT = 11,
-	TRAFFIC_MNGR_INIT = 12,
-	NAME_TABLE_INIT = 13,
+	CPUMASK_INIT,
+	TIME_INIT,
+	SYSINFO_INIT,
+	SHM_INIT,
+	THREAD_INIT,
+	POOL_INIT,
+	QUEUE_INIT,
+	SCHED_INIT,
+	PKTIO_INIT,
+	TIMER_INIT,
+	CRYPTO_INIT,
+	CLASSIFICATION_INIT,
+	TRAFFIC_MNGR_INIT,
+	NAME_TABLE_INIT,
 	ALL_INIT      /* All init stages completed */
 };
 
@@ -65,6 +70,9 @@ extern struct odp_global_data_s odp_global_data;
 int _odp_term_global(enum init_stage stage);
 int _odp_term_local(enum init_stage stage);
 
+int odp_cpumask_init_global(void);
+int odp_cpumask_term_global(void);
+
 int odp_system_info_init(void);
 int odp_system_info_term(void);
 
diff --git a/platform/linux-generic/odp_cpumask.c b/platform/linux-generic/odp_cpumask.c
index 4249f1d..320ca8e 100644
--- a/platform/linux-generic/odp_cpumask.c
+++ b/platform/linux-generic/odp_cpumask.c
@@ -15,6 +15,10 @@
 #include
 #include
 
+#include
+#include
+#include
+
 /** @internal Compile time assert */
 _ODP_STATIC_ASSERT(CPU_SETSIZE >= ODP_CPUMASK_SIZE,
 		   "ODP_CPUMASK_SIZE__SIZE_ERROR");
@@ -208,3 +212,125 @@ int odp_cpumask_next(const odp_cpumask_t *mask, int cpu)
 			return cpu;
 	return -1;
 }
+
+/*
+ * This function obtains system information specifying which cpus are
+ * available at boot time. These data are then used to produce cpumasks of
+ * configured CPUs without concern over isolation support.
+ */
+static int get_installed_cpus(void)
+{
+	char *numptr;
+	char *endptr;
+	long int cpu_idnum;
+	DIR *d;
+	struct dirent *dir;
+
+	/* Clear the global cpumasks for control and worker CPUs */
+	odp_cpumask_zero(&odp_global_data.control_cpus);
+	odp_cpumask_zero(&odp_global_data.worker_cpus);
+
+	/*
+	 * Scan the /sysfs pseudo-filesystem for CPU info directories.
+	 * There should be one subdirectory for each installed logical CPU
+	 */
+	d = opendir("/sys/devices/system/cpu");
+	if (d) {
+		while ((dir = readdir(d)) != NULL) {
+			cpu_idnum = CPU_SETSIZE;
+
+			/*
+			 * If the current directory entry doesn't represent
+			 * a CPU info subdirectory then skip to the next entry.
+			 */
+			if (dir->d_type == DT_DIR) {
+				if (!strncmp(dir->d_name, "cpu", 3)) {
+					/*
+					 * Directory name starts with "cpu"...
+					 * Try to extract a CPU ID number
+					 * from the remainder of the dirname.
+					 */
+					errno = 0;
+					numptr = dir->d_name;
+					numptr += 3;
+					cpu_idnum = strtol(numptr, &endptr,
+							   10);
+					if (errno || (endptr == numptr))
+						continue;
+				} else {
+					continue;
+				}
+			} else {
+				continue;
+			}
+			/*
+			 * If we get here the current directory entry specifies
+			 * a CPU info subdir for the CPU indexed by cpu_idnum.
+			 */
+
+			/* Track number of logical CPUs discovered */
+			if (odp_global_data.num_cpus_installed <
+			    (int)(cpu_idnum + 1))
+				odp_global_data.num_cpus_installed =
+						(int)(cpu_idnum + 1);
+
+			/* Add the CPU to our default cpumasks */
+			odp_cpumask_set(&odp_global_data.control_cpus,
+					(int)cpu_idnum);
+			odp_cpumask_set(&odp_global_data.worker_cpus,
+					(int)cpu_idnum);
+		}
+		closedir(d);
+		return 0;
+	} else {
+		return -1;
+	}
+}
+
+/*
+ * This function creates reasonable default cpumasks for control and worker
+ * tasks from the set of CPUs available at boot time.
+ */
+int odp_cpumask_init_global(void)
+{
+	odp_cpumask_t *control_mask = &odp_global_data.control_cpus;
+	odp_cpumask_t *worker_mask = &odp_global_data.worker_cpus;
+	int i;
+	int retval = -1;
+
+	if (!get_installed_cpus()) {
+		/* CPU 0 is only used for workers on uniprocessor systems */
+		if (odp_global_data.num_cpus_installed > 1)
+			odp_cpumask_clr(worker_mask, 0);
+		/*
+		 * If only one or two CPUs installed, use CPU 0 for control.
+		 * Otherwise leave it for the kernel and start with CPU 1.
+		 */
+		if (odp_global_data.num_cpus_installed < 3) {
+			/*
+			 * If only two CPUS, use CPU 0 for control and
+			 * use CPU 1 for workers.
+			 */
+			odp_cpumask_clr(control_mask, 1);
+		} else {
+			/*
+			 * If three or more CPUs, reserve CPU 0 for kernel,
+			 * reserve CPU 1 for control, and
+			 * reserve remaining CPUs for workers
+			 */
+			odp_cpumask_clr(control_mask, 0);
+			odp_cpumask_clr(worker_mask, 1);
+			for (i = 2; i < CPU_SETSIZE; i++) {
+				if (odp_cpumask_isset(worker_mask, i))
+					odp_cpumask_clr(control_mask, i);
+			}
+		}
+		retval = 0;
+	}
+	return retval;
+}
+
+int odp_cpumask_term_global(void)
+{
+	return 0;
+}
diff --git a/platform/linux-generic/odp_cpumask_task.c b/platform/linux-generic/odp_cpumask_task.c
index dbedff2..10885ce 100644
--- a/platform/linux-generic/odp_cpumask_task.c
+++ b/platform/linux-generic/odp_cpumask_task.c
@@ -14,53 +14,75 @@
 
 int odp_cpumask_default_worker(odp_cpumask_t *mask, int num)
 {
-	int ret, cpu, i;
-	cpu_set_t cpuset;
-
-	ret = pthread_getaffinity_np(pthread_self(),
-				     sizeof(cpu_set_t), &cpuset);
-	if (ret != 0)
-		ODP_ABORT("failed to read CPU affinity value\n");
-
-	odp_cpumask_zero(mask);
+	odp_cpumask_t overlap;
+	int cpu, i;
 
 	/*
 	 * If no user supplied number or it's too large, then attempt
 	 * to use all CPUs
 	 */
-	if (0 == num || CPU_SETSIZE < num)
-		num = CPU_COUNT(&cpuset);
+	cpu = odp_cpumask_count(&odp_global_data.worker_cpus);
+	if (0 == num || cpu < num)
+		num = cpu;
 
 	/* build the mask, allocating down from highest numbered CPU */
+	odp_cpumask_zero(mask);
 	for (cpu = 0, i = CPU_SETSIZE - 1; i >= 0 && cpu < num; --i) {
-		if (CPU_ISSET(i, &cpuset)) {
+		if (odp_cpumask_isset(&odp_global_data.worker_cpus, i)) {
 			odp_cpumask_set(mask, i);
 			cpu++;
 		}
 	}
 
-	if (odp_cpumask_isset(mask, 0))
-		ODP_DBG("\n\tCPU0 will be used for both control and worker threads,\n"
-			"\tthis will likely have a performance impact on the worker thread.\n");
+	odp_cpumask_and(&overlap, mask, &odp_global_data.control_cpus);
+	if (odp_cpumask_count(&overlap))
+		ODP_DBG("\n\tWorker CPUs overlap with control CPUs...\n"
+			"\tthis will likely have a performance impact on the worker threads.\n");
 
 	return cpu;
 }
 
-int odp_cpumask_default_control(odp_cpumask_t *mask, int num ODP_UNUSED)
+int odp_cpumask_default_control(odp_cpumask_t *mask, int num)
 {
+	odp_cpumask_t overlap;
+	int cpu, i;
+
+	/*
+	 * If no user supplied number then default to one control CPU.
+	 */
+	if (0 == num) {
+		num = 1;
+	} else {
+		/*
+		 * If user supplied number is too large, then attempt
+		 * to use all installed control CPUs
+		 */
+		cpu = odp_cpumask_count(&odp_global_data.control_cpus);
+		if (cpu < num)
+			num = cpu;
+	}
+
+	/* build the mask, allocating upwards from lowest numbered CPU */
 	odp_cpumask_zero(mask);
-	/* By default all control threads on CPU 0 */
-	odp_cpumask_set(mask, 0);
-	return 1;
+	for (cpu = 0, i = 0; i < CPU_SETSIZE && cpu < num; i++) {
+		if (odp_cpumask_isset(&odp_global_data.control_cpus, i)) {
+			odp_cpumask_set(mask, i);
+			cpu++;
+		}
+	}
+
+	odp_cpumask_and(&overlap, mask, &odp_global_data.worker_cpus);
+	if (odp_cpumask_count(&overlap))
+		ODP_DBG("\n\tControl CPUs overlap with worker CPUs...\n"
+			"\tthis will likely have a performance impact on the worker threads.\n");
+
+	return cpu;
 }
 
 int odp_cpumask_all_available(odp_cpumask_t *mask)
 {
-	odp_cpumask_t mask_work, mask_ctrl;
-
-	odp_cpumask_default_worker(&mask_work, 0);
-	odp_cpumask_default_control(&mask_ctrl, 0);
-	odp_cpumask_or(mask, &mask_work, &mask_ctrl);
+	odp_cpumask_or(mask, &odp_global_data.worker_cpus,
+		       &odp_global_data.control_cpus);
 
 	return odp_cpumask_count(mask);
 }
diff --git a/platform/linux-generic/odp_init.c b/platform/linux-generic/odp_init.c
index f30c310..9a923e4 100644
--- a/platform/linux-generic/odp_init.c
+++ b/platform/linux-generic/odp_init.c
@@ -3,11 +3,9 @@
  *
  * SPDX-License-Identifier: BSD-3-Clause
  */
-
 #include
-#include
-#include
 #include
+#include
 
 struct odp_global_data_s odp_global_data;
 
@@ -26,6 +24,12 @@ int odp_init_global(odp_instance_t *instance,
 		odp_global_data.abort_fn = params->abort_fn;
 	}
 
+	if (odp_cpumask_init_global()) {
+		ODP_ERR("ODP cpumask init failed.\n");
+		goto init_failed;
+	}
+	stage = CPUMASK_INIT;
+
 	if (odp_time_init_global()) {
 		ODP_ERR("ODP time init failed.\n");
 		goto init_failed;
@@ -219,6 +223,13 @@ int _odp_term_global(enum init_stage stage)
 		}
 		/* Fall through */
 
+	case CPUMASK_INIT:
+		if (odp_cpumask_term_global()) {
+			ODP_ERR("ODP cpumask term failed.\n");
+			rc = -1;
+		}
+		/* Fall through */
+
 	case NO_INIT:
 		;
 	}
diff --git a/platform/linux-generic/odp_system_info.c b/platform/linux-generic/odp_system_info.c
index 395b274..1ecf18a 100644
--- a/platform/linux-generic/odp_system_info.c
+++ b/platform/linux-generic/odp_system_info.c
@@ -30,21 +30,12 @@
 
 #define HUGE_PAGE_DIR "/sys/kernel/mm/hugepages"
 
-
 /*
- * Report the number of CPUs in the affinity mask of the main thread
+ * Report the number of logical CPUs detected at boot time
  */
 static int sysconf_cpu_count(void)
 {
-	cpu_set_t cpuset;
-	int ret;
-
-	ret = pthread_getaffinity_np(pthread_self(),
-				     sizeof(cpuset), &cpuset);
-	if (ret != 0)
-		return 0;
-
-	return CPU_COUNT(&cpuset);
+	return odp_global_data.num_cpus_installed;
 }
 
 #if defined __x86_64__ || defined __i386__ || defined __OCTEON__ || \