From patchwork Thu Sep 6 12:00:10 2018
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Github ODP bot
X-Patchwork-Id: 146086
Delivered-To: patch@linaro.org
From: Github ODP bot
To: lng-odp@lists.linaro.org
Date: Thu, 6 Sep 2018 12:00:10 +0000
Message-Id: <1536235211-1215-2-git-send-email-odpbot@yandex.ru>
X-Mailer: git-send-email 2.7.4
In-Reply-To: <1536235211-1215-1-git-send-email-odpbot@yandex.ru>
References: <1536235211-1215-1-git-send-email-odpbot@yandex.ru>
Github-pr-num: 685
Subject: [lng-odp] [PATCH v3 1/2] linux-gen: ishm: implement huge page cache
List-Id: "The OpenDataPlane \(ODP\) List"

From: Josep Puigdemont

With this patch, ODP will pre-allocate several huge pages at init time.
When memory is to be mapped into a huge page, a pre-allocated one is used
if available, so ODP does not have to trap into the kernel to allocate
huge pages.

The idea with this implementation is to trick ishm into thinking that a
file descriptor to map the memory into was provided, so that it won't try
to allocate one itself. This file descriptor is one of those allocated at
init time. When the system is done with this file descriptor, instead of
closing it, it is put back into the list of available huge pages, ready
to be reused.

A side effect of this patch is that memory is not zeroed out when it is
reused.

WARNING: This patch will not work when using process mode threads. For
several reasons, this may not work when using _ODP_ISHM_SINGLE_VA either,
so when this flag is set, the list of pre-allocated files is not used.

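To illustrate the descriptor recycling described above, here is a minimal
standalone sketch of a LIFO cache of file descriptors. It is not the code
in this patch: struct fd_cache, fd_cache_init(), fd_cache_get(),
fd_cache_put() and CACHE_SLOTS are hypothetical names, and plain integers
stand in for the descriptors of pre-allocated huge page files.

```c
#include <assert.h>
#include <stdint.h>

#define CACHE_SLOTS 4 /* hypothetical; the patch uses HP_CACHE_SIZE = 64 */

/* Hypothetical stand-in for the patch's struct huge_page_cache. */
struct fd_cache {
	uint64_t len;        /* size of every cached page */
	int total;           /* number of descriptors created at init time */
	int idx;             /* index of the top free descriptor, -1 if empty */
	int fd[CACHE_SLOTS]; /* stack of free descriptors */
};

/* Fill the cache with n already-open descriptors of size len. */
static void fd_cache_init(struct fd_cache *c, uint64_t len,
			  const int *fds, int n)
{
	assert(n >= 0 && n <= CACHE_SLOTS);
	c->len = len;
	c->total = n;
	for (int i = 0; i < n; i++)
		c->fd[i] = fds[i];
	c->idx = n - 1;
}

/* Pop a descriptor; -1 if the cache is empty or the size does not match. */
static int fd_cache_get(struct fd_cache *c, uint64_t len)
{
	int fd;

	if (c->idx < 0 || len != c->len)
		return -1;

	fd = c->fd[c->idx];
	c->fd[c->idx--] = -1;

	return fd;
}

/* Push a descriptor back instead of closing it; -1 if the cache is full. */
static int fd_cache_put(struct fd_cache *c, int fd)
{
	if (c->idx + 1 >= c->total)
		return -1;

	c->fd[++c->idx] = fd;

	return 0;
}
```

A caller would take a descriptor with fd_cache_get() before mapping and
hand it back with fd_cache_put() where it would otherwise call close().
The sketch itself does no locking, so callers are assumed to serialize
access, and nothing clears the memory behind a recycled descriptor.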
By default ODP will not reserve any huge pages. To tell ODP to do so,
update the ODP configuration file with something like this:

ishm: {
	num_reserved_hp = 32
}

Example usage:

$ cat odp.conf
odp_implementation = "linux-generic"
config_file_version = "0.0.1"
ishm: {
	num_reserved_hp = 32
}

$ ODP_CONFIG_FILE=odp.conf ./test/validation/api/shmem/shmem_main

This patch solves bug #3774:
https://bugs.linaro.org/show_bug.cgi?id=3774

Signed-off-by: Josep Puigdemont
---
/** Email created from pull request 685 (joseppc:fix/cache_huge_pages)
 ** https://github.com/Linaro/odp/pull/685
 ** Patch: https://github.com/Linaro/odp/pull/685.patch
 ** Base sha: 6d48d7f7f684b8aa87f7eb4f922d45be345ed771
 ** Merge commit sha: 72e8f7c6b712af7da295dbcf41b222e016ee5cc4
 **/
 config/odp-linux-generic.conf     |  10 ++
 platform/linux-generic/odp_ishm.c | 218 ++++++++++++++++++++++++++++--
 2 files changed, 214 insertions(+), 14 deletions(-)

diff --git a/config/odp-linux-generic.conf b/config/odp-linux-generic.conf
index 85d5414ba..d1be5040e 100644
--- a/config/odp-linux-generic.conf
+++ b/config/odp-linux-generic.conf
@@ -18,6 +18,16 @@
 odp_implementation = "linux-generic"
 config_file_version = "0.0.1"
 
+# Internal shared memory allocator
+ishm: {
+	# ODP will try to reserve as many huge pages as the number indicated
+	# here, up to 64. Zero or a negative value means that no pages should
+	# be reserved.
+	# These pages will only be freed when the application calls
+	# odp_term_global().
+	num_reserved_hp = 0
+}
+
 # DPDK pktio options
 pktio_dpdk: {
 	# Default options
diff --git a/platform/linux-generic/odp_ishm.c b/platform/linux-generic/odp_ishm.c
index 59d1fe534..b1009355d 100644
--- a/platform/linux-generic/odp_ishm.c
+++ b/platform/linux-generic/odp_ishm.c
@@ -63,6 +63,7 @@
 #include
 #include
 #include
+#include
 #include
 #include
 #include
@@ -164,7 +165,7 @@ typedef struct ishm_fragment {
  * will allocate both a block and a fragment.
  * Blocks contain only global data common to all processes.
  */
-typedef enum {UNKNOWN, HUGE, NORMAL, EXTERNAL} huge_flag_t;
+typedef enum {UNKNOWN, HUGE, NORMAL, EXTERNAL, CACHED} huge_flag_t;
 typedef struct ishm_block {
 	char name[ISHM_NAME_MAXLEN];    /* name for the ishm block (if any) */
 	char filename[ISHM_FILENAME_MAXLEN]; /* name of the .../odp-* file  */
@@ -238,6 +239,16 @@ typedef struct {
 } ishm_ftable_t;
 static ishm_ftable_t *ishm_ftbl;
 
+#define HP_CACHE_SIZE 64
+struct huge_page_cache {
+	uint64_t len;
+	int total; /* amount of actually pre-allocated huge pages */
+	int idx; /* retrieve fd[idx] to get a free file descriptor */
+	int fd[HP_CACHE_SIZE]; /* list of file descriptors */
+};
+
+static struct huge_page_cache hpc;
+
 #ifndef MAP_ANONYMOUS
 #define MAP_ANONYMOUS MAP_ANON
 #endif
@@ -245,6 +256,142 @@ static ishm_ftable_t *ishm_ftbl;
 /* prototypes: */
 static void procsync(void);
 
+static int hp_create_file(uint64_t len, const char *filename)
+{
+	int fd;
+	void *addr;
+
+	if (len <= 0) {
+		ODP_ERR("Length is wrong\n");
+		return -1;
+	}
+
+	fd = open(filename, O_RDWR | O_CREAT | O_TRUNC,
+		  S_IRUSR | S_IWUSR | S_IRGRP | S_IROTH);
+	if (fd < 0) {
+		ODP_ERR("Could not create cache file %s\n", filename);
+		return -1;
+	}
+
+	/* remove file from file system */
+	unlink(filename);
+
+	if (ftruncate(fd, len) == -1) {
+		ODP_ERR("Could not truncate file: %s\n", strerror(errno));
+		close(fd);
+		return -1;
+	}
+
+	/* commit huge page */
+	addr = _odp_ishmphy_map(fd, NULL, len, 0);
+	if (addr == NULL) {
+		/* no more pages available */
+		close(fd);
+		return -1;
+	}
+	_odp_ishmphy_unmap(addr, len, 0);
+
+	ODP_DBG("Created HP cache file %s, fd: %d\n", filename, fd);
+
+	return fd;
+}
+
+static void hp_init(void)
+{
+	char filename[ISHM_FILENAME_MAXLEN];
+	char dir[ISHM_FILENAME_MAXLEN];
+	int count;
+
+	hpc.total = 0;
+	hpc.idx = -1;
+	hpc.len = odp_sys_huge_page_size();
+
+	if (!_odp_libconfig_lookup_ext_int("ishm", NULL, "num_reserved_hp",
+					   &count)) {
+		return;
+	}
+
+	if (count > HP_CACHE_SIZE)
+		count = HP_CACHE_SIZE;
+	else if (count <= 0)
+		return;
+
+	ODP_DBG("Init HP cache with up to %d pages\n", count);
+
+	if (!odp_global_data.hugepage_info.default_huge_page_dir) {
+		ODP_ERR("No huge page dir\n");
+		return;
+	}
+
+	snprintf(dir, ISHM_FILENAME_MAXLEN, "%s/%s",
+		 odp_global_data.hugepage_info.default_huge_page_dir,
+		 odp_global_data.uid);
+
+	if (mkdir(dir, 0744) != 0) {
+		if (errno != EEXIST) {
+			ODP_ERR("Failed to create dir: %s\n", strerror(errno));
+			return;
+		}
+	}
+
+	snprintf(filename, ISHM_FILENAME_MAXLEN,
+		 "%s/odp-%d-ishm_cached",
+		 dir,
+		 odp_global_data.main_pid);
+
+	for (int i = 0; i < count; ++i) {
+		int fd;
+
+		fd = hp_create_file(hpc.len, filename);
+		if (fd == -1)
+			break;
+		hpc.total++;
+		hpc.fd[i] = fd;
+	}
+	hpc.idx = hpc.total - 1;
+
+	ODP_DBG("HP cache has %d huge pages of size 0x%08" PRIx64 "\n",
+		hpc.total, hpc.len);
+}
+
+static void hp_term(void)
+{
+	for (int i = 0; i < hpc.total; i++) {
+		if (hpc.fd[i] != -1)
+			close(hpc.fd[i]);
+	}
+
+	hpc.total = 0;
+	hpc.idx = -1;
+	hpc.len = 0;
+}
+
+static int hp_get_cached(uint64_t len)
+{
+	int fd;
+
+	if (hpc.idx < 0 || len != hpc.len)
+		return -1;
+
+	fd = hpc.fd[hpc.idx];
+	hpc.fd[hpc.idx--] = -1;
+
+	return fd;
+}
+
+static int hp_put_cached(int fd)
+{
+	if (odp_unlikely(++hpc.idx >= hpc.total)) {
+		hpc.idx--;
+		ODP_ERR("Trying to put more FD than allowed: %d\n", fd);
+		return -1;
+	}
+
+	hpc.fd[hpc.idx] = fd;
+
+	return 0;
+}
+
 /*
  * Take a piece of the preallocated virtual space to fit "size" bytes.
  * (best fit). Size must be rounded up to an integer number of pages size.
@@ -798,8 +945,14 @@ static int block_free_internal(int block_index, int close_fd, int deregister)
 		block_index);
 
 	/* close the related fd */
-	if (close_fd)
-		close(ishm_proctable->entry[proc_index].fd);
+	if (close_fd) {
+		int fd = ishm_proctable->entry[proc_index].fd;
+
+		if (block->huge == CACHED)
+			hp_put_cached(fd);
+		else
+			close(fd);
+	}
 
 	/* remove entry from process local table: */
 	last = ishm_proctable->nb_entries - 1;
@@ -910,6 +1063,7 @@ int _odp_ishm_reserve(const char *name, uint64_t size, int fd,
 		new_block->huge = EXTERNAL;
 	} else {
 		new_block->external_fd = 0;
+		new_block->huge = UNKNOWN;
 	}
 
 	/* Otherwise, Try first huge pages when possible and needed: */
@@ -927,17 +1081,38 @@ int _odp_ishm_reserve(const char *name, uint64_t size, int fd,
 
 		/* roundup to page size */
 		len = (size + (page_hp_size - 1)) & (-page_hp_size);
-		addr = do_map(new_index, len, hp_align, flags, HUGE, &fd);
-
-		if (addr == NULL) {
-			if (!huge_error_printed) {
-				ODP_ERR("No huge pages, fall back to normal "
-					"pages. "
-					"check: /proc/sys/vm/nr_hugepages.\n");
-				huge_error_printed = 1;
+		if (!(flags & _ODP_ISHM_SINGLE_VA)) {
+			/* try pre-allocated pages */
+			fd = hp_get_cached(len);
+			if (fd != -1) {
+				/* do as if user provided a fd */
+				new_block->external_fd = 1;
+				addr = do_map(new_index, len, hp_align, flags,
+					      CACHED, &fd);
+				if (addr == NULL) {
+					ODP_ERR("Could not use cached hp %d\n",
+						fd);
+					hp_put_cached(fd);
+					fd = -1;
+				} else {
+					new_block->huge = CACHED;
+				}
+			}
+		}
+		if (fd == -1) {
+			addr = do_map(new_index, len, hp_align, flags, HUGE,
+				      &fd);
+
+			if (addr == NULL) {
+				if (!huge_error_printed) {
+					ODP_ERR("No huge pages, fall back to "
+						"normal pages. Check: "
+						"/proc/sys/vm/nr_hugepages.\n");
+					huge_error_printed = 1;
+				}
+			} else {
+				new_block->huge = HUGE;
 			}
-		} else {
-			new_block->huge = HUGE;
 		}
 	}
 
@@ -961,8 +1136,12 @@ int _odp_ishm_reserve(const char *name, uint64_t size, int fd,
 
 	/* if neither huge pages or normal pages works, we cannot proceed: */
 	if ((fd < 0) || (addr == NULL) || (len == 0)) {
-		if ((!new_block->external_fd) && (fd >= 0))
+		if (new_block->external_fd) {
+			if (new_block->huge == CACHED)
+				hp_put_cached(fd);
+		} else if (fd >= 0) {
 			close(fd);
+		}
 		delete_file(new_block);
 		odp_spinlock_unlock(&ishm_tbl->lock);
 		ODP_ERR("_ishm_reserve failed.\n");
@@ -1564,6 +1743,9 @@ int _odp_ishm_init_global(const odp_init_t *init)
 	/* get ready to create pools: */
 	_odp_ishm_pool_init();
 
+	/* init cache files */
+	hp_init();
+
 	return 0;
 
 init_glob_err4:
@@ -1705,6 +1887,8 @@ int _odp_ishm_term_global(void)
 	if (!odp_global_data.shm_dir_from_env)
 		free(odp_global_data.shm_dir);
 
+	hp_term();
+
 	return ret;
 }
 
@@ -1778,6 +1962,9 @@ int _odp_ishm_status(const char *title)
 		case EXTERNAL:
 			huge = 'E';
 			break;
+		case CACHED:
+			huge = 'C';
+			break;
 		default:
 			huge = '?';
 		}
@@ -1911,6 +2098,9 @@ void _odp_ishm_print(int block_index)
 	case EXTERNAL:
 		str = "external";
 		break;
+	case CACHED:
+		str = "cached";
+		break;
 	default:
 		str = "??";
 	}
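
For readers unfamiliar with the pattern used by hp_create_file() and with
the length round-up in _odp_ishm_reserve(), the following is a minimal
sketch under stated assumptions. It works on an ordinary temporary file
under /tmp rather than a hugetlbfs file, calls mmap() directly instead of
_odp_ishmphy_map(), and uses the hypothetical names round_up() and
anon_file_create(). It is not the implementation in this patch.

```c
#define _POSIX_C_SOURCE 200809L /* for mkstemp() and ftruncate() */

#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

/*
 * Round size up to a multiple of align, which must be a power of two.
 * For unsigned values ~(align - 1) equals -align, the mask used in the
 * patch.
 */
static uint64_t round_up(uint64_t size, uint64_t align)
{
	assert(align != 0 && (align & (align - 1)) == 0);
	return (size + (align - 1)) & ~(align - 1);
}

/*
 * Hypothetical helper: create an unlinked file big enough for "size"
 * bytes (rounded up to the system page size) and touch it once through
 * a temporary mapping. Returns the descriptor, or -1 on failure.
 */
static int anon_file_create(uint64_t size)
{
	char path[] = "/tmp/hp-sketch-XXXXXX"; /* assumption: /tmp is writable */
	uint64_t len;
	void *addr;
	int fd;

	if (size == 0)
		return -1;

	len = round_up(size, (uint64_t)sysconf(_SC_PAGESIZE));

	fd = mkstemp(path);
	if (fd < 0)
		return -1;

	/* Remove the name; the open descriptor keeps the file alive. */
	unlink(path);

	if (ftruncate(fd, (off_t)len) == -1) {
		close(fd);
		return -1;
	}

	/* Map, touch and unmap so the pages are committed up front. */
	addr = mmap(NULL, (size_t)len, PROT_READ | PROT_WRITE, MAP_SHARED,
		    fd, 0);
	if (addr == MAP_FAILED) {
		close(fd);
		return -1;
	}
	memset(addr, 0, (size_t)len);
	munmap(addr, (size_t)len);

	return fd;
}
```

The returned descriptor can later be mapped again with MAP_SHARED; the
contents written through an earlier mapping remain visible, which is the
same reason reused cache pages are not zeroed.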