From patchwork Fri Aug 21 15:01:05 2020
X-Patchwork-Submitter: Roman Gushchin
X-Patchwork-Id: 262114
From: Roman Gushchin
CC: Alexei Starovoitov, Daniel Borkmann, Johannes Weiner, Shakeel Butt, Roman Gushchin
Subject: [PATCH bpf-next v4 01/30] mm: support nesting memalloc_use_memcg()
Date: Fri, 21 Aug 2020 08:01:05 -0700
Message-ID: <20200821150134.2581465-2-guro@fb.com>
In-Reply-To: <20200821150134.2581465-1-guro@fb.com>
References: <20200821150134.2581465-1-guro@fb.com>
X-Mailing-List: netdev@vger.kernel.org

From: Johannes Weiner

Support nesting of memalloc_use_memcg() so that it can be used from an
interrupt context.
Make memalloc_use_memcg() return the old memcg and convert existing
users to a stacking model. Delete the unused memalloc_unuse_memcg().

Roman: I've rephrased the original commit log, because it was
focused on the accounting problem related to loop devices. I made
it less specific, so it can work for bpf too. Also rebased to the
current state of the mm tree.

The original patch can be found here:
https://lkml.org/lkml/2020/5/28/806

Signed-off-by: Johannes Weiner
Signed-off-by: Roman Gushchin
---
 fs/buffer.c                          |  6 +++---
 fs/notify/fanotify/fanotify.c        |  5 +++--
 fs/notify/inotify/inotify_fsnotify.c |  5 +++--
 include/linux/sched/mm.h             | 28 +++++++++-------------------
 mm/memcontrol.c                      |  6 +++---
 5 files changed, 21 insertions(+), 29 deletions(-)

diff --git a/fs/buffer.c b/fs/buffer.c
index 061dd202979d..97ef480db0da 100644
--- a/fs/buffer.c
+++ b/fs/buffer.c
@@ -842,13 +842,13 @@ struct buffer_head *alloc_page_buffers(struct page *page, unsigned long size,
 	struct buffer_head *bh, *head;
 	gfp_t gfp = GFP_NOFS | __GFP_ACCOUNT;
 	long offset;
-	struct mem_cgroup *memcg;
+	struct mem_cgroup *memcg, *old_memcg;
 
 	if (retry)
 		gfp |= __GFP_NOFAIL;
 
 	memcg = get_mem_cgroup_from_page(page);
-	memalloc_use_memcg(memcg);
+	old_memcg = memalloc_use_memcg(memcg);
 
 	head = NULL;
 	offset = PAGE_SIZE;
@@ -867,7 +867,7 @@ struct buffer_head *alloc_page_buffers(struct page *page, unsigned long size,
 		set_bh_page(bh, page, offset);
 	}
 out:
-	memalloc_unuse_memcg();
+	memalloc_use_memcg(old_memcg);
 	mem_cgroup_put(memcg);
 	return head;
 /*
diff --git a/fs/notify/fanotify/fanotify.c b/fs/notify/fanotify/fanotify.c
index c942910a8649..0e59fa57f6d7 100644
--- a/fs/notify/fanotify/fanotify.c
+++ b/fs/notify/fanotify/fanotify.c
@@ -531,6 +531,7 @@ static struct fanotify_event *fanotify_alloc_event(struct fsnotify_group *group,
 	struct inode *dirid = fanotify_dfid_inode(mask, data, data_type, dir);
 	const struct path *path = fsnotify_data_path(data, data_type);
 	unsigned int fid_mode = FAN_GROUP_FLAG(group, FANOTIFY_FID_BITS);
+	struct mem_cgroup *old_memcg;
 	struct inode *child = NULL;
 	bool name_event = false;
 
@@ -580,7 +581,7 @@ static struct fanotify_event *fanotify_alloc_event(struct fsnotify_group *group,
 		gfp |= __GFP_RETRY_MAYFAIL;
 
 	/* Whoever is interested in the event, pays for the allocation. */
-	memalloc_use_memcg(group->memcg);
+	old_memcg = memalloc_use_memcg(group->memcg);
 
 	if (fanotify_is_perm_event(mask)) {
 		event = fanotify_alloc_perm_event(path, gfp);
@@ -608,7 +609,7 @@ static struct fanotify_event *fanotify_alloc_event(struct fsnotify_group *group,
 		event->pid = get_pid(task_tgid(current));
 
 out:
-	memalloc_unuse_memcg();
+	memalloc_use_memcg(old_memcg);
 	return event;
 }
 
diff --git a/fs/notify/inotify/inotify_fsnotify.c b/fs/notify/inotify/inotify_fsnotify.c
index a65cf8c9f600..8017a51561c4 100644
--- a/fs/notify/inotify/inotify_fsnotify.c
+++ b/fs/notify/inotify/inotify_fsnotify.c
@@ -66,6 +66,7 @@ static int inotify_one_event(struct fsnotify_group *group, u32 mask,
 	int ret;
 	int len = 0;
 	int alloc_len = sizeof(struct inotify_event_info);
+	struct mem_cgroup *old_memcg;
 
 	if ((inode_mark->mask & FS_EXCL_UNLINK) &&
 	    path && d_unlinked(path->dentry))
@@ -87,9 +88,9 @@ static int inotify_one_event(struct fsnotify_group *group, u32 mask,
 	 * trigger OOM killer in the target monitoring memcg as it may have
 	 * security repercussion.
 	 */
-	memalloc_use_memcg(group->memcg);
+	old_memcg = memalloc_use_memcg(group->memcg);
 	event = kmalloc(alloc_len, GFP_KERNEL_ACCOUNT | __GFP_RETRY_MAYFAIL);
-	memalloc_unuse_memcg();
+	memalloc_use_memcg(old_memcg);
 
 	if (unlikely(!event)) {
 		/*
diff --git a/include/linux/sched/mm.h b/include/linux/sched/mm.h
index f889e332912f..b8fde48d44a9 100644
--- a/include/linux/sched/mm.h
+++ b/include/linux/sched/mm.h
@@ -312,31 +312,21 @@ static inline void memalloc_nocma_restore(unsigned int flags)
  * __GFP_ACCOUNT allocations till the end of the scope will be charged to the
  * given memcg.
  *
- * NOTE: This function is not nesting safe.
+ * NOTE: This function can nest. Users must save the return value and
+ * reset the previous value after their own charging scope is over
  */
-static inline void memalloc_use_memcg(struct mem_cgroup *memcg)
+static inline struct mem_cgroup *
+memalloc_use_memcg(struct mem_cgroup *memcg)
 {
-	WARN_ON_ONCE(current->active_memcg);
+	struct mem_cgroup *old = current->active_memcg;
 	current->active_memcg = memcg;
-}
-
-/**
- * memalloc_unuse_memcg - Ends the remote memcg charging scope.
- *
- * This function marks the end of the remote memcg charging scope started by
- * memalloc_use_memcg().
- */
-static inline void memalloc_unuse_memcg(void)
-{
-	current->active_memcg = NULL;
+	return old;
 }
 #else
-static inline void memalloc_use_memcg(struct mem_cgroup *memcg)
-{
-}
-
-static inline void memalloc_unuse_memcg(void)
+static inline struct mem_cgroup *
+memalloc_use_memcg(struct mem_cgroup *memcg)
 {
+	return NULL;
 }
 #endif
 
diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index b807952b4d43..b2468c80085d 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -5271,12 +5271,12 @@ static struct cgroup_subsys_state * __ref
 mem_cgroup_css_alloc(struct cgroup_subsys_state *parent_css)
 {
 	struct mem_cgroup *parent = mem_cgroup_from_css(parent_css);
-	struct mem_cgroup *memcg;
+	struct mem_cgroup *memcg, *old_memcg;
 	long error = -ENOMEM;
 
-	memalloc_use_memcg(parent);
+	old_memcg = memalloc_use_memcg(parent);
 	memcg = mem_cgroup_alloc();
-	memalloc_unuse_memcg();
+	memalloc_use_memcg(old_memcg);
 	if (IS_ERR(memcg))
 		return ERR_CAST(memcg);
 

From patchwork Fri Aug 21 15:01:07 2020
X-Patchwork-Submitter: Roman Gushchin
X-Patchwork-Id: 262115
From: Roman Gushchin
CC: Alexei Starovoitov, Daniel Borkmann, Johannes Weiner, Shakeel Butt, Roman Gushchin
Subject: [PATCH bpf-next v4 03/30] bpf: memcg-based memory accounting for bpf maps
Date: Fri, 21 Aug 2020 08:01:07 -0700
Message-ID: <20200821150134.2581465-4-guro@fb.com>
In-Reply-To: <20200821150134.2581465-1-guro@fb.com>
References: <20200821150134.2581465-1-guro@fb.com>
X-Mailing-List: netdev@vger.kernel.org

This patch enables memcg-based memory accounting for memory allocated
by __bpf_map_area_alloc(), which is used by most map types for
large allocations.

If a map is updated from an interrupt context, and the update
results in a memory allocation, the memory cgroup can't be determined
from the context of the current process. To address this case, the
bpf map preserves a pointer to the memory cgroup of the process
that created the map. This memory cgroup is charged for allocations
made from interrupt context.
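
The save/restore pattern this relies on can be sketched in plain
userspace C. This is a minimal model under stated assumptions, not
kernel code: struct memcg, active_memcg, use_memcg(), charge(),
update_from_interrupt() and nesting_demo() are hypothetical stand-ins
for struct mem_cgroup, current->active_memcg, memalloc_use_memcg(),
an accounted allocation, and a map update arriving in interrupt
context. The fallback to the task's own memcg in charge() is likewise
a simplification.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for struct mem_cgroup: only counts charged bytes. */
struct memcg {
	size_t charged;
};

/* Models current->active_memcg for a single task. */
static struct memcg *active_memcg;

/* Install a new charging target and hand back the one it replaced. */
static struct memcg *use_memcg(struct memcg *memcg)
{
	struct memcg *old = active_memcg;

	active_memcg = memcg;
	return old;
}

/*
 * Models an accounted allocation: charge the active memcg if one is
 * set, otherwise fall back to the memcg of the running task.
 */
static void charge(struct memcg *task_memcg, size_t size)
{
	struct memcg *target = active_memcg ? active_memcg : task_memcg;

	target->charged += size;
}

/*
 * Models a map update that arrives while the interrupted task is
 * already inside its own charging scope: save, charge, restore.
 */
static void update_from_interrupt(struct memcg *map_memcg,
				  struct memcg *task_memcg, size_t size)
{
	struct memcg *old = use_memcg(map_memcg);

	charge(task_memcg, size);
	use_memcg(old);	/* restore the interrupted scope */
}

/* Returns 1 if every charge landed where the stacking model says. */
static int nesting_demo(void)
{
	struct memcg creator = { 0 }, task = { 0 }, remote = { 0 };
	struct memcg *old;

	old = use_memcg(&remote);			/* outer scope */
	charge(&task, 100);				/* goes to remote */
	update_from_interrupt(&creator, &task, 64);	/* goes to creator */
	charge(&task, 10);				/* remote again */
	use_memcg(old);					/* leave outer scope */
	charge(&task, 1);				/* no scope: task pays */

	return remote.charged == 110 && creator.charged == 64 &&
	       task.charged == 1 && active_memcg == NULL;
}
```

In this model nesting_demo() returns 1: the nested scope charges the
map creator's memcg, and the outer scope is intact afterwards because
the inner user restored the value it had saved.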
Following patches in the series will refine the accounting for
some map types.

Signed-off-by: Roman Gushchin
---
 include/linux/bpf.h  |  4 ++++
 kernel/bpf/helpers.c | 37 ++++++++++++++++++++++++++++++++++++-
 kernel/bpf/syscall.c | 27 ++++++++++++++++++++++++++-
 3 files changed, 66 insertions(+), 2 deletions(-)

diff --git a/include/linux/bpf.h b/include/linux/bpf.h
index a9b7185a6b37..b5f178afde94 100644
--- a/include/linux/bpf.h
+++ b/include/linux/bpf.h
@@ -34,6 +34,7 @@ struct btf_type;
 struct exception_table_entry;
 struct seq_operations;
 struct bpf_iter_aux_info;
+struct mem_cgroup;
 
 extern struct idr btf_idr;
 extern spinlock_t btf_idr_lock;
@@ -138,6 +139,9 @@ struct bpf_map {
 	u32 btf_value_type_id;
 	struct btf *btf;
 	struct bpf_map_memory memory;
+#ifdef CONFIG_MEMCG_KMEM
+	struct mem_cgroup *memcg;
+#endif
 	char name[BPF_OBJ_NAME_LEN];
 	u32 btf_vmlinux_value_type_id;
 	bool bypass_spec_v1;
diff --git a/kernel/bpf/helpers.c b/kernel/bpf/helpers.c
index be43ab3e619f..f8ce7bc7003f 100644
--- a/kernel/bpf/helpers.c
+++ b/kernel/bpf/helpers.c
@@ -14,6 +14,7 @@
 #include
 #include
 #include
+#include
 
 #include "../../lib/kstrtox.h"
 
@@ -41,11 +42,45 @@ const struct bpf_func_proto bpf_map_lookup_elem_proto = {
 	.arg2_type = ARG_PTR_TO_MAP_KEY,
 };
 
+#ifdef CONFIG_MEMCG_KMEM
+static __always_inline int __bpf_map_update_elem(struct bpf_map *map, void *key,
+						 void *value, u64 flags)
+{
+	struct mem_cgroup *old_memcg;
+	bool in_interrupt;
+	int ret;
+
+	/*
+	 * If update from an interrupt context results in a memory allocation,
+	 * the memory cgroup to charge can't be determined from the context
+	 * of the current task. Instead, we charge the memory cgroup, which
+	 * contained a process created the map.
+	 */
+	in_interrupt = in_interrupt();
+	if (in_interrupt)
+		old_memcg = memalloc_use_memcg(map->memcg);
+
+	ret = map->ops->map_update_elem(map, key, value, flags);
+
+	if (in_interrupt)
+		memalloc_use_memcg(old_memcg);
+
+	return ret;
+}
+#else
+static __always_inline int __bpf_map_update_elem(struct bpf_map *map, void *key,
+						 void *value, u64 flags)
+{
+	return map->ops->map_update_elem(map, key, value, flags);
+}
+#endif
+
 BPF_CALL_4(bpf_map_update_elem, struct bpf_map *, map, void *, key,
 	   void *, value, u64, flags)
 {
 	WARN_ON_ONCE(!rcu_read_lock_held());
-	return map->ops->map_update_elem(map, key, value, flags);
+
+	return __bpf_map_update_elem(map, key, value, flags);
 }
 
 const struct bpf_func_proto bpf_map_update_elem_proto = {
diff --git a/kernel/bpf/syscall.c b/kernel/bpf/syscall.c
index 689d736b6904..683614c17a95 100644
--- a/kernel/bpf/syscall.c
+++ b/kernel/bpf/syscall.c
@@ -29,6 +29,7 @@
 #include
 #include
 #include
+#include
 
 #define IS_FD_ARRAY(map) ((map)->map_type == BPF_MAP_TYPE_PERF_EVENT_ARRAY || \
 			  (map)->map_type == BPF_MAP_TYPE_CGROUP_ARRAY || \
@@ -275,7 +276,7 @@ static void *__bpf_map_area_alloc(u64 size, int numa_node, bool mmapable)
 	 * __GFP_RETRY_MAYFAIL to avoid such situations.
 	 */
 
-	const gfp_t gfp = __GFP_NOWARN | __GFP_ZERO;
+	const gfp_t gfp = __GFP_NOWARN | __GFP_ZERO | __GFP_ACCOUNT;
 	unsigned int flags = 0;
 	unsigned long align = 1;
 	void *area;
@@ -452,6 +453,27 @@ void bpf_map_free_id(struct bpf_map *map, bool do_idr_lock)
 		__release(&map_idr_lock);
 }
 
+#ifdef CONFIG_MEMCG_KMEM
+static void bpf_map_save_memcg(struct bpf_map *map)
+{
+	map->memcg = get_mem_cgroup_from_mm(current->mm);
+}
+
+static void bpf_map_release_memcg(struct bpf_map *map)
+{
+	mem_cgroup_put(map->memcg);
+}
+
+#else
+static void bpf_map_save_memcg(struct bpf_map *map)
+{
+}
+
+static void bpf_map_release_memcg(struct bpf_map *map)
+{
+}
+#endif
+
 /* called from workqueue */
 static void bpf_map_free_deferred(struct work_struct *work)
 {
@@ -463,6 +485,7 @@ static void bpf_map_free_deferred(struct work_struct *work)
 	/* implementation dependent freeing */
 	map->ops->map_free(map);
 	bpf_map_charge_finish(&mem);
+	bpf_map_release_memcg(map);
 }
 
 static void bpf_map_put_uref(struct bpf_map *map)
@@ -869,6 +892,8 @@ static int map_create(union bpf_attr *attr)
 	if (err)
 		goto free_map_sec;
 
+	bpf_map_save_memcg(map);
+
 	err = bpf_map_new_fd(map, f_flags);
 	if (err < 0) {
 		/* failed to allocate fd.

From patchwork Fri Aug 21 15:01:32 2020
X-Patchwork-Submitter: Roman Gushchin
X-Patchwork-Id: 262113
From: Roman Gushchin
CC: Alexei Starovoitov, Daniel Borkmann, Johannes Weiner, Shakeel Butt, Roman Gushchin
Subject: [PATCH bpf-next v4 28/30] bpf: eliminate rlimit-based memory accounting infra for bpf maps
Date: Fri, 21 Aug 2020 08:01:32 -0700
Message-ID: <20200821150134.2581465-29-guro@fb.com>
In-Reply-To: <20200821150134.2581465-1-guro@fb.com>
References: <20200821150134.2581465-1-guro@fb.com>
X-Mailing-List: netdev@vger.kernel.org

Remove rlimit-based accounting infrastructure code, which is not used
anymore.

Signed-off-by: Roman Gushchin
---
 include/linux/bpf.h                           | 12 ----
 kernel/bpf/syscall.c                          | 64 +------------------
 .../selftests/bpf/progs/map_ptr_kern.c        |  5 --
 3 files changed, 2 insertions(+), 79 deletions(-)

diff --git a/include/linux/bpf.h b/include/linux/bpf.h
index b5f178afde94..7f81cbb981a6 100644
--- a/include/linux/bpf.h
+++ b/include/linux/bpf.h
@@ -113,11 +113,6 @@ struct bpf_map_ops {
 	const struct bpf_iter_seq_info *iter_seq_info;
 };
 
-struct bpf_map_memory {
-	u32 pages;
-	struct user_struct *user;
-};
-
 struct bpf_map {
 	/* The first two cachelines with read-mostly members of which some
 	 * are also accessed in fast-path (e.g. ops, max_entries).
@@ -138,7 +133,6 @@ struct bpf_map {
 	u32 btf_key_type_id;
 	u32 btf_value_type_id;
 	struct btf *btf;
-	struct bpf_map_memory memory;
 #ifdef CONFIG_MEMCG_KMEM
 	struct mem_cgroup *memcg;
 #endif
@@ -1148,12 +1142,6 @@ void bpf_map_inc_with_uref(struct bpf_map *map);
 struct bpf_map * __must_check bpf_map_inc_not_zero(struct bpf_map *map);
 void bpf_map_put_with_uref(struct bpf_map *map);
 void bpf_map_put(struct bpf_map *map);
-int bpf_map_charge_memlock(struct bpf_map *map, u32 pages);
-void bpf_map_uncharge_memlock(struct bpf_map *map, u32 pages);
-int bpf_map_charge_init(struct bpf_map_memory *mem, u64 size);
-void bpf_map_charge_finish(struct bpf_map_memory *mem);
-void bpf_map_charge_move(struct bpf_map_memory *dst,
-			 struct bpf_map_memory *src);
 void *bpf_map_area_alloc(u64 size, int numa_node);
 void *bpf_map_area_mmapable_alloc(u64 size, int numa_node);
 void bpf_map_area_free(void *base);
diff --git a/kernel/bpf/syscall.c b/kernel/bpf/syscall.c
index 683614c17a95..392e3b2f58e4 100644
--- a/kernel/bpf/syscall.c
+++ b/kernel/bpf/syscall.c
@@ -355,60 +355,6 @@ static void bpf_uncharge_memlock(struct user_struct *user, u32 pages)
 		atomic_long_sub(pages, &user->locked_vm);
 }
 
-int bpf_map_charge_init(struct bpf_map_memory *mem, u64 size)
-{
-	u32 pages = round_up(size, PAGE_SIZE) >> PAGE_SHIFT;
-	struct user_struct *user;
-	int ret;
-
-	if (size >= U32_MAX - PAGE_SIZE)
-		return -E2BIG;
-
-	user = get_current_user();
-	ret = bpf_charge_memlock(user, pages);
-	if (ret) {
-		free_uid(user);
-		return ret;
-	}
-
-	mem->pages = pages;
-	mem->user = user;
-
-	return 0;
-}
-
-void bpf_map_charge_finish(struct bpf_map_memory *mem)
-{
-	bpf_uncharge_memlock(mem->user, mem->pages);
-	free_uid(mem->user);
-}
-
-void bpf_map_charge_move(struct bpf_map_memory *dst,
-			 struct bpf_map_memory *src)
-{
-	*dst = *src;
-
-	/* Make sure src will not be used for the redundant uncharging. */
-	memset(src, 0, sizeof(struct bpf_map_memory));
-}
-
-int bpf_map_charge_memlock(struct bpf_map *map, u32 pages)
-{
-	int ret;
-
-	ret = bpf_charge_memlock(map->memory.user, pages);
-	if (ret)
-		return ret;
-	map->memory.pages += pages;
-	return ret;
-}
-
-void bpf_map_uncharge_memlock(struct bpf_map *map, u32 pages)
-{
-	bpf_uncharge_memlock(map->memory.user, pages);
-	map->memory.pages -= pages;
-}
-
 static int bpf_map_alloc_id(struct bpf_map *map)
 {
 	int id;
@@ -478,13 +424,10 @@ static void bpf_map_release_memcg(struct bpf_map *map)
 static void bpf_map_free_deferred(struct work_struct *work)
 {
 	struct bpf_map *map = container_of(work, struct bpf_map, work);
-	struct bpf_map_memory mem;
 
-	bpf_map_charge_move(&mem, &map->memory);
 	security_bpf_map_free(map);
 	/* implementation dependent freeing */
 	map->ops->map_free(map);
-	bpf_map_charge_finish(&mem);
 	bpf_map_release_memcg(map);
 }
 
@@ -564,7 +507,7 @@ static void bpf_map_show_fdinfo(struct seq_file *m, struct file *filp)
 		   "value_size:\t%u\n"
 		   "max_entries:\t%u\n"
 		   "map_flags:\t%#x\n"
-		   "memlock:\t%llu\n"
+		   "memlock:\t%llu\n" /* deprecated */
 		   "map_id:\t%u\n"
 		   "frozen:\t%u\n",
 		   map->map_type,
@@ -572,7 +515,7 @@ static void bpf_map_show_fdinfo(struct seq_file *m, struct file *filp)
 		   map->value_size,
 		   map->max_entries,
 		   map->map_flags,
-		   map->memory.pages * 1ULL << PAGE_SHIFT,
+		   0LLU,
 		   map->id,
 		   READ_ONCE(map->frozen));
 	if (type) {
@@ -813,7 +756,6 @@ static int map_check_btf(struct bpf_map *map, const struct btf *btf,
 static int map_create(union bpf_attr *attr)
 {
 	int numa_node = bpf_map_attr_numa_node(attr);
-	struct bpf_map_memory mem;
 	struct bpf_map *map;
 	int f_flags;
 	int err;
@@ -912,9 +854,7 @@ static int map_create(union bpf_attr *attr)
 	security_bpf_map_free(map);
 free_map:
 	btf_put(map->btf);
-	bpf_map_charge_move(&mem, &map->memory);
 	map->ops->map_free(map);
-	bpf_map_charge_finish(&mem);
 	return err;
 }
 
diff --git a/tools/testing/selftests/bpf/progs/map_ptr_kern.c b/tools/testing/selftests/bpf/progs/map_ptr_kern.c
index 473665cac67e..49d1dcaf7999 100644
--- a/tools/testing/selftests/bpf/progs/map_ptr_kern.c
+++ b/tools/testing/selftests/bpf/progs/map_ptr_kern.c
@@ -26,17 +26,12 @@ __u32 g_line = 0;
 	return 0; \
 })
 
-struct bpf_map_memory {
-	__u32 pages;
-} __attribute__((preserve_access_index));
-
 struct bpf_map {
 	enum bpf_map_type map_type;
 	__u32 key_size;
 	__u32 value_size;
 	__u32 max_entries;
 	__u32 id;
-	struct bpf_map_memory memory;
 } __attribute__((preserve_access_index));
 
 static inline int check_bpf_map_fields(struct bpf_map *map, __u32 key_size,