From patchwork Sat Jan 15 01:05:59 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: Hridya Valsaraju X-Patchwork-Id: 532384 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id C91D2C433FE for ; Sat, 15 Jan 2022 01:07:24 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S231777AbiAOBHY (ORCPT ); Fri, 14 Jan 2022 20:07:24 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:54854 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S229534AbiAOBHX (ORCPT ); Fri, 14 Jan 2022 20:07:23 -0500 Received: from mail-yb1-xb4a.google.com (mail-yb1-xb4a.google.com [IPv6:2607:f8b0:4864:20::b4a]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 42905C06173E for ; Fri, 14 Jan 2022 17:07:23 -0800 (PST) Received: by mail-yb1-xb4a.google.com with SMTP id h2-20020a5b0a82000000b0061192499188so21621042ybq.9 for ; Fri, 14 Jan 2022 17:07:23 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20210112; h=date:in-reply-to:message-id:mime-version:references:subject:from:to :cc:content-transfer-encoding; bh=VVL93OjDpl50EsTf01Y02AFzpDgsoiwDg8IFEaHEx7o=; b=SeDudgjZy+YJyTXR9JLoFhR5vDMcIwZaaa8yQTqwNo0eCjpi1e+Z6fuk5CVH0j8T6Q yftZP1F1lLghLhdqxp/d2Hi9b3XVn1g1t87rKY7jFrOaC36QUheGeR1xOTUz/2bLyPhV 62QLjtSW0fnKArrUMAigsH+EedrTR+GBYAk7odkg7PsOu9hJD19pSdQygjNHP69Oxkqd AAUXclJGnEsBrXj622qDr25tJi+BZW4ce4s3tyPfO7BAn5XNcrCyZAECVNeP7XHBD3HS ZHn3F7wLmnaUtrfd6x1VO2YBDfTcTt+rETEo/hAE820ttJA+H9Wi63ki5ATMsl4pmovf PDfg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=x-gm-message-state:date:in-reply-to:message-id:mime-version :references:subject:from:to:cc:content-transfer-encoding; 
bh=VVL93OjDpl50EsTf01Y02AFzpDgsoiwDg8IFEaHEx7o=; b=duPdqty45lNXOuFuAMLVpxD5donzL0OgrhSaPrj+jSP1AV/gAk1rMsE/wkBgpkE2GP NKLfhrHQt+lglODkbeKSuLy2Zlg7rtO8qNqQibLLhyi0oN20QCala6CB5nJl8VpUM4wu OBI/rdg3M4WnoF0Dva14JOJSyqpFfAlYSKOMwtSdHwzi/v+d0BqjIY4Aj6BjtOvcMBru tZBwDdLebVX+LqxetWtHUAAPh4tKk6njw2ogI3tlW3+65A4h4+N4spFhYGsth0G/+7La sVI2u5v3FHjfKN16RiLni+z09udsqCkSHUJz0jVlMWjtY93MhaEjw0qNp1YThbYeiIcD QYjg== X-Gm-Message-State: AOAM530rSQlyvHH9916NGZUHAH0daJ/YoLxsUN6m6myNRvLTb7dzF/b0 U91Kaxrqiasc1EsbsI9EPAdXpfQRQcI= X-Google-Smtp-Source: ABdhPJzyM7cNAy3s+OktTpih4qEUrIx/QPHtlR7VJWxQbBa5BvlJwh/TEx9gX5GKJVVbkPqjsVs2iQfq0sE= X-Received: from hridya.mtv.corp.google.com ([2620:15c:211:200:5860:362a:3112:9d85]) (user=hridya job=sendgmr) by 2002:a25:bf82:: with SMTP id l2mr16594693ybk.356.1642208842425; Fri, 14 Jan 2022 17:07:22 -0800 (PST) Date: Fri, 14 Jan 2022 17:05:59 -0800 In-Reply-To: <20220115010622.3185921-1-hridya@google.com> Message-Id: <20220115010622.3185921-2-hridya@google.com> Mime-Version: 1.0 References: <20220115010622.3185921-1-hridya@google.com> X-Mailer: git-send-email 2.34.1.703.g22d0c6ccf7-goog Subject: [RFC 1/6] gpu: rfc: Proposal for a GPU cgroup controller From: Hridya Valsaraju To: David Airlie , Daniel Vetter , Maarten Lankhorst , Maxime Ripard , Thomas Zimmermann , Jonathan Corbet , Greg Kroah-Hartman , " =?utf-8?q?Arve_Hj=C3=B8n?= =?utf-8?q?nev=C3=A5g?= " , Todd Kjos , Martijn Coenen , Joel Fernandes , Christian Brauner , Hridya Valsaraju , Suren Baghdasaryan , Sumit Semwal , Benjamin Gaignard , Liam Mark , Laura Abbott , Brian Starkey , John Stultz , " =?utf-8?q?Christian_K=C3=B6nig?= " , Tejun Heo , Zefan Li , Johannes Weiner , Dave Airlie , Kenneth Graunke , Simon Ser , Jason Ekstrand , Matthew Auld , Matthew Brost , Li Li , Marco Ballesio , Finn Behrens , Hang Lu , Wedson Almeida Filho , Masahiro Yamada , Andrew Morton , Nathan Chancellor , Kees Cook , Nick Desaulniers , Miguel Ojeda , Vipin Sharma , Chris Down , Daniel Borkmann , Vlastimil Babka , 
 Arnd Bergmann , dri-devel@lists.freedesktop.org, linux-doc@vger.kernel.org, linux-kernel@vger.kernel.org, linux-media@vger.kernel.org, linaro-mm-sig@lists.linaro.org, cgroups@vger.kernel.org
Cc: Kenny.Ho@amd.com, daniels@collabora.com, kaleshsingh@google.com, tjmercier@google.com
Precedence: bulk
List-ID:
X-Mailing-List: linux-media@vger.kernel.org

This patch adds a proposal for a new GPU cgroup controller for
accounting/limiting GPU and GPU-related memory allocations.
The proposed controller is based on the DRM cgroup controller[1] and
follows the design of the RDMA cgroup controller.

The new cgroup controller would:
* Allow setting per-cgroup limits on the total size of buffers charged
  to it.
* Allow setting per-device limits on the total size of buffers
  allocated by a device within a cgroup.
* Expose a per-device/allocator breakdown of the buffers charged to a
  cgroup.

The prototype in the following patches is only for memory accounting
using the GPU cgroup controller and does not implement limit setting.

[1]: https://lore.kernel.org/amd-gfx/20210126214626.16260-1-brian.welty@intel.com/

Signed-off-by: Hridya Valsaraju
---

Hi all,

Here is the RFC documentation for the GPU cgroup controller that we
talked about at LPC 2021 along with a prototype. I reached out to Tejun
with the idea recently and he mentioned that cgroup-aware BPF (by Kenny
Ho) or the new misc cgroup controller can also be considered as
alternatives to track GPU resources. I am sending the RFC to the list to
give everyone else a chance to chime in with their thoughts as well so
that we can reach an agreement on how to proceed.

Thanks in advance!
Regards,
Hridya

 Documentation/gpu/rfc/gpu-cgroup.rst | 192 +++++++++++++++++++++++++++
 Documentation/gpu/rfc/index.rst      |   4 +
 2 files changed, 196 insertions(+)
 create mode 100644 Documentation/gpu/rfc/gpu-cgroup.rst

diff --git a/Documentation/gpu/rfc/gpu-cgroup.rst b/Documentation/gpu/rfc/gpu-cgroup.rst
new file mode 100644
index 000000000000..9bff23007b22
--- /dev/null
+++ b/Documentation/gpu/rfc/gpu-cgroup.rst
@@ -0,0 +1,192 @@
+===================================
+GPU cgroup controller
+===================================
+
+Goals
+=====
+This document intends to outline a plan to create a cgroup v2 controller subsystem
+for the per-cgroup accounting of device and system memory allocated by the GPU
+and related subsystems.
+
+The new cgroup controller would:
+
+* Allow setting per-cgroup limits on the total size of buffers charged to it.
+
+* Allow setting per-device limits on the total size of buffers allocated by a
+  device/allocator within a cgroup.
+
+* Expose a per-device/allocator breakdown of the buffers charged to a cgroup.
+
+Alternatives Considered
+=======================
+
+The following alternatives were considered:
+
+The memory cgroup controller
+____________________________
+
+1. As was noted in [1], memory accounting provided by the GPU cgroup
+controller is not a good fit for integration into memcg due to the
+differences in how accounting is performed. It implements a mechanism
+for the allocator attribution of GPU and GPU-related memory by
+charging each buffer to the cgroup of the process on behalf of which
+the memory was allocated. The buffer stays charged to the cgroup until
+it is freed regardless of whether the process retains any references
+to it. On the other hand, the memory cgroup controller offers a more
+fine-grained charging and uncharging behavior depending on the kind of
+page being accounted.
+
+2. Memcg performs accounting in units of pages. In the DMA-BUF buffer sharing model,
+a process takes a reference to the entire buffer (hence keeping it alive) even if
+it is only accessing parts of it. Therefore, per-page memory tracking for DMA-BUF
+memory accounting would only introduce additional overhead without any benefits.
+
+[1]: https://patchwork.kernel.org/project/dri-devel/cover/20190501140438.9506-1-brian.welty@intel.com/#22624705
+
+Userspace service to keep track of buffer allocations and releases
+__________________________________________________________________
+
+1. There is no way for a userspace service to intercept all allocations and releases.
+2. In case the process gets killed or restarted, we lose all accounting so far.
+
+UAPI
+====
+When enabled, the new cgroup controller would create the following files in every cgroup.
+
+::
+
+        gpu.memory.current (R)
+        gpu.memory.max (R/W)
+
+gpu.memory.current is a read-only file and would contain per-device memory allocations
+in a key-value format where key is a string representing the device name
+and the value is the size of memory charged to the device in the cgroup in bytes.
+
+For example:
+
+::
+
+        cat /sys/kernel/fs/cgroup1/gpu.memory.current
+        dev1 4194304
+        dev2 4194304
+
+The string key for each device is set by the device driver when the device registers
+with the GPU cgroup controller to participate in resource accounting (see section
+'Design and Implementation' for more details).
+
+gpu.memory.max is a read/write file. It would show the current total
+size limits on memory usage for the cgroup and the limits on total memory usage
+for each allocator/device.
+
+Setting a total limit for a cgroup can be done as follows:
+
+::
+
+        echo "total 41943040" > /sys/kernel/fs/cgroup1/gpu.memory.max
+
+Setting a total limit for a particular device/allocator can be done as follows:
+
+::
+
+        echo "dev1 4194304" > /sys/kernel/fs/cgroup1/gpu.memory.max
+
+In this example, 'dev1' is the string key set by the device driver during
+registration.
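Editorial sketch, not part of the patch: the key-value format described above for gpu.memory.current (a device name, a space, then a size in bytes) could be consumed from userspace roughly as follows. The helper name `parse_gpu_memory_line` is hypothetical, and the caller-supplied name buffer is an assumption, since the RFC does not define a maximum device-name length.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Parse one "<device-name> <bytes>" line as shown for gpu.memory.current.
 * Hypothetical helper for illustration only; not part of the proposed UAPI.
 * Returns 0 on success, -1 if the line does not match the expected format
 * or the device name does not fit in @name.
 */
static int parse_gpu_memory_line(const char *line, char *name, size_t name_len,
				 unsigned long long *bytes)
{
	const char *sep;
	size_t key_len;
	int consumed = 0;

	assert(line && name && bytes && name_len > 0);

	/* The key ends at the first space; the value follows it. */
	sep = strchr(line, ' ');
	if (!sep || sep == line)
		return -1;

	key_len = (size_t)(sep - line);
	if (key_len >= name_len)
		return -1;

	/* The value must start with a digit (no sign, no extra spaces). */
	if (sep[1] < '0' || sep[1] > '9')
		return -1;

	/* %n records how many characters the numeric field consumed. */
	if (sscanf(sep + 1, "%llu%n", bytes, &consumed) != 1)
		return -1;

	/* Only an optional trailing newline may follow the number. */
	if (sep[1 + consumed] != '\0' && sep[1 + consumed] != '\n')
		return -1;

	memcpy(name, line, key_len);
	name[key_len] = '\0';
	return 0;
}
```

A caller would typically read the file line by line, for example with fgets(), and pass each line to the helper. Rejecting anything that is not exactly one name and one unsigned number keeps a malformed or future-extended line from being silently misread.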
+
+Design and Implementation
+=========================
+
+The cgroup controller would closely follow the design of the RDMA cgroup controller
+subsystem where each cgroup maintains a list of resource pools.
+Each resource pool contains a pointer to a struct gpucg_device and a counter that
+tracks the current total usage and the maximum limit set for the device.
+
+The code block below is a preliminary sketch of what the core kernel data structures
+and APIs would look like.
+
+.. code-block:: c
+
+    /**
+     * The GPU cgroup controller data structure.
+     */
+    struct gpucg {
+            struct cgroup_subsys_state css;
+            /* list of all resource pools that belong to this cgroup */
+            struct list_head rpools;
+    };
+
+    struct gpucg_device {
+            /*
+             * list of various resource pools in various cgroups that the device is
+             * part of.
+             */
+            struct list_head rpools;
+            /* list of all devices registered for GPU cgroup accounting */
+            struct list_head dev_node;
+            /* name to be used as identifier for accounting and limit setting */
+            const char *name;
+    };
+
+    struct gpucg_resource_pool {
+            /* The device whose resource usage is tracked by this resource pool */
+            struct gpucg_device *device;
+
+            /* list of all resource pools for the cgroup */
+            struct list_head cg_node;
+
+            /*
+             * list maintained by the gpucg_device to keep track of its
+             * resource pools
+             */
+            struct list_head dev_node;
+
+            /* tracks memory usage of the resource pool */
+            struct page_counter total;
+    };
+
+    /**
+     * gpucg_register_device - Registers a device for memory accounting using the
+     * GPU cgroup controller.
+     *
+     * @gpucg_dev: The device to register for memory accounting. Must remain valid
+     * after registration.
+     * @name: Pointer to a string literal to denote the name of the device.
+     */
+    void gpucg_register_device(struct gpucg_device *gpucg_dev, const char *name);
+
+    /**
+     * gpucg_try_charge - charge memory to the specified gpucg and gpucg_device.
+     *
+     * @gpucg: The gpu cgroup to charge the memory to.
+     * @device: The device to charge the memory to.
+     * @usage: size of memory to charge in bytes.
+     *
+     * Return: returns 0 if the charging is successful and otherwise returns an
+     * error code.
+     */
+    int gpucg_try_charge(struct gpucg *gpucg, struct gpucg_device *device, u64 usage);
+
+    /**
+     * gpucg_uncharge - uncharge memory from the specified gpucg and gpucg_device.
+     *
+     * @gpucg: The gpu cgroup to uncharge the memory from.
+     * @device: The device to uncharge the memory from.
+     * @usage: size of memory to uncharge in bytes.
+     */
+    void gpucg_uncharge(struct gpucg *gpucg, struct gpucg_device *device, u64 usage);
+
+Future Work
+===========
+Additional GPU resources can be supported by adding new controller files.
+
+Upstreaming Plan
+================
+* Decide on a UAPI that accommodates all use-cases for the upstream GPU ecosystem
+  as well as for Android.
+
+* Prototype the GPU cgroup controller and integrate its usage into the DMA-BUF
+  system heap.
+
+* Demonstrate its usage from userspace in the Android Open Source Project.
+
+* Send out RFCs to LKML for the GPU cgroup controller and iterate.
diff --git a/Documentation/gpu/rfc/index.rst b/Documentation/gpu/rfc/index.rst
index 91e93a705230..0a9bcd94e95d 100644
--- a/Documentation/gpu/rfc/index.rst
+++ b/Documentation/gpu/rfc/index.rst
@@ -23,3 +23,7 @@ host such documentation:
 .. toctree::
 
     i915_scheduler.rst
+
+.. toctree::
+
+    gpu-cgroup.rst

From patchwork Sat Jan 15 01:06:00 2022
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Hridya Valsaraju
X-Patchwork-Id: 532771
Date: Fri, 14 Jan 2022 17:06:00 -0800
In-Reply-To: <20220115010622.3185921-1-hridya@google.com>
Message-Id: <20220115010622.3185921-3-hridya@google.com>
References: <20220115010622.3185921-1-hridya@google.com>
X-Mailer: git-send-email 2.34.1.703.g22d0c6ccf7-goog
Subject: [RFC 2/6] cgroup: gpu: Add a cgroup controller for allocator attribution of GPU memory
From: Hridya Valsaraju
To: Maarten Lankhorst , Maxime Ripard , Thomas Zimmermann , David Airlie , Daniel Vetter , Jonathan Corbet , Greg Kroah-Hartman , Arve Hjønnevåg , Todd Kjos , Martijn Coenen , Joel Fernandes , Christian Brauner , Hridya Valsaraju , Suren Baghdasaryan , Sumit Semwal , Benjamin Gaignard , Liam Mark , Laura Abbott , Brian Starkey , John Stultz , Christian König , Tejun Heo , Zefan Li , Johannes Weiner , Dave Airlie , Rodrigo Vivi , Matthew Auld , Matthew Brost , Li Li , Marco Ballesio , Hang Lu , Wedson Almeida Filho , Masahiro Yamada , Andrew Morton , Nathan Chancellor , Kees Cook , Nick Desaulniers , Miguel Ojeda , Vipin Sharma , Chris Down , Daniel Borkmann , Vlastimil Babka , Arnd Bergmann , dri-devel@lists.freedesktop.org,
 linux-doc@vger.kernel.org, linux-kernel@vger.kernel.org, linux-media@vger.kernel.org, linaro-mm-sig@lists.linaro.org, cgroups@vger.kernel.org
Cc: Kenny.Ho@amd.com, daniels@collabora.com, kaleshsingh@google.com, tjmercier@google.com
Precedence: bulk
List-ID:
X-Mailing-List: linux-media@vger.kernel.org

The cgroup controller provides accounting for GPU and GPU-related
memory allocations. The memory being accounted can be device memory or
memory allocated from pools dedicated to serve GPU-related tasks.

This patch adds APIs to:
- allow a device to register for memory accounting using the GPU cgroup
  controller.
- charge and uncharge allocated memory to a cgroup.

When the cgroup controller is enabled, it would expose information about
the memory allocated by each device (registered for GPU cgroup memory
accounting) for each cgroup.

The API/UAPI can be extended to set per-device/total allocation limits
in the future.

The cgroup controller has been named following the discussion in [1].

[1]: https://lore.kernel.org/amd-gfx/YCJp%2F%2FkMC7YjVMXv@phenom.ffwll.local/

Signed-off-by: Hridya Valsaraju
---
 include/linux/cgroup_gpu.h    | 120 +++++++++++++
 include/linux/cgroup_subsys.h |   4 +
 init/Kconfig                  |   7 +
 kernel/cgroup/Makefile        |   1 +
 kernel/cgroup/gpu.c           | 305 ++++++++++++++++++++++++++++++++++
 5 files changed, 437 insertions(+)
 create mode 100644 include/linux/cgroup_gpu.h
 create mode 100644 kernel/cgroup/gpu.c

diff --git a/include/linux/cgroup_gpu.h b/include/linux/cgroup_gpu.h
new file mode 100644
index 000000000000..0ac303ce6179
--- /dev/null
+++ b/include/linux/cgroup_gpu.h
@@ -0,0 +1,120 @@
+/* SPDX-License-Identifier: MIT
+ * Copyright 2019 Advanced Micro Devices, Inc.
+ * Copyright (C) 2022 Google LLC.
+ */
+#ifndef _CGROUP_GPU_H
+#define _CGROUP_GPU_H
+
+#include
+#include
+
+#ifdef CONFIG_CGROUP_GPU
+/* The GPU cgroup controller data structure */
+struct gpucg {
+	struct cgroup_subsys_state css;
+	/* list of all resource pools that belong to this cgroup */
+	struct list_head rpools;
+};
+
+struct gpucg_device {
+	/*
+	 * list of various resource pools in various cgroups that the device is
+	 * part of.
+	 */
+	struct list_head rpools;
+	/* list of all devices registered for GPU cgroup accounting */
+	struct list_head dev_node;
+	/*
+	 * pointer to string literal to be used as identifier for accounting and
+	 * limit setting
+	 */
+	const char *name;
+};
+
+/**
+ * css_to_gpucg - get the corresponding gpucg ref from a cgroup_subsys_state
+ * @css: the target cgroup_subsys_state
+ *
+ * Returns: gpu cgroup that contains the @css
+ */
+static inline struct gpucg *css_to_gpucg(struct cgroup_subsys_state *css)
+{
+	return css ? container_of(css, struct gpucg, css) : NULL;
+}
+
+/**
+ * gpucg_get - get the gpucg reference that a task belongs to
+ * @task: the target task
+ *
+ * This increases the reference count of the css that the @task belongs to.
+ *
+ * Returns: reference to the gpu cgroup the task belongs to.
+ */
+static inline struct gpucg *gpucg_get(struct task_struct *task)
+{
+	if (!cgroup_subsys_enabled(gpu_cgrp_subsys))
+		return NULL;
+	return css_to_gpucg(task_get_css(task, gpu_cgrp_id));
+}
+
+/**
+ * gpucg_put - put a gpucg reference
+ * @gpucg: the target gpucg
+ *
+ * Put a reference obtained via gpucg_get
+ */
+static inline void gpucg_put(struct gpucg *gpucg)
+{
+	if (gpucg)
+		css_put(&gpucg->css);
+}
+
+/**
+ * gpucg_parent - find the parent of a gpu cgroup
+ * @cg: the target gpucg
+ *
+ * This does not increase the reference count of the parent cgroup
+ *
+ * Returns: parent gpu cgroup of @cg
+ */
+static inline struct gpucg *gpucg_parent(struct gpucg *cg)
+{
+	return css_to_gpucg(cg->css.parent);
+}
+
+int gpucg_try_charge(struct gpucg *gpucg, struct gpucg_device *device, u64 usage);
+void gpucg_uncharge(struct gpucg *gpucg, struct gpucg_device *device, u64 usage);
+void gpucg_register_device(struct gpucg_device *gpucg_dev, const char *name);
+#else /* CONFIG_CGROUP_GPU */
+
+struct gpucg;
+struct gpucg_device;
+
+static inline struct gpucg *css_to_gpucg(struct cgroup_subsys_state *css)
+{
+	return NULL;
+}
+
+static inline struct gpucg *gpucg_get(struct task_struct *task)
+{
+	return NULL;
+}
+
+static inline void gpucg_put(struct gpucg *gpucg) {}
+
+static inline struct gpucg *gpucg_parent(struct gpucg *cg)
+{
+	return NULL;
+}
+static inline int gpucg_try_charge(struct gpucg *gpucg, struct gpucg_device *device,
+				   u64 usage)
+{
+	return 0;
+}
+
+static inline void gpucg_uncharge(struct gpucg *gpucg, struct gpucg_device *device,
+				  u64 usage) {}
+static inline void gpucg_register_device(struct gpucg_device *gpucg_dev,
+					 const char *name) {}
+#endif /* CONFIG_CGROUP_GPU */
+#endif /* _CGROUP_GPU_H */
diff --git a/include/linux/cgroup_subsys.h b/include/linux/cgroup_subsys.h
index 445235487230..46a2a7b93c41 100644
--- a/include/linux/cgroup_subsys.h
+++ b/include/linux/cgroup_subsys.h
@@ -65,6 +65,10 @@ SUBSYS(rdma)
 SUBSYS(misc)
 #endif
 
+#if IS_ENABLED(CONFIG_CGROUP_GPU)
+SUBSYS(gpu)
+#endif
+
 /*
  * The following subsystems are not supported on the default hierarchy.
  */
diff --git a/init/Kconfig b/init/Kconfig
index cd23faa163d1..408910b21387 100644
--- a/init/Kconfig
+++ b/init/Kconfig
@@ -990,6 +990,13 @@ config BLK_CGROUP
 	See Documentation/admin-guide/cgroup-v1/blkio-controller.rst for
 	more information.
 
+config CGROUP_GPU
+	bool "gpu cgroup controller (EXPERIMENTAL)"
+	select PAGE_COUNTER
+	help
+	  Provides accounting and limit setting for memory allocations by the GPU
+	  and GPU-related subsystems.
+
 config CGROUP_WRITEBACK
 	bool
 	depends on MEMCG && BLK_CGROUP
diff --git a/kernel/cgroup/Makefile b/kernel/cgroup/Makefile
index 12f8457ad1f9..be95a5a532fc 100644
--- a/kernel/cgroup/Makefile
+++ b/kernel/cgroup/Makefile
@@ -7,3 +7,4 @@ obj-$(CONFIG_CGROUP_RDMA) += rdma.o
 obj-$(CONFIG_CPUSETS) += cpuset.o
 obj-$(CONFIG_CGROUP_MISC) += misc.o
 obj-$(CONFIG_CGROUP_DEBUG) += debug.o
+obj-$(CONFIG_CGROUP_GPU) += gpu.o
diff --git a/kernel/cgroup/gpu.c b/kernel/cgroup/gpu.c
new file mode 100644
index 000000000000..b171fae06b0d
--- /dev/null
+++ b/kernel/cgroup/gpu.c
@@ -0,0 +1,305 @@
+// SPDX-License-Identifier: MIT
+// Copyright 2019 Advanced Micro Devices, Inc.
+// Copyright (C) 2022 Google LLC.
+
+#include
+#include
+#include
+#include
+#include
+
+static struct gpucg *root_gpucg __read_mostly;
+
+/*
+ * Protects list of resource pools maintained on per cgroup basis
+ * and list of devices registered for memory accounting using the GPU cgroup
+ * controller.
+ */
+static DEFINE_MUTEX(gpucg_mutex);
+static LIST_HEAD(gpucg_devices);
+
+struct gpucg_resource_pool {
+	/* The device whose resource usage is tracked by this resource pool */
+	struct gpucg_device *device;
+
+	/* list of all resource pools for the cgroup */
+	struct list_head cg_node;
+
+	/*
+	 * list maintained by the gpucg_device to keep track of its
+	 * resource pools
+	 */
+	struct list_head dev_node;
+
+	/* tracks memory usage of the resource pool */
+	struct page_counter total;
+};
+
+static void free_cg_rpool_locked(struct gpucg_resource_pool *rpool)
+{
+	lockdep_assert_held(&gpucg_mutex);
+
+	list_del(&rpool->cg_node);
+	list_del(&rpool->dev_node);
+	kfree(rpool);
+}
+
+static void gpucg_css_free(struct cgroup_subsys_state *css)
+{
+	struct gpucg_resource_pool *rpool, *tmp;
+	struct gpucg *gpucg = css_to_gpucg(css);
+
+	// delete all resource pools
+	mutex_lock(&gpucg_mutex);
+	list_for_each_entry_safe(rpool, tmp, &gpucg->rpools, cg_node)
+		free_cg_rpool_locked(rpool);
+	mutex_unlock(&gpucg_mutex);
+
+	kfree(gpucg);
+}
+
+static struct cgroup_subsys_state *
+gpucg_css_alloc(struct cgroup_subsys_state *parent_css)
+{
+	struct gpucg *gpucg, *parent;
+
+	gpucg = kzalloc(sizeof(struct gpucg), GFP_KERNEL);
+	if (!gpucg)
+		return ERR_PTR(-ENOMEM);
+
+	parent = css_to_gpucg(parent_css);
+	if (!parent)
+		root_gpucg = gpucg;
+
+	INIT_LIST_HEAD(&gpucg->rpools);
+
+	return &gpucg->css;
+}
+
+static struct gpucg_resource_pool *find_cg_rpool_locked(struct gpucg *cg,
+							struct gpucg_device *device)
+
+{
+	struct gpucg_resource_pool *pool;
+
+	lockdep_assert_held(&gpucg_mutex);
+
+	list_for_each_entry(pool, &cg->rpools, cg_node)
+		if (pool->device == device)
+			return pool;
+
+	return NULL;
+}
+
+static struct gpucg_resource_pool *init_cg_rpool(struct gpucg *cg,
+						 struct gpucg_device *device)
+{
+	struct gpucg_resource_pool *rpool = kzalloc(sizeof(*rpool),
+						    GFP_KERNEL);
+	if (!rpool)
+		return ERR_PTR(-ENOMEM);
+
+	rpool->device = device;
+
+	page_counter_init(&rpool->total, NULL);
+	INIT_LIST_HEAD(&rpool->cg_node);
+	INIT_LIST_HEAD(&rpool->dev_node);
+	list_add_tail(&rpool->cg_node, &cg->rpools);
+	list_add_tail(&rpool->dev_node, &device->rpools);
+
+	return rpool;
+}
+
+/**
+ * get_cg_rpool_locked - find the resource pool for the specified device and
+ * specified cgroup. If the resource pool does not exist for the cg, it is created
+ * in a hierarchical manner in the cgroup and its ancestor cgroups that do not
+ * already have a resource pool entry for the device.
+ *
+ * @cg: The cgroup to find the resource pool for.
+ * @device: The device associated with the returned resource pool.
+ *
+ * Return: return resource pool entry corresponding to the specified device in
+ * the specified cgroup (hierarchically creating them if not existing already).
+ *
+ */
+static struct gpucg_resource_pool *
+get_cg_rpool_locked(struct gpucg *cg, struct gpucg_device *device)
+{
+	struct gpucg *parent_cg, *p, *stop_cg;
+	struct gpucg_resource_pool *rpool, *tmp_rpool;
+	struct gpucg_resource_pool *parent_rpool = NULL, *leaf_rpool = NULL;
+
+	rpool = find_cg_rpool_locked(cg, device);
+	if (rpool)
+		return rpool;
+
+	stop_cg = cg;
+	do {
+		rpool = init_cg_rpool(stop_cg, device);
+		if (IS_ERR(rpool))
+			goto err;
+
+		if (!leaf_rpool)
+			leaf_rpool = rpool;
+
+		stop_cg = gpucg_parent(stop_cg);
+		if (!stop_cg)
+			break;
+
+		rpool = find_cg_rpool_locked(stop_cg, device);
+	} while (!rpool);
+
+	/*
+	 * Re-initialize page counters of all rpools created in this invocation to
+	 * enable hierarchical charging.
+	 * stop_cg is the first ancestor cg that already had a resource pool for
+	 * the device. It can also be NULL if no ancestors had a pre-existing
+	 * resource pool for the device before this invocation.
+	 */
+	rpool = leaf_rpool;
+	for (p = cg; p != stop_cg; p = parent_cg) {
+		parent_cg = gpucg_parent(p);
+		if (!parent_cg)
+			break;
+		parent_rpool = find_cg_rpool_locked(parent_cg, device);
+		page_counter_init(&rpool->total, &parent_rpool->total);
+
+		rpool = parent_rpool;
+	}
+
+	return leaf_rpool;
+err:
+	for (p = cg; p != stop_cg; p = gpucg_parent(p)) {
+		tmp_rpool = find_cg_rpool_locked(p, device);
+		free_cg_rpool_locked(tmp_rpool);
+	}
+	return rpool;
+}
+
+/**
+ * gpucg_try_charge - charge memory to the specified gpucg and gpucg_device.
+ * Caller must hold a reference to @gpucg obtained through gpucg_get(). The size
+ * of the memory is rounded up to be a multiple of the page size.
+ *
+ * @gpucg: The gpu cgroup to charge the memory to.
+ * @device: The device to charge the memory to.
+ * @usage: size of memory to charge in bytes.
+ *
+ * Return: returns 0 if the charging is successful and otherwise returns an
+ * error code.
+ */
+int gpucg_try_charge(struct gpucg *gpucg, struct gpucg_device *device, u64 usage)
+{
+	struct page_counter *counter;
+	u64 nr_pages;
+	struct gpucg_resource_pool *rp;
+	int ret = 0;
+
+	mutex_lock(&gpucg_mutex);
+	rp = get_cg_rpool_locked(gpucg, device);
+	/*
+	 * gpucg_mutex can be unlocked here, rp will stay valid until gpucg is
+	 * freed and the caller is holding a reference to the gpucg.
+	 */
+	mutex_unlock(&gpucg_mutex);
+
+	if (IS_ERR(rp))
+		return PTR_ERR(rp);
+
+	nr_pages = PAGE_ALIGN(usage) >> PAGE_SHIFT;
+	if (page_counter_try_charge(&rp->total, nr_pages,
+				    &counter))
+		css_get_many(&gpucg->css, nr_pages);
+	else
+		ret = -ENOMEM;
+
+	return ret;
+}
+
+/**
+ * gpucg_uncharge - uncharge memory from the specified gpucg and gpucg_device.
+ * The caller must hold a reference to @gpucg obtained through gpucg_get().
+ *
+ * @gpucg: The gpu cgroup to uncharge the memory from.
+ * @device: The device to uncharge the memory from.
+ * @usage: size of memory to uncharge in bytes.
+ */
+void gpucg_uncharge(struct gpucg *gpucg, struct gpucg_device *device,
+		    u64 usage)
+{
+	u64 nr_pages;
+	struct gpucg_resource_pool *rp;
+
+	mutex_lock(&gpucg_mutex);
+	rp = find_cg_rpool_locked(gpucg, device);
+	/*
+	 * gpucg_mutex can be unlocked here, rp will stay valid until gpucg is
+	 * freed and there are active refs on gpucg.
+	 */
+	mutex_unlock(&gpucg_mutex);
+
+	if (unlikely(!rp)) {
+		pr_err("Resource pool not found, incorrect charge/uncharge ordering?\n");
+		return;
+	}
+
+	nr_pages = PAGE_ALIGN(usage) >> PAGE_SHIFT;
+	page_counter_uncharge(&rp->total, nr_pages);
+	css_put_many(&gpucg->css, nr_pages);
+}
+
+/**
+ * gpucg_register_device - Registers a device for memory accounting using the
+ * GPU cgroup controller.
+ *
+ * @device: The device to register for memory accounting.
+ * @name: Pointer to a string literal to denote the name of the device.
+ *
+ * Both @device and @name must remain valid.
+ */
+void gpucg_register_device(struct gpucg_device *device, const char *name)
+{
+	if (!device)
+		return;
+
+	INIT_LIST_HEAD(&device->dev_node);
+	INIT_LIST_HEAD(&device->rpools);
+
+	mutex_lock(&gpucg_mutex);
+	list_add_tail(&device->dev_node, &gpucg_devices);
+	mutex_unlock(&gpucg_mutex);
+
+	device->name = name;
+}
+
+static int gpucg_resource_show(struct seq_file *sf, void *v)
+{
+	struct gpucg_resource_pool *rpool;
+	struct gpucg *cg = css_to_gpucg(seq_css(sf));
+
+	mutex_lock(&gpucg_mutex);
+	list_for_each_entry(rpool, &cg->rpools, cg_node) {
+		seq_printf(sf, "%s %lu\n", rpool->device->name,
+			   page_counter_read(&rpool->total) * PAGE_SIZE);
+	}
+	mutex_unlock(&gpucg_mutex);
+
+	return 0;
+}
+
+struct cftype files[] = {
+	{
+		.name = "memory.current",
+		.seq_show = gpucg_resource_show,
+	},
+	{ } /* terminate */
+};
+
+struct cgroup_subsys gpu_cgrp_subsys = {
+	.css_alloc = gpucg_css_alloc,
+	.css_free = gpucg_css_free,
+	.early_init = false,
+	.legacy_cftypes = files,
+	.dfl_cftypes = files,
+};

From patchwork Sat Jan 15 01:06:01 2022
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Hridya Valsaraju
X-Patchwork-Id: 532383
Date: Fri, 14 Jan 2022 17:06:01 -0800
In-Reply-To: <20220115010622.3185921-1-hridya@google.com>
Message-Id: <20220115010622.3185921-4-hridya@google.com>
References: <20220115010622.3185921-1-hridya@google.com>
X-Mailer: git-send-email 2.34.1.703.g22d0c6ccf7-goog
Subject: [RFC 3/6] dmabuf: heaps: Use the GPU cgroup charge/uncharge APIs
From: Hridya Valsaraju
To: Maarten Lankhorst , Maxime Ripard , Thomas Zimmermann , David Airlie , Daniel Vetter , Jonathan Corbet , Greg Kroah-Hartman , Arve Hjønnevåg , Todd Kjos , Martijn Coenen , Joel Fernandes , Christian Brauner , Hridya Valsaraju , Suren Baghdasaryan , Sumit Semwal , Benjamin Gaignard , Liam Mark , Laura Abbott , Brian Starkey , John Stultz , Christian König , Tejun Heo , Zefan Li , Johannes Weiner , Dave Airlie , Matthew Auld , Jason Ekstrand , Jon Bloomfield , Matthew Brost , Li Li , Marco Ballesio , Wedson Almeida Filho , Hang Lu , Masahiro Yamada , Andrew Morton , Nathan Chancellor , Kees Cook , Nick Desaulniers , Miguel Ojeda , Vipin Sharma , Chris Down , Daniel Borkmann , Vlastimil Babka , Arnd Bergmann , dri-devel@lists.freedesktop.org, linux-doc@vger.kernel.org, linux-kernel@vger.kernel.org, linux-media@vger.kernel.org,
linaro-mm-sig@lists.linaro.org, cgroups@vger.kernel.org Cc: Kenny.Ho@amd.com, daniels@collabora.com, kaleshsingh@google.com, tjmercier@google.com Precedence: bulk List-ID: X-Mailing-List: linux-media@vger.kernel.org This patch uses the GPU cgroup charge/uncharge APIs to charge buffers allocated by the DMA-BUF system heap to the processes who allocated them. By doing so, it becomes possible to track who allocated/exported a DMA-BUF even after the allocating process drops all references to a buffer. Signed-off-by: Hridya Valsaraju --- drivers/dma-buf/dma-heap.c | 27 +++++++++++++++++++++++++++ drivers/dma-buf/heaps/system_heap.c | 25 +++++++++++++++++++++++++ include/linux/dma-heap.h | 11 +++++++++++ 3 files changed, 63 insertions(+) diff --git a/drivers/dma-buf/dma-heap.c b/drivers/dma-buf/dma-heap.c index 56bf5ad01ad5..6e74690f4b83 100644 --- a/drivers/dma-buf/dma-heap.c +++ b/drivers/dma-buf/dma-heap.c @@ -6,6 +6,7 @@ * Copyright (C) 2019 Linaro Ltd. */ +#include #include #include #include @@ -30,6 +31,7 @@ * @heap_devt heap device node * @list list head connecting to list of heaps * @heap_cdev heap char device + * @gpucg_dev gpu cg device for memory accounting * * Represents a heap of memory from which buffers can be made. */ @@ -40,6 +42,9 @@ struct dma_heap { dev_t heap_devt; struct list_head list; struct cdev heap_cdev; +#ifdef CONFIG_CGROUP_GPU + struct gpucg_device gpucg_dev; +#endif }; static LIST_HEAD(heap_list); @@ -214,6 +219,26 @@ const char *dma_heap_get_name(struct dma_heap *heap) return heap->name; } +#ifdef CONFIG_CGROUP_GPU +/** + * dma_heap_get_gpucg_dev() - get struct gpucg_device for the heap. + * @heap: DMA-Heap to get the gpucg_device struct for. + * + * Returns: + * The gpucg_device struct for the heap. NULL if the GPU cgroup controller is + * not enabled. 
+ */ +struct gpucg_device *dma_heap_get_gpucg_dev(struct dma_heap *heap) +{ + return &heap->gpucg_dev; +} +#else +struct gpucg_device *dma_heap_get_gpucg_dev(struct dma_heap *heap) +{ + return NULL; +} +#endif + struct dma_heap *dma_heap_add(const struct dma_heap_export_info *exp_info) { struct dma_heap *heap, *h, *err_ret; @@ -286,6 +311,8 @@ struct dma_heap *dma_heap_add(const struct dma_heap_export_info *exp_info) list_add(&heap->list, &heap_list); mutex_unlock(&heap_list_lock); + gpucg_register_device(dma_heap_get_gpucg_dev(heap), exp_info->name); + return heap; err2: diff --git a/drivers/dma-buf/heaps/system_heap.c b/drivers/dma-buf/heaps/system_heap.c index ab7fd896d2c4..adfdc8c576f2 100644 --- a/drivers/dma-buf/heaps/system_heap.c +++ b/drivers/dma-buf/heaps/system_heap.c @@ -31,6 +31,7 @@ struct system_heap_buffer { struct sg_table sg_table; int vmap_cnt; void *vaddr; + struct gpucg *gpucg; }; struct dma_heap_attachment { @@ -296,6 +297,13 @@ static void system_heap_dma_buf_release(struct dma_buf *dmabuf) __free_pages(page, compound_order(page)); } sg_free_table(table); + + gpucg_uncharge(buffer->gpucg, + dma_heap_get_gpucg_dev(buffer->heap), + buffer->len); + + gpucg_put(buffer->gpucg); + kfree(buffer); } @@ -356,6 +364,16 @@ static struct dma_buf *system_heap_allocate(struct dma_heap *heap, mutex_init(&buffer->lock); buffer->heap = heap; buffer->len = len; + buffer->gpucg = gpucg_get(current); + + ret = gpucg_try_charge(buffer->gpucg, + dma_heap_get_gpucg_dev(buffer->heap), + len); + if (ret) { + gpucg_put(buffer->gpucg); + kfree(buffer); + return ERR_PTR(ret); + } INIT_LIST_HEAD(&pages); i = 0; @@ -413,6 +431,13 @@ static struct dma_buf *system_heap_allocate(struct dma_heap *heap, free_buffer: list_for_each_entry_safe(page, tmp_page, &pages, lru) __free_pages(page, compound_order(page)); + + gpucg_uncharge(buffer->gpucg, + dma_heap_get_gpucg_dev(buffer->heap), + buffer->len); + + gpucg_put(buffer->gpucg); + kfree(buffer); return ERR_PTR(ret); diff --git 
a/include/linux/dma-heap.h b/include/linux/dma-heap.h index 0c05561cad6e..e447a61d054e 100644 --- a/include/linux/dma-heap.h +++ b/include/linux/dma-heap.h @@ -10,6 +10,7 @@ #define _DMA_HEAPS_H #include +#include #include struct dma_heap; @@ -59,6 +60,16 @@ void *dma_heap_get_drvdata(struct dma_heap *heap); */ const char *dma_heap_get_name(struct dma_heap *heap); +/** + * dma_heap_get_gpucg_dev() - get a pointer to the struct gpucg_device for the + * heap. + * @heap: DMA-Heap to retrieve gpucg_device for. + * + * Returns: + * The gpucg_device struct for the heap. + */ +struct gpucg_device *dma_heap_get_gpucg_dev(struct dma_heap *heap); + /** * dma_heap_add - adds a heap to dmabuf heaps * @exp_info: information needed to register this heap From patchwork Sat Jan 15 01:06:02 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Hridya Valsaraju X-Patchwork-Id: 532770 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 0E598C433F5 for ; Sat, 15 Jan 2022 01:08:28 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S231824AbiAOBI1 (ORCPT ); Fri, 14 Jan 2022 20:08:27 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:55162 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S231814AbiAOBI0 (ORCPT ); Fri, 14 Jan 2022 20:08:26 -0500 Received: from mail-yb1-xb49.google.com (mail-yb1-xb49.google.com [IPv6:2607:f8b0:4864:20::b49]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id B062AC061574 for ; Fri, 14 Jan 2022 17:08:26 -0800 (PST) Received: by mail-yb1-xb49.google.com with SMTP id b186-20020a25cbc3000000b00611b032ccadso17033618ybg.16 for ; Fri, 14 Jan 2022 17:08:26 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; 
s=20210112; h=date:in-reply-to:message-id:mime-version:references:subject:from:to :cc; bh=sWrCFDJqqXXVSbGVP8s44HP9fGoTeRrZv/20U3l5a9g=; b=HsuhTKzXI1C0Bg5qlXVEUKgy0Y59ylb/5klYriv4XRxp76HIR6zzyRrTkNqoZeAa/+ b49JcXDqX9e/xJ/SUE/3bVyoaZ0lg1xpEMerwoqfVEOSdzj9xLfLi1s1GhQ+IDW9v6/g 5GED0tKyYGSjJ2/1/kebwAkX3iCP3oywer4GN/EkL+RsKPE2U2Hk2kgIljMzFb8dnS5Y FqX9cDLP+EMxfP4f35RPYTdaQONdRKxIF2yg0spELgieeV12+CV7Uxpttxe+nF4d6tb3 BSX5dX+1OeDTvZq4IqC6x7NYZPz51nGacX/FtiiRJ2MyzFoQX3zRra2Gfbo1QjE20yf4 5vjw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=x-gm-message-state:date:in-reply-to:message-id:mime-version :references:subject:from:to:cc; bh=sWrCFDJqqXXVSbGVP8s44HP9fGoTeRrZv/20U3l5a9g=; b=CUIAcc6PxsWBNQu9ciNcXhNWLKfJbrHq0rawbGU9d+NnX80eAQpRxAINQTiHZiSfpL +vvIuJw5Ywwhc5MjL1UosgqrUSh4FGhVG2lKBC1WP23Pez0793uJYPEgvU48ggAKSPBx oCnhzWtHKAjKq1QqOujqBIQBruh+lpL3J09RX4XwaJTOOyt6SFpZYXP32vKymT9C+FcJ 4MafXbyGP/t2D7oxwJxuWeybzErH4BdsY0jZwZzooUui9NNCim/xnABqNaMocee0Gyce T2Af/ruo+L4t+jc6Zokao3Q0xAC8lXUjsMeJy5F2QffyCE3ko0kWy6VJzw+dpT7KhZbW LBFA== X-Gm-Message-State: AOAM532UIM/zWD5eouGmc0WwpCQ1U2WwSzUro07d+aEQsYwEHnw7JfOW nbh0TOLHN82mh6J+s90ImIg9UOcrV2Y= X-Google-Smtp-Source: ABdhPJxrB6uVRtSUJNiCU0mlnPcpg+H8hNUC8zW0t5DkVGpaOEH4KKhobnZfY7Jxl05uHClUdyHKGaFNjYw= X-Received: from hridya.mtv.corp.google.com ([2620:15c:211:200:5860:362a:3112:9d85]) (user=hridya job=sendgmr) by 2002:a05:6902:723:: with SMTP id l3mr17660046ybt.378.1642208905843; Fri, 14 Jan 2022 17:08:25 -0800 (PST) Date: Fri, 14 Jan 2022 17:06:02 -0800 In-Reply-To: <20220115010622.3185921-1-hridya@google.com> Message-Id: <20220115010622.3185921-5-hridya@google.com> Mime-Version: 1.0 References: <20220115010622.3185921-1-hridya@google.com> X-Mailer: git-send-email 2.34.1.703.g22d0c6ccf7-goog Subject: [RFC 4/6] dma-buf: Add DMA-BUF exporter op to charge a DMA-BUF to a cgroup. 
From: Hridya Valsaraju To: Maarten Lankhorst , Maxime Ripard , Thomas Zimmermann , David Airlie , Daniel Vetter , Jonathan Corbet , Greg Kroah-Hartman , " =?utf-8?q?Arve_Hj=C3=B8n?= =?utf-8?q?nev=C3=A5g?= " , Todd Kjos , Martijn Coenen , Joel Fernandes , Christian Brauner , Hridya Valsaraju , Suren Baghdasaryan , Sumit Semwal , Benjamin Gaignard , Liam Mark , Laura Abbott , Brian Starkey , John Stultz , " =?utf-8?q?Christian_K=C3=B6nig?= " , Tejun Heo , Zefan Li , Johannes Weiner , Dave Airlie , Jason Ekstrand , Matthew Auld , Matthew Brost , Li Li , Marco Ballesio , Miguel Ojeda , Hang Lu , Wedson Almeida Filho , Masahiro Yamada , Andrew Morton , Nathan Chancellor , Kees Cook , Nick Desaulniers , Chris Down , Vipin Sharma , Daniel Borkmann , Vlastimil Babka , Arnd Bergmann , dri-devel@lists.freedesktop.org, linux-doc@vger.kernel.org, linux-kernel@vger.kernel.org, linux-media@vger.kernel.org, linaro-mm-sig@lists.linaro.org, cgroups@vger.kernel.org Cc: Kenny.Ho@amd.com, daniels@collabora.com, kaleshsingh@google.com, tjmercier@google.com Precedence: bulk List-ID: X-Mailing-List: linux-media@vger.kernel.org The optional exporter op provides a way for processes to transfer charge of a buffer to a different process. This is essential for the cases where a central allocator process does allocations for various subsystems, hands over the fd to the client who requested the memory and drops all references to the allocated memory. 
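The charge-transfer idea described above can be illustrated with a small userspace model. This is only a sketch under stated assumptions, not the kernel implementation: every toy_* name is hypothetical, a cgroup is reduced to a usage counter with a limit, and the check that the caller belongs to the currently charged cgroup is left out. It shows the op being optional (the caller tests for NULL) and the new cgroup being charged before the old one is uncharged, so a failed transfer leaves the existing charge in place.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for a cgroup resource pool: usage and limit in bytes. */
struct toy_cgroup {
	size_t usage;
	size_t limit;
};

struct toy_buf;

/* Hypothetical stand-in for dma_buf_ops; charge_to_cgroup may be NULL. */
struct toy_buf_ops {
	int (*charge_to_cgroup)(struct toy_buf *buf, struct toy_cgroup *target);
};

struct toy_buf {
	size_t len;
	struct toy_cgroup *owner;	/* cgroup the buffer is currently charged to */
	const struct toy_buf_ops *ops;
};

static int toy_try_charge(struct toy_cgroup *cg, size_t len)
{
	assert(cg != NULL && cg->usage <= cg->limit);
	if (len > cg->limit - cg->usage)
		return -ENOMEM;
	cg->usage += len;
	return 0;
}

static void toy_uncharge(struct toy_cgroup *cg, size_t len)
{
	assert(cg != NULL && cg->usage >= len);
	cg->usage -= len;
}

/* Charge the target first so that a failure leaves the old charge untouched. */
static int toy_charge_to_cgroup(struct toy_buf *buf, struct toy_cgroup *target)
{
	int ret = toy_try_charge(target, buf->len);

	if (ret)
		return ret;
	toy_uncharge(buf->owner, buf->len);
	buf->owner = target;
	return 0;
}

static const struct toy_buf_ops toy_ops = {
	.charge_to_cgroup = toy_charge_to_cgroup,
};

/* Caller side: the op is optional, so check for NULL before calling it. */
static int toy_transfer(struct toy_buf *buf, struct toy_cgroup *target)
{
	if (!buf->ops || !buf->ops->charge_to_cgroup)
		return -EOPNOTSUPP;
	return buf->ops->charge_to_cgroup(buf, target);
}
```

In this model, an allocator that hands a 4096-byte buffer to a client would call toy_transfer(&buf, &client); the allocator's usage then drops by 4096 and the client's rises by the same amount, or neither changes if the client is over its limit.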
Signed-off-by: Hridya Valsaraju --- include/linux/dma-buf.h | 18 ++++++++++++++++++ 1 file changed, 18 insertions(+) diff --git a/include/linux/dma-buf.h b/include/linux/dma-buf.h index 7ab50076e7a6..d5e52f81cc6f 100644 --- a/include/linux/dma-buf.h +++ b/include/linux/dma-buf.h @@ -13,6 +13,7 @@ #ifndef __DMA_BUF_H__ #define __DMA_BUF_H__ +#include #include #include #include @@ -285,6 +286,23 @@ struct dma_buf_ops { int (*vmap)(struct dma_buf *dmabuf, struct dma_buf_map *map); void (*vunmap)(struct dma_buf *dmabuf, struct dma_buf_map *map); + + /** + * @charge_to_cgroup: + * + * This is called by an exporter to charge a buffer to the specified + * cgroup. The caller must hold a reference to @gpucg obtained via + * gpucg_get(). The DMA-BUF will be uncharged from the cgroup it is + * currently charged to before being charged to @gpucg. The caller must + * belong to the cgroup the buffer is currently charged to. + * + * This callback is optional. + * + * Returns: + * + * 0 on success or negative error code on failure. 
+ */ + int (*charge_to_cgroup)(struct dma_buf *dmabuf, struct gpucg *gpucg); }; /** From patchwork Sat Jan 15 01:06:03 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Hridya Valsaraju X-Patchwork-Id: 532382 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 54498C433EF for ; Sat, 15 Jan 2022 01:08:52 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S231852AbiAOBIv (ORCPT ); Fri, 14 Jan 2022 20:08:51 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:55260 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S231848AbiAOBIr (ORCPT ); Fri, 14 Jan 2022 20:08:47 -0500 Received: from mail-yb1-xb49.google.com (mail-yb1-xb49.google.com [IPv6:2607:f8b0:4864:20::b49]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id C2EEEC06173F for ; Fri, 14 Jan 2022 17:08:47 -0800 (PST) Received: by mail-yb1-xb49.google.com with SMTP id s89-20020a25aa62000000b00611afc92630so17453515ybi.17 for ; Fri, 14 Jan 2022 17:08:47 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20210112; h=date:in-reply-to:message-id:mime-version:references:subject:from:to :cc; bh=ti5KO9A8vk9sjDtc4QE21jT0NFR0GaV/WLBKcRARKgI=; b=S7eWa/B0tO4XbPv0OdCk5n8qWLZPt/5gRI1DTJwTi5GhQHBW/2q4S1Goa6HVJ53kOV C4SDzmhRcejlHxG15TPB2RCHg/RE9n0Klhhmd91JlvLbggw+Zomcqm6xLZmX5+VZmd5v 6z2PEUpw/n9msL7glrfIs7RCavPk8+6AWV+iux/DG1QvtvTlcz7OoEHlREsv93KpGfY7 7PequsOl0jKjwIIwYcy3wUNuWKC7exUsRjvXFMuBU8C8z0cNytD8ojAHfXP8uaBhODyO ecTIbTrob/3fDmTXsmgbfls5PaDFeVT7ZAhbWGu7Ln8w+tk/R5tS0MQPM33+aFxrGILz 73Tg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=x-gm-message-state:date:in-reply-to:message-id:mime-version 
:references:subject:from:to:cc; bh=ti5KO9A8vk9sjDtc4QE21jT0NFR0GaV/WLBKcRARKgI=; b=bAtyh2yXdXe2jZ+3e9d4ykDxtY5z9Y8vqrWPKHdH1M7180jvw7k+2AR5sZ2iS4gPfs eESeEmS8q7JE4AjSB2TQaLitPi/MH0r167BjQXD4fFSQwycd5FVOLPPK94po1VDmnbM4 942ACe3yzOmN0v3oS7W0MFajfjoXrdSjnxEaHlwhRm7mizJDngeCA/0M0aeuz2aZKzBe y6sK2vV9dvbDSQiej0V2wcOB5v+jImhpMFTo78AOJRnddMOFHGB4FyZBnSb908Oi9sCv H6ZPMbRB7NGu6uXNa5fI4yQ0JMY/8RBE85dwxi3H8adqkYauG+0iJylwqosXhYoHbaqK /WRA== X-Gm-Message-State: AOAM533/OZTAI28krNkDnSva/q0tugeZZ/5CfPa3iHX1Grnb/UbPagV5 ZOOQ2WEY/qUhhYqFss2V+TjM2Wmflew= X-Google-Smtp-Source: ABdhPJyx2A41ANkEXxjx9b22jDn5GMaVfTqhW/nKH2++Oa46UOTgyGXsF/oz0GqIRyot11Vvd63nfkZv34o= X-Received: from hridya.mtv.corp.google.com ([2620:15c:211:200:5860:362a:3112:9d85]) (user=hridya job=sendgmr) by 2002:a05:6902:286:: with SMTP id v6mr15557991ybh.569.1642208926921; Fri, 14 Jan 2022 17:08:46 -0800 (PST) Date: Fri, 14 Jan 2022 17:06:03 -0800 In-Reply-To: <20220115010622.3185921-1-hridya@google.com> Message-Id: <20220115010622.3185921-6-hridya@google.com> Mime-Version: 1.0 References: <20220115010622.3185921-1-hridya@google.com> X-Mailer: git-send-email 2.34.1.703.g22d0c6ccf7-goog Subject: [RFC 5/6] dmabuf: system_heap: implement dma-buf op for GPU cgroup charge transfer From: Hridya Valsaraju To: Maarten Lankhorst , Maxime Ripard , Thomas Zimmermann , David Airlie , Daniel Vetter , Jonathan Corbet , Greg Kroah-Hartman , " =?utf-8?q?Arve_Hj=C3=B8n?= =?utf-8?q?nev=C3=A5g?= " , Todd Kjos , Martijn Coenen , Joel Fernandes , Christian Brauner , Hridya Valsaraju , Suren Baghdasaryan , Sumit Semwal , Benjamin Gaignard , Liam Mark , Laura Abbott , Brian Starkey , John Stultz , " =?utf-8?q?Christian_K=C3=B6nig?= " , Tejun Heo , Zefan Li , Johannes Weiner , Dave Airlie , Kenneth Graunke , Rodrigo Vivi , Matthew Brost , Matthew Auld , Li Li , Marco Ballesio , Miguel Ojeda , Hang Lu , Wedson Almeida Filho , Masahiro Yamada , Andrew Morton , Nathan Chancellor , Kees Cook , Nick Desaulniers , Chris Down , Vipin Sharma , 
Daniel Borkmann , Vlastimil Babka , Arnd Bergmann , dri-devel@lists.freedesktop.org, linux-doc@vger.kernel.org, linux-kernel@vger.kernel.org, linux-media@vger.kernel.org, linaro-mm-sig@lists.linaro.org, cgroups@vger.kernel.org Cc: Kenny.Ho@amd.com, daniels@collabora.com, kaleshsingh@google.com, tjmercier@google.com Precedence: bulk List-ID: X-Mailing-List: linux-media@vger.kernel.org The DMA-BUF op can be invoked when a process that allocated a buffer relinquishes its ownership and passes it over to another process. Signed-off-by: Hridya Valsaraju --- drivers/dma-buf/heaps/system_heap.c | 43 +++++++++++++++++++++++++++++ 1 file changed, 43 insertions(+) diff --git a/drivers/dma-buf/heaps/system_heap.c b/drivers/dma-buf/heaps/system_heap.c index adfdc8c576f2..70f5b98f1157 100644 --- a/drivers/dma-buf/heaps/system_heap.c +++ b/drivers/dma-buf/heaps/system_heap.c @@ -307,6 +307,48 @@ static void system_heap_dma_buf_release(struct dma_buf *dmabuf) kfree(buffer); } +#ifdef CONFIG_CGROUP_GPU +static int system_heap_dma_buf_charge(struct dma_buf *dmabuf, struct gpucg *gpucg) +{ + struct gpucg *current_gpucg; + struct gpucg_device *gpucg_dev; + struct system_heap_buffer *buffer = dmabuf->priv; + size_t len = buffer->len; + int ret = 0; + + /* + * Check that the process requesting the transfer belongs to the + * cgroup the buffer is currently charged to. + */ + current_gpucg = gpucg_get(current); + if (current_gpucg != buffer->gpucg) + ret = -EPERM; + + gpucg_put(current_gpucg); + if (ret) + return ret; + + gpucg_dev = dma_heap_get_gpucg_dev(buffer->heap); + + ret = gpucg_try_charge(gpucg, gpucg_dev, len); + if (ret) + return ret; + + /* uncharge the buffer from the cgroup it is currently charged to.
*/ + gpucg_uncharge(buffer->gpucg, gpucg_dev, buffer->len); + gpucg_put(buffer->gpucg); + + buffer->gpucg = gpucg; + + return 0; +} +#else +static int system_heap_dma_buf_charge(struct dma_buf *dmabuf, struct gpucg *gpucg) +{ + return 0; +} +#endif + static const struct dma_buf_ops system_heap_buf_ops = { .attach = system_heap_attach, .detach = system_heap_detach, @@ -318,6 +360,7 @@ static const struct dma_buf_ops system_heap_buf_ops = { .vmap = system_heap_vmap, .vunmap = system_heap_vunmap, .release = system_heap_dma_buf_release, + .charge_to_cgroup = system_heap_dma_buf_charge, }; static struct page *alloc_largest_available(unsigned long size, From patchwork Sat Jan 15 01:06:04 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Hridya Valsaraju X-Patchwork-Id: 532769 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id D4612C4332F for ; Sat, 15 Jan 2022 01:09:10 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S231825AbiAOBJK (ORCPT ); Fri, 14 Jan 2022 20:09:10 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:55368 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S229835AbiAOBJJ (ORCPT ); Fri, 14 Jan 2022 20:09:09 -0500 Received: from mail-yb1-xb4a.google.com (mail-yb1-xb4a.google.com [IPv6:2607:f8b0:4864:20::b4a]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id E019AC061574 for ; Fri, 14 Jan 2022 17:09:08 -0800 (PST) Received: by mail-yb1-xb4a.google.com with SMTP id y13-20020a25ad0d000000b00611e6e08abbso6016720ybi.10 for ; Fri, 14 Jan 2022 17:09:08 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20210112; h=date:in-reply-to:message-id:mime-version:references:subject:from:to :cc; 
bh=kkjQvBHUpUsPJOcuRnb4FbpqpDLYZFNC/PGNH3Zg5WU=; b=B7HrMalKbOrQ0jLo+t2k5LKQyXsYTrKPSnoCQbJTFt3xXAdNgf+Y3bG0OwuuV3conm 9YDTffSi81yddsv2BoH3Yepu0Qf4t9Yufib51okExGf5c00BzhktLBE/1fgctRjU17A0 FE5Q2yG/2zZsbcQO5KIgRMuIdtuJb3ntKNdMvUrJDIlln1wNgH3uN1Xs3Tq4HUuo2nQw vX/4a2fLq5zN8JuJ/ROXboM4iqrNUfkxyQzzeG4K6rLmd0dL8lZlsn2U8cU340A6r7rt CrzHH2yHQihIsNpJ9nYxK0RZUUVox7TiNHvKS/20HXPEnTeOhUMt/SoHjy/YrqoHfDGB 975A== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=x-gm-message-state:date:in-reply-to:message-id:mime-version :references:subject:from:to:cc; bh=kkjQvBHUpUsPJOcuRnb4FbpqpDLYZFNC/PGNH3Zg5WU=; b=NXSQuw05G5t0/9M9rll4VIJupIF4JZd8KnjxTOVvSwvt4dbfH7qZNkunU+N/f83OdZ 6OSEyYjdkUcVK+7NhNDAzZt1iwrnNATAfhx6oRGfIOFlz/3Svqmaetk+1cL2EyMqYRES G3OztGpUzxDt9oeu8FTYLPBgj47qKD/OcFl5wvnPKQ2ECWuRCKYrr+mbkeGqNNaq9e0L i/PI+wiRJN5sCKNFQNe8229OJVXcz6ZMWmy/8622/eJnDPbRxS3BFRYZcOxP/J70l8xM 0lziEcZPywC3w+yGwN9Q/Iu7gbWRu4W5/lCtADNHTWpbR6S/hfbGtIDE01Lym0ZT7unP cyVQ== X-Gm-Message-State: AOAM531No6WH5S/43zWX1+gpfjhJjqjoVC6eir4V4Px2gO2EXyrj91kN pRLjGhYHGc1XyTqoeCuvIewyGBbuIC8= X-Google-Smtp-Source: ABdhPJz9CR2IoAwevLCGCWJ5VPMm/mBeIOkcx7f8CHh3K/XQY53M+B3vhZstfGoP5vWHDyFMJE3GX6t7YyU= X-Received: from hridya.mtv.corp.google.com ([2620:15c:211:200:5860:362a:3112:9d85]) (user=hridya job=sendgmr) by 2002:a25:7b44:: with SMTP id w65mr15284933ybc.59.1642208948043; Fri, 14 Jan 2022 17:09:08 -0800 (PST) Date: Fri, 14 Jan 2022 17:06:04 -0800 In-Reply-To: <20220115010622.3185921-1-hridya@google.com> Message-Id: <20220115010622.3185921-7-hridya@google.com> Mime-Version: 1.0 References: <20220115010622.3185921-1-hridya@google.com> X-Mailer: git-send-email 2.34.1.703.g22d0c6ccf7-goog Subject: [RFC 6/6] android: binder: Add a buffer flag to relinquish ownership of fds From: Hridya Valsaraju To: David Airlie , Daniel Vetter , Maarten Lankhorst , Maxime Ripard , Thomas Zimmermann , Jonathan Corbet , Greg Kroah-Hartman , " =?utf-8?q?Arve_Hj=C3=B8n?= =?utf-8?q?nev=C3=A5g?= " , Todd 
Kjos , Martijn Coenen , Joel Fernandes , Christian Brauner , Hridya Valsaraju , Suren Baghdasaryan , Sumit Semwal , Benjamin Gaignard , Liam Mark , Laura Abbott , Brian Starkey , John Stultz , " =?utf-8?q?Christian_K=C3=B6nig?= " , Tejun Heo , Zefan Li , Johannes Weiner , Dave Airlie , Kenneth Graunke , Jason Ekstrand , Matthew Auld , Matthew Brost , Li Li , Marco Ballesio , Hang Lu , Wedson Almeida Filho , Masahiro Yamada , Nathan Chancellor , Andrew Morton , Kees Cook , Nick Desaulniers , Miguel Ojeda , Chris Down , Vipin Sharma , Daniel Borkmann , Vlastimil Babka , Arnd Bergmann , dri-devel@lists.freedesktop.org, linux-doc@vger.kernel.org, linux-kernel@vger.kernel.org, linux-media@vger.kernel.org, linaro-mm-sig@lists.linaro.org, cgroups@vger.kernel.org Cc: Kenny.Ho@amd.com, daniels@collabora.com, kaleshsingh@google.com, tjmercier@google.com Precedence: bulk List-ID: X-Mailing-List: linux-media@vger.kernel.org This patch introduces a buffer flag BINDER_BUFFER_FLAG_SENDER_NO_NEED that a process sending an fd array to another process over binder IPC can set to relinquish ownership of the fds being sent for memory accounting purposes. If the flag is found to be set during the fd array translation and the fd is for a DMA-BUF, the buffer is uncharged from the sender's cgroup and charged to the receiving process's cgroup instead. It is up to the sending process to ensure that it closes the fds regardless of whether the transfer failed or succeeded. Most graphics shared memory allocations in Android are done by the graphics allocator HAL process. On requests from clients, the HAL process allocates memory and sends the fds to the clients over binder IPC. The graphics allocator HAL will not retain any references to the buffers. When the HAL sets the BINDER_BUFFER_FLAG_SENDER_NO_NEED for fd arrays holding DMA-BUF fds, the GPU cgroup controller will be able to correctly charge the buffers to the client processes instead of the graphics allocator HAL.
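The per-fd decision described above can be modelled in a few lines of userspace C. This is a hedged sketch rather than the binder code: the two flag values are the ones this patch adds to the uapi header, while struct toy_fd and toy_translate_fd_array() are hypothetical names invented for illustration. The model only captures which fds have their charge moved: every fd is still passed along, and accounting changes only when the parent buffer carries the flag and the fd refers to a DMA-BUF.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Flag values as they appear in this patch's uapi hunk. */
enum {
	BINDER_BUFFER_FLAG_HAS_PARENT = 0x01,
	BINDER_BUFFER_FLAG_SENDER_NO_NEED = 0x02,
};

/* Hypothetical per-fd record; the kernel would look this up from the fd. */
struct toy_fd {
	bool is_dmabuf;
	int charged_to;	/* 0 = sender's cgroup, 1 = receiver's cgroup */
};

/* Returns how many fds had their charge moved to the receiver. */
static size_t toy_translate_fd_array(unsigned int parent_flags,
				     struct toy_fd *fds, size_t num_fds)
{
	bool transfer = (parent_flags & BINDER_BUFFER_FLAG_SENDER_NO_NEED) != 0;
	size_t moved = 0;

	assert(fds != NULL || num_fds == 0);

	for (size_t i = 0; i < num_fds; i++) {
		/* Every fd is still translated; only the accounting is optional. */
		if (!transfer || !fds[i].is_dmabuf)
			continue;
		fds[i].charged_to = 1;
		moved++;
	}
	return moved;
}
```

With the flag clear, the function moves nothing; with it set, only the DMA-BUF entries switch to the receiver, which mirrors the "skip non-DMA-BUF fds" behaviour the message describes.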
Signed-off-by: Hridya Valsaraju --- drivers/android/binder.c | 32 +++++++++++++++++++++++++++++ include/uapi/linux/android/binder.h | 1 + 2 files changed, 33 insertions(+) diff --git a/drivers/android/binder.c b/drivers/android/binder.c index 5497797ab258..83082fd1ab6a 100644 --- a/drivers/android/binder.c +++ b/drivers/android/binder.c @@ -42,6 +42,7 @@ #define pr_fmt(fmt) KBUILD_MODNAME ": " fmt +#include #include #include #include @@ -2482,8 +2483,11 @@ static int binder_translate_fd_array(struct list_head *pf_head, { binder_size_t fdi, fd_buf_size; binder_size_t fda_offset; + bool transfer_gpu_charge = false; const void __user *sender_ufda_base; struct binder_proc *proc = thread->proc; + struct binder_proc *target_proc = t->to_proc; + int ret; fd_buf_size = sizeof(u32) * fda->num_fds; @@ -2520,8 +2524,15 @@ static int binder_translate_fd_array(struct list_head *pf_head, if (ret) return ret; + if (IS_ENABLED(CONFIG_CGROUP_GPU) && + parent->flags & BINDER_BUFFER_FLAG_SENDER_NO_NEED) + transfer_gpu_charge = true; + for (fdi = 0; fdi < fda->num_fds; fdi++) { u32 fd; + struct dma_buf *dmabuf; + struct gpucg *gpucg; + binder_size_t offset = fda_offset + fdi * sizeof(fd); binder_size_t sender_uoffset = fdi * sizeof(fd); @@ -2531,6 +2542,27 @@ static int binder_translate_fd_array(struct list_head *pf_head, in_reply_to); if (ret) return ret > 0 ? 
-EINVAL : ret; + + if (!transfer_gpu_charge) + continue; + + dmabuf = dma_buf_get(fd); + if (IS_ERR(dmabuf)) + continue; + + if (dmabuf->ops->charge_to_cgroup) { + gpucg = gpucg_get(target_proc->tsk); + ret = dmabuf->ops->charge_to_cgroup(dmabuf, gpucg); + if (ret) { + pr_warn("%d:%d Unable to transfer DMA-BUF fd charge to %d\n", + proc->pid, thread->pid, target_proc->pid); + gpucg_put(gpucg); + } + } else { + pr_warn("%d:%d DMA-BUF exporter %s is not configured correctly for GPU cgroup memory accounting\n", + proc->pid, thread->pid, dmabuf->exp_name); + } + dma_buf_put(dmabuf); } return 0; } diff --git a/include/uapi/linux/android/binder.h b/include/uapi/linux/android/binder.h index ad619623571e..c85f0014c341 100644 --- a/include/uapi/linux/android/binder.h +++ b/include/uapi/linux/android/binder.h @@ -137,6 +137,7 @@ struct binder_buffer_object { enum { BINDER_BUFFER_FLAG_HAS_PARENT = 0x01, + BINDER_BUFFER_FLAG_SENDER_NO_NEED = 0x02, }; /* struct binder_fd_array_object - object describing an array of fds in a buffer