From patchwork Wed Feb 23 15:20:51 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: SeongJae Park X-Patchwork-Id: 546045 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 798D4C4167B for ; Wed, 23 Feb 2022 15:21:46 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S241951AbiBWPWL (ORCPT ); Wed, 23 Feb 2022 10:22:11 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:58830 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S241923AbiBWPWJ (ORCPT ); Wed, 23 Feb 2022 10:22:09 -0500 Received: from ams.source.kernel.org (ams.source.kernel.org [145.40.68.75]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 49C4713CDE; Wed, 23 Feb 2022 07:21:36 -0800 (PST) Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ams.source.kernel.org (Postfix) with ESMTPS id 9DDF1B82083; Wed, 23 Feb 2022 15:21:35 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id 64C8CC340F1; Wed, 23 Feb 2022 15:21:33 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1645629694; bh=TNqodL7BjlulfrJ4TOMeVd1ntmcBKVGFqoceJimpmu4=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=fksZkN0baTh347CPhGL3izbFzUwiWiEyngm9VmeIKDe5txxf0ZpENeR0F5bxLmFad Pa5g8zOP1kc4WjR/CZKRbO3DCBWi1Q6ypiIHCvU36DjYJA6l3GGPnIbJztmtA06ZP0 92aMMI/WeozVE25hdMoiavHIVihaV5rTZHIDVTIM0EyZO9z3plr8wFL+cy8/bYYMDe jEuwnC01uhigCJCt1DW+c8IDwdDUrF0/35uYxhC1fr0ocmxRLGia+a5S5QRscKVhiE baT4spHtfBxBTobgdmSznTV7F6bONg9vwX/lR5tmc5CKuilcYuD1qroupXTdeoEvZf GvctLo7Zp2YNw== From: SeongJae Park To: akpm@linux-foundation.org Cc: corbet@lwn.net, skhan@linuxfoundation.org, rientjes@google.com, xhao@linux.alibaba.com, linux-damon@amazon.com, linux-mm@kvack.org, linux-doc@vger.kernel.org, linux-kselftest@vger.kernel.org, linux-kernel@vger.kernel.org, SeongJae Park Subject: [PATCH 12/12] Docs/admin-guide/mm/damon/usage: Document DAMON sysfs interface Date: Wed, 23 Feb 2022 15:20:51 +0000 Message-Id: <20220223152051.22936-13-sj@kernel.org> X-Mailer: git-send-email 2.17.1 In-Reply-To: <20220223152051.22936-1-sj@kernel.org> References: <20220223152051.22936-1-sj@kernel.org> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-kselftest@vger.kernel.org This commit adds detailed usage of DAMON sysfs interface in the admin-guide document for DAMON. Signed-off-by: SeongJae Park --- Documentation/admin-guide/mm/damon/usage.rst | 349 ++++++++++++++++++- 1 file changed, 343 insertions(+), 6 deletions(-) diff --git a/Documentation/admin-guide/mm/damon/usage.rst b/Documentation/admin-guide/mm/damon/usage.rst index b6ec650873b2..51cd3d531404 100644 --- a/Documentation/admin-guide/mm/damon/usage.rst +++ b/Documentation/admin-guide/mm/damon/usage.rst @@ -4,7 +4,7 @@ Detailed Usages =============== -DAMON provides below three interfaces for different users. +DAMON provides below interfaces for different users. - *DAMON user space tool.* `This `_ is for privileged people such as @@ -14,17 +14,21 @@ DAMON provides below three interfaces for different users. virtual and physical address spaces monitoring. For more detail, please refer to its `usage document `_. -- *debugfs interface.* - :ref:`This ` is for privileged user space programmers who +- *sysfs interface.* + :ref:`This ` is for privileged user space programmers who want more optimized use of DAMON. Using this, users can use DAMON’s major - features by reading from and writing to special debugfs files. Therefore, - you can write and use your personalized DAMON debugfs wrapper programs that - reads/writes the debugfs files instead of you. The `DAMON user space tool + features by reading from and writing to special sysfs files. Therefore, + you can write and use your personalized DAMON sysfs wrapper programs that + reads/writes the sysfs files instead of you. The `DAMON user space tool `_ is one example of such programs. It supports both virtual and physical address spaces monitoring. Note that this interface provides only simple :ref:`statistics ` for the monitoring results. For detailed monitoring results, DAMON provides a :ref:`tracepoint `. +- *debugfs interface.* + :ref:`This ` is almost identical to :ref:`sysfs interface + `. This will be removed after next LTS kernel is released, + so users should move to the :ref:`sysfs interface `. - *Kernel Space Programming Interface.* :doc:`This ` is for kernel space programmers. Using this, users can utilize every feature of DAMON most flexibly and efficiently by @@ -32,6 +36,339 @@ DAMON provides below three interfaces for different users. DAMON for various address spaces. For detail, please refer to the interface :doc:`document `. +.. _sysfs_interface: + +sysfs Interface +=============== + +DAMON sysfs interface is built when ``CONFIG_DAMON_SYSFS`` is defined. It +creates multiple directories and files under its sysfs directory, +``/kernel/mm/damon/``. You can control DAMON by writing to and reading +from the files under the directory. + +For a short example, users can monitor the virtual address space of a given +workload as below. :: + + # cd /sys/kernel/mm/damon/admin/ + # echo 1 > kdamonds/nr && echo 1 > kdamonds/0/contexts/nr + # echo vaddr > kdamonds/0/contexts/0/operations + # echo 1 > kdamonds/0/contexts/0/targets/nr + # echo $(pidof ) > kdamonds/0/contexts/0/targets/0/pid + # echo on > kdamonds/0/state + +Files Hierarchy +--------------- + +The files hierarchy of DAMON sysfs interface is shown below. In the below +figure, parents-children relations are represented with indentations, each +directory is having ``/`` suffix, and files in each directory are separated by +comma (","). :: + + /sys/kernel/mm/damon/admin + │ kdamonds/nr + │ │ 0/state,pid + │ │ │ contexts/nr + │ │ │ │ 0/operations + │ │ │ │ │ monitoring_attrs/ + │ │ │ │ │ │ intervals/sample_us,aggr_us,update_us + │ │ │ │ │ │ nr_regions/min,max + │ │ │ │ │ targets/nr + │ │ │ │ │ │ 0/pid + │ │ │ │ │ │ │ regions/nr + │ │ │ │ │ │ │ │ 0/start,end + │ │ │ │ │ │ │ │ ... + │ │ │ │ │ │ ... + │ │ │ │ │ schemes/nr + │ │ │ │ │ │ 0/action + │ │ │ │ │ │ │ access_pattern/ + │ │ │ │ │ │ │ │ sz/min,max + │ │ │ │ │ │ │ │ nr_accesses/min,max + │ │ │ │ │ │ │ │ age/min,max + │ │ │ │ │ │ │ quotas/ms,sz,reset_interval_ms + │ │ │ │ │ │ │ │ weights/sz,nr_accesses,age + │ │ │ │ │ │ │ watermarks/metric,interval_us,high,mid,low + │ │ │ │ │ │ │ stats/nr_tried,sz_tried,nr_applied,sz_applied,qt_exceeds + │ │ │ │ │ │ ... + │ │ │ │ ... + │ │ ... + +Root +---- + +The root of the DAMON sysfs interface is ``/kernel/mm/damon/``, and it +has one directory named ``admin``. The directory contains the files for +privileged user space programs' control of DAMON. User space tools or deamons +having the root permission could use this directory. + +kdamonds/ +--------- + +The monitoring-related information including request specifications and results +are called DAMON context. DAMON executes each context with a kernel thread +called kdamond, and multiple kdamonds could run in parallel. + +Under the ``admin`` directory, one directory, ``kdamonds``, which has files for +controlling the kdamonds exist. In the beginning, this directory has only one +file, ``nr``. Writing a number (``N``) to the file creates the number of child +directories named ``0`` to ``N-1``. Each directory represents each kdamond. + +kdamonds// +------------- + +In each kdamond directory, two files (``state`` and ``pid``) and one directory +(``contexts``) exist. + +Reading ``state`` returns ``on`` if the kdamond is currently running, or +``off`` if it is not running. Writing ``on`` or ``off`` makes the kdamond be +in the state. Writing ``update_schemes_stats`` to ``state`` file updates the +contents of stats files for each DAMON-based operation scheme of the kdamond. +For details of the stats, please refer to :ref:`stats section +`. + +If the state is ``on``, reading ``pid`` shows the pid of the kdamond thread. + +``contexts`` directory contains files for controlling the monitoring contexts +that this kdamond will execute. + +kdamonds//contexts/ +---------------------- + +In the beginning, this directory has only one file, ``nr``. Writing a number +(``N``) to the file creates the number of child directories named as ``0`` to +``N-1``. Each directory represents each monitoring context. At the moment, +only one context per kdamond is supported, so only ``0`` or ``1`` can be +written to the file. + +contexts// +------------- + +In each context directory, one file (``operations``) and three directories +(``monitoring_attrs``, ``targets``, and ``schemes``) exist. + +DAMON supports multiple types of monitoring operations, including those for +virtual address space and the physical address space. You can set and get what +type of monitoring operations DAMON will use for the context by writing one of +below keywords to, and reading from the file. + + - vaddr: Monitor virtual address spaces of specific processes + - paddr: Monitor the physical address space of the system + +contexts//monitoring_attrs/ +------------------------------ + +Files for specifying attributes of the monitoring including required quality +and efficiency of the monitoring are in ``monitoring_attrs`` directory. +Specifically, two directories, ``intervals`` and ``nr_regions`` exist in this +directory. + +Under ``intervals`` directory, three files for DAMON's sampling interval +(``sample_us``), aggregation interval (``aggr_us``), and update interval +(``update_us``) exist. You can set and get the values in micro-seconds by +writing to and reading from the files. + +Under ``nr_regions`` directory, two files for the lower-bound and upper-bound +of DAMON's monitoring regions (``min`` and ``max``, respectively), which +controls the monitoring overhead, exist. You can set and get the values by +writing to and rading from the files. + +For more details about the intervals and monitoring regions range, please refer +to the Design document (:doc:`/vm/damon/design`). + +contexts//targets/ +--------------------- + +In the beginning, this directory has only one file, ``nr``. Writing a number +(``N``) to the file creates the number of child directories named ``0`` to +``N-1``. Each directory represents each monitoring target. + +targets// +------------ + +In each target directory, one file (``pid``) and one directory (``regions``) +exist. + +If you wrote ``vaddr`` to the ``contexts//operations``, each target should +be a process. You can specify the process to DAMON by writing the pid of the +process to the ``pid`` file. + +targets//regions +------------------- + +When ``vaddr`` monitoring operations set is being used (``vaddr`` is written to +the ``contexts//operations`` file), DAMON automatically sets and updates the +monitoring target regions so that entire memory mappings of target processes +can be covered. However, users could want to set the initial monitoring region +to specific address ranges. + +In contrast, DAMON do not automatically sets and updates the monitoring target +regions when ``paddr`` monitoring operations set is being used (``paddr`` is +written to the ``contexts//operations``). Therefore, users should set the +monitoring target regions by themselves in the case. + +For such cases, users can explicitly set the initial monitoring target regions +as they want, by writing proper values to the files under this directory. + +In the beginning, this directory has only one file, ``nr``. Writing a number +(``N``) to the file creates the number of child directories named ``0`` to +``N-1``. Each directory represents each initial monitoring target region. + +regions// +------------ + +In each region directory, you will find two files (``start`` and ``end``). You +can set and get the start and end addresses of the initial monitoring target +region by writing to and reading from the files, respectively. + +contexts//schemes/ +--------------------- + +For usual DAMON-based data access aware memory management optimizations, users +would normally want the system to apply a memory management action to a memory +region of a specific access pattern. DAMON receives such formalized operation +schemes from the user and applies those to the target memory regions. Users +can get and set the schemes by reading from and writing to files under this +directory. + +In the beginning, this directory has only one file, ``nr``. Writing a number +(``N``) to the file creates the number of child directories named ``0`` to +``N-1``. Each directory represents each DAMON-based operation scheme. + +schemes// +------------ + +In each scheme directory, four directories (``access_pattern``, ``quotas``, +``watermarks``, and ``stats``) and one file (``action``) exist. + +The ``action`` file is for setting and getting what action you want to apply to +memory regions having specific access pattern of the interest. The keywords +that can be written to and read from the file and their meaning are as below. + + - willneed: Call ``madvise()`` for the region with ``MADV_WILLNEED`` + - cold: Call ``madvise()`` for the region with ``MADV_COLD`` + - pageout: Call ``madvise()`` for the region with ``MADV_PAGEOUT`` + - hugepage: Call ``madvise()`` for the region with ``MADV_HUGEPAGE`` + - nohugepage: Call ``madvise()`` for the region with ``MADV_NOHUGEPAGE`` + - stat: Do nothing but count the statistics + +schemes//access_pattern/ +--------------------------- + +The target access pattern of each DAMON-based operation scheme is constructed +with three ranges including the size of the region in bytes, number of +monitored accesses per aggregate interval, and number of aggregated intervals +for the age of the region. + +Under the ``access_pattern`` directory, three directories (``sz``, +``nr_accesses``, and ``age``) each having two files (``min`` and ``max``) +exist. You can set and get the access pattern for the given scheme by writing +to and reading from the ``min`` and ``max`` files under ``sz``, +``nr_accesses``, and ``age`` directories, respectively. + +schemes//quotas/ +------------------- + +Optimal ``target access pattern`` for each ``action`` is workload dependent, so +not easy to find. Worse yet, setting a scheme of some action too aggressive +can cause severe overhead. To avoid such overhead, users can limit time and +size quota for each scheme. In detail, users can ask DAMON to try to use only +up to specific time (``time quota``) for applying the action, and to apply the +action to only up to specific amount (``size quota``) of memory regions having +the target access pattern within a given time interval (``reset interval``). + +When the quota limit is expected to be exceeded, DAMON prioritizes found memory +regions of the ``target access pattern`` based on their size, access frequency, +and age. For personalized prioritization, users can set the weights for the +three properties. + +Under ``quotas`` directory, three files (``ms``, ``sz``, ``reset_interval_ms``) +and one directory (``weights``) having three files (``sz``, ``nr_accesses``, +and ``age``) in it exist. + +You can set the ``time quota`` in milliseconds, ``size quota`` in bytes, and +``reset interval`` in milliseconds by writing the values to the three files, +respectively. You can also set the prioritization weights for size, access +frequency, and age in per-thousand unit by writing the values to the three +files under the ``weights`` directory. + +schemes//watermarks/ +----------------------- + +To allow easy activation and deactivation of each scheme based on system +status, DAMON provides a feature called watermarks. The feature receives five +values called ``metric``, ``interval``, ``high``, ``mid``, and ``low``. The +``metric`` is the system metric such as free memory ratio that can be measured. +If the metric value of the system is higher than the value in ``high`` or lower +than ``low`` at the memoent, the scheme is deactivated. If the value is lower +than ``mid``, the scheme is activated. + +Under the watermarks directory, five files (``metric``, ``interval_us``, +``high``, ``mid``, and ``low``) for setting each value exist. You can set and +get the five values by writing to the files, respectively. + +Keywords and meanings of those that can be written to the ``metric`` file are +as below. + + - none: Ignore the watermarks + - free_mem_rate: System's free memory rate (per thousand) + +The ``interval`` should written in microseconds unit. + +.. _sysfs_schemes_stats: + +schemes//stats/ +------------------ + +DAMON counts the total number and bytes of regions that each scheme is tried to +be applied, the two numbers for the regions that each scheme is successfully +applied, and the total number of the quota limit exceeds. This statistics can +be used for online analysis or tuning of the schemes. + +The statistics can be retrieved by reading the files under ``stats`` directory +(``nr_tried``, ``sz_tried``, ``nr_applied``, ``sz_applied``, and +``qt_exceeds``), respectively. The files are not updated in real time, so you +should ask DAMON sysfs interface to updte the content of the files for the +stats by writing a special keyword, ``update_schemes_stats`` to the relevant +``kdamonds//state`` file. + +Example +~~~~~~~ + +Below commands applies a scheme saying "If a memory region of size in [4KiB, +8KiB] is showing accesses per aggregate interval in [0, 5] for aggregate +interval in [10, 20], page out the region. For the paging out, use only up to +10ms per second, and also don't page out more than 1GiB per second. Under the +limitation, page out memory regions having longer age first. Also, check the +free memory rate of the system every 5 seconds, start the monitoring and paging +out when the free memory rate becomes lower than 50%, but stop it if the free +memory rate becomes larger than 60%, or lower than 30%". :: + + # cd /kernel/mm/damon/admin + # # populate directories + # echo 1 > kdamond/nr; echo 1 > kdamonds/0/contexts/nr; + # echo 1 > kdamonds/0/contexts/0/schemes/nr + # cd kdamonds/0/contexts/0/schemes/0 + # # set the basic access pattern and the action + # echo 4096 > access_patterns/sz/min + # echo 8192 > access_patterns/sz/max + # echo 0 > access_patterns/nr_accesses/min + # echo 5 > access_patterns/nr_accesses/max + # echo 10 > access_patterns/age/min + # echo 20 > access_patterns/age/max + # echo pageout > action + # # set quotas + # echo 10 > quotas/ms + # echo $((1024*1024*1024)) > quotas/sz + # echo 1000 > quotas/reset_interval_ms + # # set watermark + # echo free_mem_rate > watermarks/metric + # echo 5000000 > watermarks/interval_us + # echo 600 > watermarks/high + # echo 500 > watermarks/mid + # echo 300 > watermarks/low + +Please note that it's highly recommended to use user space tools like `damo +`_ rather than manually reading and writing +the files as above. Above is only for an example. .. _debugfs_interface: