From patchwork Mon Mar 4 20:16:23 2024
From: Christian Loehle
To: linux-kernel@vger.kernel.org
Cc: peterz@infradead.org, juri.lelli@redhat.com, mingo@redhat.com,
    rafael@kernel.org, dietmar.eggemann@arm.com, vschneid@redhat.com,
    vincent.guittot@linaro.org, Johannes.Thumshirn@wdc.com,
    adrian.hunter@intel.com, ulf.hansson@linaro.org, andres@anarazel.de,
    asml.silence@gmail.com, linux-pm@vger.kernel.org,
    linux-block@vger.kernel.org, io-uring@vger.kernel.org,
    Christian Loehle
Subject: [RFC PATCH 0/2] Introduce per-task io utilization boost
Date: Mon, 4 Mar 2024 20:16:23 +0000
Message-Id: <20240304201625.100619-1-christian.loehle@arm.com>

There is a feature in both schedutil and intel_pstate called iowait
boosting, which tries to prevent a low frequency from being selected
during IO workloads where that would hurt throughput. It works by
checking for task wakeups that have the in_iowait flag set and boosting
the CPU of the rq accordingly (implemented through
cpufreq_update_util(rq, SCHED_CPUFREQ_IOWAIT)).
The feature is justified by the potentially low utilization of a task
that is frequently in_iowait (i.e. most of the time not enqueued on any
rq and therefore unable to build up utilization).

This RFC focuses on the schedutil implementation. intel_pstate frequency
selection isn't touched for now; suggestions are very welcome.

Current schedutil iowait boosting has several issues:
1. Boosting happens even in scenarios where it doesn't improve
   throughput. [1]
2. The boost is not accounted for in EAS:
   a) feec() will only consider the actual utilization for task
      placement, but another CPU might be more energy-efficient at that
      capacity than the boosted one.
   b) When placing a non-IO task while a CPU is boosted,
      compute_energy() will not consider the (potentially 'free')
      boosted capacity, but the capacity it would have without the
      boost (since the boost is only applied in sugov).
3. Actual IO-heavy workloads are hardly distinguished from infrequent
   in_iowait wakeups.
4. The boost isn't associated with a task; it therefore isn't
   considered for task placement, potentially missing out on
   higher-capacity CPUs on heterogeneous CPU topologies.
5. The boost isn't associated with a task; it therefore lingers on the
   rq even after the responsible task has migrated or stopped.
6. The boost isn't associated with a task; it therefore needs to ramp
   up again after the task migrates.
7. Since schedutil doesn't know which task is being woken up, multiple
   unrelated in_iowait tasks might lead to boosting.

We attempt to mitigate all of the above by reworking the way iowait
boosting (io boosting from here on) works, in two major ways:
- Carry the boost in task_struct, so it is a per-task attribute and
  behaves similarly to the task's utilization in some ways.
- Employ a counting-based tracking strategy that only boosts as long as
  it sees benefits and dynamically returns to no boosting.

Note that some of the issues (1, 3) could be solved with a
counting-based strategy on a per-rq basis, i.e. entirely in sugov.
Experiments, with Android in particular, showed that such a strategy
(which necessarily needs longer intervals to be reasonably stable) is
too prone to migrations to be generally useful. We therefore consider
the additional complexity of the proposed per-task approach to be
worth it.

We require a minimum of 1000 iowait wakeups per second to start
boosting. This isn't too far off from what sugov currently does, since
it resets the boost if it hasn't seen an iowait wakeup for TICK_NSEC.
For CONFIG_HZ=1000 we are on par; for anything below we are stricter.
We justify this with the small improvement that boosting can offer in
the first place when iowait wakeups are 'rare'.

When IO even leads to a task being in iowait isn't as straightforward
to explain. If the issued IO can be served by the page cache (e.g.
on reads because the pages are already cached, on writes because they
can be marked dirty and writeback takes care of them later), the
issuing task is usually not in iowait. We consider this the good case,
since whenever the scheduler and a potential userspace/kernel switch is
in the critical path for IO, there is likely overhead impacting
throughput. We therefore focus on random reads from here on, because
(with synchronous IO [3]) every IO will set the task in iowait. This is
where iowait boosting shows its biggest throughput improvement.