From patchwork Fri Dec 15 06:23:12 2017
X-Patchwork-Submitter: Paolo Valente
X-Patchwork-Id: 122043
From: Paolo Valente
To: Jens Axboe
Cc: linux-block@vger.kernel.org, linux-kernel@vger.kernel.org,
 ulf.hansson@linaro.org, broonie@kernel.org, linus.walleij@linaro.org,
 angeloruocco90@gmail.com, bfq-iosched@googlegroups.com, Paolo Valente
Subject: [PATCH IMPROVEMENT] block, bfq: consider also past I/O in soft
 real-time detection
Date: Fri, 15 Dec 2017 07:23:12 +0100
Message-Id: <20171215062312.1836-2-paolo.valente@linaro.org>
In-Reply-To: <20171215062312.1836-1-paolo.valente@linaro.org>
References: <20171215062312.1836-1-paolo.valente@linaro.org>

BFQ privileges the I/O of soft real-time applications, such as video
players, to guarantee these applications high bandwidth and low
latency. In this respect, it is not easy to correctly detect when an
application is soft real-time. A particularly nasty false positive is
that of an I/O-bound application that occasionally happens to meet all
requirements to be deemed as soft real-time. After being detected as
soft real-time, such an application monopolizes the device.
Fortunately, BFQ will soon realize that the application is actually
not soft real-time and suspend every privilege. Yet, the application
may happen again to be wrongly detected as soft real-time, and so on.

As highlighted by our tests, this problem causes BFQ to occasionally
fail to guarantee high responsiveness in the presence of heavy
background I/O workloads. The reason is that the background workload
happens to be detected as soft real-time, more or less frequently,
during the execution of the interactive task under test. To give an
idea, because of this problem, LibreOffice Writer occasionally takes 8
seconds, instead of 3, to start up, if there are sequential reads and
writes in the background, on a Kingston SSDNow V300.

This commit addresses this issue by leveraging the following facts.

The reason why some applications are detected as soft real-time,
despite all BFQ checks to avoid false positives, is simply that,
during high CPU or storage-device load, I/O-bound applications may
happen to do I/O slowly enough to meet all soft real-time
requirements, and pass all BFQ extra checks. Yet, this happens only
for limited time periods: slow-speed time intervals are usually
interspersed between other time intervals during which these
applications do I/O at a very high speed.

To exploit these facts, this commit introduces a small change in the
detection of soft real-time behavior, to systematically consider also
the recent past: the higher the speed was in the recent past, the
later the next I/O must arrive for the application to be considered
soft real-time. At the beginning of a slow-speed interval, the minimum
arrival time allowed for the next I/O usually happens to still be so
high as to fall *after* the end of the slow-speed period itself. As a
consequence, the application does not risk being deemed soft real-time
during the slow-speed interval. Then, during the next high-speed
interval, the application cannot, evidently, be deemed soft real-time
(exactly because of its speed), and so on.

This extra filtering proved to be rather effective: in the above test,
the frequency of false positives became so low that the start-up time
was 3 seconds in all iterations (apart from occasional outliers,
caused by page-cache-management issues, which are out of the scope of
this commit, and cannot be solved by an I/O scheduler).

Signed-off-by: Paolo Valente
Signed-off-by: Angelo Ruocco
---
 block/bfq-iosched.c | 115 ++++++++++++++++++++++++++++++++++++----------------
 1 file changed, 81 insertions(+), 34 deletions(-)

--
2.10.0

diff --git a/block/bfq-iosched.c b/block/bfq-iosched.c
index bcb6d21..a9e06217 100644
--- a/block/bfq-iosched.c
+++ b/block/bfq-iosched.c
@@ -2917,45 +2917,87 @@ static bool bfq_bfqq_is_slow(struct bfq_data *bfqd, struct bfq_queue *bfqq,
  * whereas soft_rt_next_start is set to infinity for applications that do
  * not.
  *
- * Unfortunately, even a greedy application may happen to behave in an
- * isochronous way if the CPU load is high. In fact, the application may
- * stop issuing requests while the CPUs are busy serving other processes,
- * then restart, then stop again for a while, and so on. In addition, if
- * the disk achieves a low enough throughput with the request pattern
- * issued by the application (e.g., because the request pattern is random
- * and/or the device is slow), then the application may meet the above
- * bandwidth requirement too. To prevent such a greedy application to be
- * deemed as soft real-time, a further rule is used in the computation of
- * soft_rt_next_start: soft_rt_next_start must be higher than the current
- * time plus the maximum time for which the arrival of a request is waited
- * for when a sync queue becomes idle, namely bfqd->bfq_slice_idle.
- * This filters out greedy applications, as the latter issue instead their
- * next request as soon as possible after the last one has been completed
- * (in contrast, when a batch of requests is completed, a soft real-time
- * application spends some time processing data).
+ * Unfortunately, even a greedy (i.e., I/O-bound) application may
+ * happen to meet, occasionally or systematically, both the above
+ * bandwidth and isochrony requirements. This may happen at least in
+ * the following circumstances. First, if the CPU load is high. The
+ * application may stop issuing requests while the CPUs are busy
+ * serving other processes, then restart, then stop again for a while,
+ * and so on. The other circumstances are related to the storage
+ * device: the storage device is highly loaded or reaches a low-enough
+ * throughput with the I/O of the application (e.g., because the I/O
+ * is random and/or the device is slow). In all these cases, the
+ * I/O of the application may be simply slowed down enough to meet
+ * the bandwidth and isochrony requirements. To reduce the probability
+ * that greedy applications are deemed as soft real-time in these
+ * corner cases, a further rule is used in the computation of
+ * soft_rt_next_start: the return value of this function is forced to
+ * be higher than the maximum between the following two quantities.
  *
- * Unfortunately, the last filter may easily generate false positives if
- * only bfqd->bfq_slice_idle is used as a reference time interval and one
- * or both the following cases occur:
- * 1) HZ is so low that the duration of a jiffy is comparable to or higher
- *    than bfqd->bfq_slice_idle. This happens, e.g., on slow devices with
- *    HZ=100.
+ * (a) Current time plus: (1) the maximum time for which the arrival
+ *     of a request is waited for when a sync queue becomes idle,
+ *     namely bfqd->bfq_slice_idle, and (2) a few extra jiffies. We
+ *     postpone for a moment the reason for adding a few extra
+ *     jiffies; we get back to it after next item (b). Lower-bounding
+ *     the return value of this function with the current time plus
+ *     bfqd->bfq_slice_idle tends to filter out greedy applications,
+ *     because the latter issue their next request as soon as possible
+ *     after the last one has been completed. In contrast, a soft
+ *     real-time application spends some time processing data, after a
+ *     batch of its requests has been completed.
+ *
+ * (b) Current value of bfqq->soft_rt_next_start. As pointed out
+ *     above, greedy applications may happen to meet both the
+ *     bandwidth and isochrony requirements under heavy CPU or
+ *     storage-device load. In more detail, in these scenarios, these
+ *     applications happen, only for limited time periods, to do I/O
+ *     slowly enough to meet all the requirements described so far,
+ *     including the filtering in above item (a). These slow-speed
+ *     time intervals are usually interspersed between other time
+ *     intervals during which these applications do I/O at a very high
+ *     speed. Fortunately, exactly because of the high speed of the
+ *     I/O in the high-speed intervals, the values returned by this
+ *     function happen to be so high, near the end of any such
+ *     high-speed interval, to be likely to fall *after* the end of
+ *     the low-speed time interval that follows. These high values are
+ *     stored in bfqq->soft_rt_next_start after each invocation of
+ *     this function. As a consequence, if the last value of
+ *     bfqq->soft_rt_next_start is constantly used to lower-bound the
+ *     next value that this function may return, then, from the very
+ *     beginning of a low-speed interval, bfqq->soft_rt_next_start is
+ *     likely to be constantly kept so high that any I/O request
+ *     issued during the low-speed interval is considered as arriving
+ *     too soon for the application to be deemed as soft
+ *     real-time. Then, in the high-speed interval that follows, the
+ *     application will not be deemed as soft real-time, just because
+ *     it will do I/O at a high speed. And so on.
+ *
+ * Getting back to the filtering in item (a), in the following two
+ * cases this filtering might be easily passed by a greedy
+ * application, if the reference quantity was just
+ * bfqd->bfq_slice_idle:
+ * 1) HZ is so low that the duration of a jiffy is comparable to or
+ *    higher than bfqd->bfq_slice_idle. This happens, e.g., on slow
+ *    devices with HZ=100. The time granularity may be so coarse
+ *    that the approximation, in jiffies, of bfqd->bfq_slice_idle
+ *    is rather lower than the exact value.
  * 2) jiffies, instead of increasing at a constant rate, may stop increasing
  *    for a while, then suddenly 'jump' by several units to recover the lost
  *    increments. This seems to happen, e.g., inside virtual machines.
- * To address this issue, we do not use as a reference time interval just
- * bfqd->bfq_slice_idle, but bfqd->bfq_slice_idle plus a few jiffies. In
- * particular we add the minimum number of jiffies for which the filter
- * seems to be quite precise also in embedded systems and KVM/QEMU virtual
- * machines.
+ * To address this issue, in the filtering in (a) we do not use as a
+ * reference time interval just bfqd->bfq_slice_idle, but
+ * bfqd->bfq_slice_idle plus a few jiffies. In particular, we add the
+ * minimum number of jiffies for which the filter seems to be quite
+ * precise also in embedded systems and KVM/QEMU virtual machines.
  */
 static unsigned long bfq_bfqq_softrt_next_start(struct bfq_data *bfqd,
 						struct bfq_queue *bfqq)
 {
-	return max(bfqq->last_idle_bklogged +
-		   HZ * bfqq->service_from_backlogged /
-		   bfqd->bfq_wr_max_softrt_rate,
-		   jiffies + nsecs_to_jiffies(bfqq->bfqd->bfq_slice_idle) + 4);
+	return max3(bfqq->soft_rt_next_start,
+		    bfqq->last_idle_bklogged +
+		    HZ * bfqq->service_from_backlogged /
+		    bfqd->bfq_wr_max_softrt_rate,
+		    jiffies + nsecs_to_jiffies(bfqq->bfqd->bfq_slice_idle) + 4);
 }
 
 /**
@@ -4002,10 +4044,15 @@ static void bfq_init_bfqq(struct bfq_data *bfqd, struct bfq_queue *bfqq,
 	bfqq->split_time = bfq_smallest_from_now();
 
 	/*
-	 * Set to the value for which bfqq will not be deemed as
-	 * soft rt when it becomes backlogged.
+	 * To not forget the possibly high bandwidth consumed by a
+	 * process/queue in the recent past,
+	 * bfq_bfqq_softrt_next_start() returns a value at least equal
+	 * to the current value of bfqq->soft_rt_next_start (see
+	 * comments on bfq_bfqq_softrt_next_start). Set
+	 * soft_rt_next_start to now, to mean that bfqq has consumed
+	 * no bandwidth so far.
 	 */
-	bfqq->soft_rt_next_start = bfq_greatest_from_now();
+	bfqq->soft_rt_next_start = jiffies;
 
 	/* first request is almost certainly seeky */
 	bfqq->seek_history = 1;
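
For readers who want to see the effect of the new max3() lower bound
outside the kernel, the following is a minimal user-space sketch, not
kernel code. Everything in it is an assumption made for illustration:
the struct queue_model type, the model_softrt_next_start() helper, the
HZ value of 250, and the parameters passed by the caller (current time,
maximum soft real-time rate, and slice_idle already converted to
jiffies). Jiffies wraparound is ignored.

```c
#include <assert.h>

/* Illustrative tick rate; the real HZ is a kernel build option. */
#define HZ 250UL

/* Hypothetical stand-in for the three bfq_queue fields the patch uses. */
struct queue_model {
	unsigned long soft_rt_next_start;      /* jiffies */
	unsigned long last_idle_bklogged;      /* jiffies */
	unsigned long service_from_backlogged; /* sectors */
};

static unsigned long max3ul(unsigned long a, unsigned long b, unsigned long c)
{
	unsigned long m = a > b ? a : b;

	return m > c ? m : c;
}

/*
 * Same shape as the patched bfq_bfqq_softrt_next_start(), with the
 * kernel globals replaced by explicit parameters:
 *  - q->soft_rt_next_start: value remembered from the recent past,
 *    i.e. item (b) in the patch comment;
 *  - bandwidth term: earliest arrival time compatible with the
 *    maximum soft real-time rate (sectors per second);
 *  - now + slice_idle + 4: filter (a) in the patch comment.
 */
static unsigned long model_softrt_next_start(const struct queue_model *q,
					     unsigned long now,
					     unsigned long max_softrt_rate,
					     unsigned long slice_idle_jiffies)
{
	assert(max_softrt_rate > 0);

	return max3ul(q->soft_rt_next_start,
		      q->last_idle_bklogged +
		      HZ * q->service_from_backlogged / max_softrt_rate,
		      now + slice_idle_jiffies + 4);
}
```

As a worked case under these assumptions: with a rate of 7000 sectors
per second and slice_idle equal to 2 jiffies, a queue that went idle at
jiffy 1000 after receiving 70000 sectors gets a next-start value of
3500 when evaluated at jiffy 1100. If that value is stored and the
queue then enters a slow interval (idle at 1200, 280 sectors served,
evaluated at 1300), the bandwidth term and filter (a) alone would yield
1306, whereas the remembered value keeps the result at 3500, so a
request arriving at, say, jiffy 1400 is still too early for the queue
to be deemed soft real-time.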