From patchwork Wed Dec 20 11:38:33 2017
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Paolo Valente
X-Patchwork-Id: 122462
From: Paolo Valente
To: Jens Axboe
Cc: linux-block@vger.kernel.org, linux-kernel@vger.kernel.org,
 ulf.hansson@linaro.org, broonie@kernel.org, linus.walleij@linaro.org,
 angeloruocco90@gmail.com, bfq-iosched@googlegroups.com, Paolo Valente
Subject: [PATCH IMPROVEMENT/BUGFIX 3/4] block, bfq: let a queue be merged only shortly after starting I/O
Date: Wed, 20 Dec 2017 12:38:33 +0100
Message-Id: <20171220113834.2578-4-paolo.valente@linaro.org>
X-Mailer: git-send-email 2.15.1
In-Reply-To: <20171220113834.2578-1-paolo.valente@linaro.org>
References: <20171220113834.2578-1-paolo.valente@linaro.org>
Sender: linux-kernel-owner@vger.kernel.org
Precedence: bulk
X-Mailing-List: linux-kernel@vger.kernel.org

In BFQ and CFQ, two processes are said to be cooperating if they do
I/O in such a way that the union of their I/O requests yields a
sequential I/O pattern. To get such a sequential I/O pattern out of
the non-sequential pattern of each cooperating process, BFQ and CFQ
merge the queues associated with these processes.

In more detail, cooperating processes, and thus their associated
queues, usually start, or restart, to do I/O shortly after each
other. This is the case, e.g., for the I/O threads of KVM/QEMU and of
the dump utility. Based on this assumption, this commit allows a
bfq_queue to be merged only during a short time interval (100ms)
after it starts, or restarts, to do I/O.

This filtering provides two important benefits.

First, it greatly reduces the probability that two non-cooperating
processes have their queues merged by mistake, if they just happen to
do I/O close to each other for a short time interval. These spurious
merges cause loss of service guarantees. A low-weight bfq_queue may
unjustly get more than its expected share of the throughput: if such
a low-weight queue is merged with a high-weight queue, then the I/O
for the low-weight queue is served as if the queue had a high
weight. This may damage other high-weight queues unexpectedly.

For instance, because of this issue, lxterminal occasionally took 7.5
seconds to start, instead of 6.5 seconds, when some sequential
readers and writers did I/O in the background on a FUJITSU MHX2300BT
HDD. The reason is that the bfq_queues associated with some of the
readers or the writers were merged with the high-weight queues of
some processes that had to do some urgent but little I/O. The readers
then exploited the inherited high weight for all or most of their
I/O, during the start-up of the terminal. The filtering introduced by
this commit eliminated every outlier caused by spurious queue merges
in our start-up time tests.

Second, this filtering provides a small boost to the throughput
sustainable by BFQ: 3-4%, depending on the CPU. The reason is that,
once a bfq_queue cannot be merged any longer, this commit makes BFQ
stop updating the data needed to handle merging for the queue.

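The effect of an inherited weight can be illustrated with a little
arithmetic. The helper below is only a sketch under a simplified
proportional-share assumption (each backlogged queue gets throughput
in proportion to its weight); the name model_share_pct and the
weights used in the note that follows are hypothetical, and the model
ignores weight raising, idling and budgets.

```c
#include <assert.h>

/*
 * Hypothetical helper, not part of BFQ: percentage of the throughput
 * received by a queue of weight w when the weights of all backlogged
 * queues sum to total, under a plain proportional-share model.
 */
static inline double model_share_pct(unsigned int w, unsigned int total)
{
	assert(total > 0 && w <= total);
	return 100.0 * w / total;
}
```

With assumed weights of 100 for a background reader and 300 for an
interactive application, model_share_pct(300, 400) gives the
application 75% while both are backlogged. If the reader's I/O is
instead served through a spuriously merged queue of weight 300,
model_share_pct(300, 600) leaves the application only 50%.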
Signed-off-by: Paolo Valente
Signed-off-by: Angelo Ruocco
---
 block/bfq-iosched.c | 57 ++++++++++++++++++++++++++++++++++++++++++-----------
 block/bfq-iosched.h |  2 ++
 block/bfq-wf2q.c    |  4 ++++
 3 files changed, 52 insertions(+), 11 deletions(-)

-- 
2.15.1

diff --git a/block/bfq-iosched.c b/block/bfq-iosched.c
index f58367ef98c1..320022135dc8 100644
--- a/block/bfq-iosched.c
+++ b/block/bfq-iosched.c
@@ -166,6 +166,20 @@ static const int bfq_async_charge_factor = 10;
 /* Default timeout values, in jiffies, approximating CFQ defaults. */
 const int bfq_timeout = HZ / 8;
 
+/*
+ * Time limit for merging (see comments in bfq_setup_cooperator). Set
+ * to the lowest value that, in our tests, proved to be effective in
+ * removing false positives, while not causing true positives to miss
+ * queue merging.
+ *
+ * As can be deduced from the low time limit below, queue merging, if
+ * successful, happens at the very beginning of the I/O of the involved
+ * cooperating processes, as a consequence of the arrival of the very
+ * first requests from each cooperator. After that, there is very
+ * little chance to find cooperators.
+ */
+static const unsigned long bfq_merge_time_limit = HZ/10;
+
 static struct kmem_cache *bfq_pool;
 
 /* Below this threshold (in ns), we consider thinktime immediate. */
@@ -444,6 +458,13 @@ bfq_rq_pos_tree_lookup(struct bfq_data *bfqd, struct rb_root *root,
 	return bfqq;
 }
 
+static bool bfq_too_late_for_merging(struct bfq_queue *bfqq)
+{
+	return bfqq->service_from_backlogged > 0 &&
+		time_is_before_jiffies(bfqq->first_IO_time +
+				       bfq_merge_time_limit);
+}
+
 void bfq_pos_tree_add_move(struct bfq_data *bfqd, struct bfq_queue *bfqq)
 {
 	struct rb_node **p, *parent;
@@ -454,6 +475,14 @@ void bfq_pos_tree_add_move(struct bfq_data *bfqd, struct bfq_queue *bfqq)
 		bfqq->pos_root = NULL;
 	}
 
+	/*
+	 * bfqq cannot be merged any longer (see comments in
+	 * bfq_setup_cooperator): no point in adding bfqq into the
+	 * position tree.
+	 */
+	if (bfq_too_late_for_merging(bfqq))
+		return;
+
 	if (bfq_class_idle(bfqq))
 		return;
 	if (!bfqq->next_rq)
@@ -1935,6 +1964,9 @@ bfq_setup_merge(struct bfq_queue *bfqq, struct bfq_queue *new_bfqq)
 static bool bfq_may_be_close_cooperator(struct bfq_queue *bfqq,
 					struct bfq_queue *new_bfqq)
 {
+	if (bfq_too_late_for_merging(new_bfqq))
+		return false;
+
 	if (bfq_class_idle(bfqq) || bfq_class_idle(new_bfqq) ||
 	    (bfqq->ioprio_class != new_bfqq->ioprio_class))
 		return false;
@@ -2003,6 +2035,20 @@ bfq_setup_cooperator(struct bfq_data *bfqd, struct bfq_queue *bfqq,
 {
 	struct bfq_queue *in_service_bfqq, *new_bfqq;
 
+	/*
+	 * Prevent bfqq from being merged if it has been created too
+	 * long ago. The idea is that true cooperating processes, and
+	 * thus their associated bfq_queues, are supposed to be
+	 * created shortly after each other. This is the case, e.g.,
+	 * for KVM/QEMU and dump I/O threads. Based on this
+	 * assumption, the following filtering greatly reduces the
+	 * probability that two non-cooperating processes, which just
+	 * happen to do close I/O for some short time interval, have
+	 * their queues merged by mistake.
+	 */
+	if (bfq_too_late_for_merging(bfqq))
+		return NULL;
+
 	if (bfqq->new_bfqq)
 		return bfqq->new_bfqq;
 
@@ -3044,17 +3090,6 @@ void bfq_bfqq_expire(struct bfq_data *bfqd,
 	 */
 	slow = bfq_bfqq_is_slow(bfqd, bfqq, compensate, reason, &delta);
 
-	/*
-	 * Increase service_from_backlogged before next statement,
-	 * because the possible next invocation of
-	 * bfq_bfqq_charge_time would likely inflate
-	 * entity->service. In contrast, service_from_backlogged must
-	 * contain real service, to enable the soft real-time
-	 * heuristic to correctly compute the bandwidth consumed by
-	 * bfqq.
-	 */
-	bfqq->service_from_backlogged += entity->service;
-
 	/*
 	 * As above explained, charge slow (typically seeky) and
 	 * timed-out queues with the time and not the service
diff --git a/block/bfq-iosched.h b/block/bfq-iosched.h
index 91c4390903a1..5d47b58d5fc8 100644
--- a/block/bfq-iosched.h
+++ b/block/bfq-iosched.h
@@ -344,6 +344,8 @@ struct bfq_queue {
 	unsigned long wr_start_at_switch_to_srt;
 
 	unsigned long split_time; /* time of last split */
+
+	unsigned long first_IO_time; /* time of first I/O for this queue */
 };
 
 /**
diff --git a/block/bfq-wf2q.c b/block/bfq-wf2q.c
index e495d3f9b4b0..4456eda34e48 100644
--- a/block/bfq-wf2q.c
+++ b/block/bfq-wf2q.c
@@ -835,6 +835,10 @@ void bfq_bfqq_served(struct bfq_queue *bfqq, int served)
 	struct bfq_entity *entity = &bfqq->entity;
 	struct bfq_service_tree *st;
 
+	if (!bfqq->service_from_backlogged)
+		bfqq->first_IO_time = jiffies;
+
+	bfqq->service_from_backlogged += served;
 	for_each_entity(entity) {
 		st = bfq_entity_service_tree(entity);
 
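
For readers who want to experiment with the time window outside the
kernel, the following is a minimal user-space sketch of the check the
patch introduces. It is not the kernel code: every model_* name and
MODEL_HZ (an assumed tick rate of 250) is hypothetical, the current
time is passed in explicitly instead of being read from jiffies, and
the comparison imitates the wraparound-safe style of the kernel's
time_after() macro.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for kernel definitions (assumptions). */
#define MODEL_HZ 250UL                          /* assumed ticks per second */
#define MODEL_MERGE_TIME_LIMIT (MODEL_HZ / 10)  /* 100 ms, as in the patch */

struct model_queue {
	unsigned long service_from_backlogged; /* service received so far */
	unsigned long first_IO_time;           /* tick of the first service */
};

/* True if tick a is strictly later than tick b, even across wraparound. */
static inline bool model_time_after(unsigned long a, unsigned long b)
{
	return (long)(b - a) < 0;
}

/* Mirrors the accounting added to bfq_bfqq_served(): the first service
 * received by the queue records when its I/O started. */
static inline void model_served(struct model_queue *q, unsigned long now,
				unsigned long served)
{
	assert(served > 0);
	if (!q->service_from_backlogged)
		q->first_IO_time = now;
	q->service_from_backlogged += served;
}

/* Mirrors bfq_too_late_for_merging(): a queue that has received service
 * may be merged only until the time limit after its first I/O expires. */
static inline bool model_too_late_for_merging(const struct model_queue *q,
					      unsigned long now)
{
	return q->service_from_backlogged > 0 &&
	       model_time_after(now, q->first_IO_time + MODEL_MERGE_TIME_LIMIT);
}
```

In this model a queue that has not yet received any service is never
too late for merging, and a queue first served at tick t stops being
mergeable at tick t + MODEL_MERGE_TIME_LIMIT + 1.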