From patchwork Wed Dec 20 11:38:33 2017
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Paolo Valente <paolo.valente@linaro.org>
X-Patchwork-Id: 122462
Delivered-To: patch@linaro.org
From: Paolo Valente <paolo.valente@linaro.org>
To: Jens Axboe <axboe@kernel.dk>
Cc: linux-block@vger.kernel.org, linux-kernel@vger.kernel.org,
 ulf.hansson@linaro.org, broonie@kernel.org,
 linus.walleij@linaro.org, angeloruocco90@gmail.com,
 bfq-iosched@googlegroups.com, Paolo Valente <paolo.valente@linaro.org>
Subject: [PATCH IMPROVEMENT/BUGFIX 3/4] block,
 bfq: let a queue be merged only shortly after starting I/O
Date: Wed, 20 Dec 2017 12:38:33 +0100
Message-Id: <20171220113834.2578-4-paolo.valente@linaro.org>
X-Mailer: git-send-email 2.15.1
In-Reply-To: <20171220113834.2578-1-paolo.valente@linaro.org>
References: <20171220113834.2578-1-paolo.valente@linaro.org>
Sender: linux-kernel-owner@vger.kernel.org
Precedence: bulk
List-ID: <linux-kernel.vger.kernel.org>
X-Mailing-List: linux-kernel@vger.kernel.org

In BFQ and CFQ, two processes are said to be cooperating if they do
I/O in such a way that the union of their I/O requests yields a
sequential I/O pattern. To get such a sequential I/O pattern out of
the non-sequential pattern of each cooperating process, BFQ and CFQ
merge the queues associated with these processes. Cooperating
processes, and thus their associated queues, usually start, or
restart, doing I/O shortly after each other. This is the case, e.g.,
for the I/O threads of KVM/QEMU and of the dump utility. Based on
this assumption, this commit allows a bfq_queue to be merged only
during a short time interval (100 ms) after it starts, or restarts,
doing I/O. This filtering provides two important benefits.

First, it greatly reduces the probability that two non-cooperating
processes have their queues merged by mistake, just because they
happen to do I/O close to each other for a short time interval. These
spurious merges cause loss of service guarantees. A low-weight
bfq_queue may unjustly get more than its expected share of the
throughput: if such a low-weight queue is merged with a high-weight
queue, then the I/O for the low-weight queue is served as if the
queue had a high weight. This may unexpectedly penalize other
high-weight queues. For instance, because of this issue, lxterminal
occasionally took 7.5 seconds to start, instead of 6.5 seconds, when
some sequential readers and writers did I/O in the background on a
FUJITSU MHX2300BT HDD. The reason is that the bfq_queues associated
with some of the readers or writers were merged with the high-weight
queues of some processes that had to do a small amount of urgent
I/O. The readers then exploited the inherited high weight for all or
most of their I/O during the start-up of lxterminal. The filtering
introduced by this commit eliminated all outliers caused by spurious
queue merges in our start-up time tests.

Second, this filtering provides a small boost of the throughput
sustainable by BFQ: 3-4%, depending on the CPU. The reason is that,
once a bfq_queue can no longer be merged, this commit makes BFQ stop
updating the data needed to handle merging for that queue.

Signed-off-by: Paolo Valente <paolo.valente@linaro.org>
Signed-off-by: Angelo Ruocco <angeloruocco90@gmail.com>
---
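Note for reviewers: the time check at the core of this patch can be
tried out in userspace with the minimal sketch below. It is not the
kernel code: the sketch_* names and SKETCH_HZ are hypothetical
stand-ins for bfq_queue, jiffies and HZ, and tick values are passed in
explicitly instead of being read from a clock.

```c
#include <stdbool.h>

/* Hypothetical stand-ins: the kernel uses HZ and jiffies instead. */
#define SKETCH_HZ 250UL
#define SKETCH_MERGE_TIME_LIMIT (SKETCH_HZ / 10) /* 100 ms, in ticks */

struct sketch_queue {
	unsigned long service_from_backlogged; /* service received so far */
	unsigned long first_io_time;           /* tick of the first service */
};

/* Wraparound-safe "a is after b", same idea as the kernel's time_after(). */
static bool sketch_time_after(unsigned long a, unsigned long b)
{
	return (long)(b - a) < 0;
}

/* Record the tick of the first service, then account for the service. */
static void sketch_served(struct sketch_queue *q, unsigned long now,
			  unsigned long served)
{
	if (!q->service_from_backlogged)
		q->first_io_time = now;

	q->service_from_backlogged += served;
}

/* True once the queue has been served and the merge window has elapsed. */
static bool sketch_too_late_for_merging(const struct sketch_queue *q,
					unsigned long now)
{
	return q->service_from_backlogged > 0 &&
		sketch_time_after(now, q->first_io_time +
				       SKETCH_MERGE_TIME_LIMIT);
}
```

With SKETCH_HZ set to 250 the limit is 25 ticks, so a queue first
served at tick 1000 remains mergeable up to and including tick 1025,
and is filtered out from tick 1026 on. A queue that has received no
service yet is never filtered out.
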
 block/bfq-iosched.c | 57 ++++++++++++++++++++++++++++++++++++++++++-----------
 block/bfq-iosched.h |  2 ++
 block/bfq-wf2q.c    |  4 ++++
 3 files changed, 52 insertions(+), 11 deletions(-)
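
The loss of service guarantees caused by a spurious merge can be
illustrated with back-of-the-envelope arithmetic, under the
simplifying assumption that throughput is shared among the active
queues in proportion to their weights. The weights used here (100 and
300) and the helper name are made up for illustration; they are not
taken from the start-up time tests.

```c
/*
 * Hypothetical helper: share, in percent, obtained by a queue of
 * weight w when the weights of all active queues sum to total.
 */
static unsigned int sketch_share_percent(unsigned int w, unsigned int total)
{
	return total ? (100u * w) / total : 0;
}
```

A high-weight queue (300) competing against a low-weight background
reader (100) gets sketch_share_percent(300, 400), i.e., 75%. If the
reader is spuriously merged with another high-weight queue, and is
then served with weight 300, the same queue gets only
sketch_share_percent(300, 600), i.e., 50%.
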

-- 
2.15.1

diff --git a/block/bfq-iosched.c b/block/bfq-iosched.c
index f58367ef98c1..320022135dc8 100644
--- a/block/bfq-iosched.c
+++ b/block/bfq-iosched.c
@@ -166,6 +166,20 @@ static const int bfq_async_charge_factor = 10;
 /* Default timeout values, in jiffies, approximating CFQ defaults. */
 const int bfq_timeout = HZ / 8;
 
+/*
+ * Time limit for merging (see comments in bfq_setup_cooperator). Set
+ * to the lowest value that, in our tests, proved to be effective in
+ * removing false positives, while not causing true positives to miss
+ * queue merging.
+ *
+ * As can be deduced from the low time limit below, queue merging, if
+ * successful, happens at the very beginning of the I/O of the involved
+ * cooperating processes, as a consequence of the arrival of the very
+ * first requests from each cooperator.  After that, there is very
+ * little chance to find cooperators.
+ */
+static const unsigned long bfq_merge_time_limit = HZ / 10;
+
 static struct kmem_cache *bfq_pool;
 
 /* Below this threshold (in ns), we consider thinktime immediate. */
@@ -444,6 +458,13 @@ bfq_rq_pos_tree_lookup(struct bfq_data *bfqd, struct rb_root *root,
 	return bfqq;
 }
 
+static bool bfq_too_late_for_merging(struct bfq_queue *bfqq)
+{
+	return bfqq->service_from_backlogged > 0 &&
+		time_is_before_jiffies(bfqq->first_IO_time +
+				       bfq_merge_time_limit);
+}
+
 void bfq_pos_tree_add_move(struct bfq_data *bfqd, struct bfq_queue *bfqq)
 {
 	struct rb_node **p, *parent;
@@ -454,6 +475,14 @@ void bfq_pos_tree_add_move(struct bfq_data *bfqd, struct bfq_queue *bfqq)
 		bfqq->pos_root = NULL;
 	}
 
+	/*
+	 * bfqq cannot be merged any longer (see comments in
+	 * bfq_setup_cooperator): no point in adding bfqq into the
+	 * position tree.
+	 */
+	if (bfq_too_late_for_merging(bfqq))
+		return;
+
 	if (bfq_class_idle(bfqq))
 		return;
 	if (!bfqq->next_rq)
@@ -1935,6 +1964,9 @@ bfq_setup_merge(struct bfq_queue *bfqq, struct bfq_queue *new_bfqq)
 static bool bfq_may_be_close_cooperator(struct bfq_queue *bfqq,
 					struct bfq_queue *new_bfqq)
 {
+	if (bfq_too_late_for_merging(new_bfqq))
+		return false;
+
 	if (bfq_class_idle(bfqq) || bfq_class_idle(new_bfqq) ||
 	    (bfqq->ioprio_class != new_bfqq->ioprio_class))
 		return false;
@@ -2003,6 +2035,20 @@ bfq_setup_cooperator(struct bfq_data *bfqd, struct bfq_queue *bfqq,
 {
 	struct bfq_queue *in_service_bfqq, *new_bfqq;
 
+	/*
+	 * Prevent bfqq from being merged if it started doing I/O too
+	 * long ago. The idea is that truly cooperating processes, and
+	 * thus their associated bfq_queues, are supposed to start
+	 * doing I/O shortly after each other. This is the case, e.g.,
+	 * for KVM/QEMU and dump I/O threads. Based on this
+	 * assumption, the following filtering greatly reduces the
+	 * probability that two non-cooperating processes, which just
+	 * happen to do close I/O for some short time interval, have
+	 * their queues merged by mistake.
+	 */
+	if (bfq_too_late_for_merging(bfqq))
+		return NULL;
+
 	if (bfqq->new_bfqq)
 		return bfqq->new_bfqq;
 
@@ -3044,17 +3090,6 @@ void bfq_bfqq_expire(struct bfq_data *bfqd,
 	 */
 	slow = bfq_bfqq_is_slow(bfqd, bfqq, compensate, reason, &delta);
 
-	/*
-	 * Increase service_from_backlogged before next statement,
-	 * because the possible next invocation of
-	 * bfq_bfqq_charge_time would likely inflate
-	 * entity->service. In contrast, service_from_backlogged must
-	 * contain real service, to enable the soft real-time
-	 * heuristic to correctly compute the bandwidth consumed by
-	 * bfqq.
-	 */
-	bfqq->service_from_backlogged += entity->service;
-
 	/*
 	 * As above explained, charge slow (typically seeky) and
 	 * timed-out queues with the time and not the service
diff --git a/block/bfq-iosched.h b/block/bfq-iosched.h
index 91c4390903a1..5d47b58d5fc8 100644
--- a/block/bfq-iosched.h
+++ b/block/bfq-iosched.h
@@ -344,6 +344,8 @@ struct bfq_queue {
 	unsigned long wr_start_at_switch_to_srt;
 
 	unsigned long split_time; /* time of last split */
+
+	unsigned long first_IO_time; /* time of first I/O for this queue */
 };
 
 /**
diff --git a/block/bfq-wf2q.c b/block/bfq-wf2q.c
index e495d3f9b4b0..4456eda34e48 100644
--- a/block/bfq-wf2q.c
+++ b/block/bfq-wf2q.c
@@ -835,6 +835,10 @@ void bfq_bfqq_served(struct bfq_queue *bfqq, int served)
 	struct bfq_entity *entity = &bfqq->entity;
 	struct bfq_service_tree *st;
 
+	if (!bfqq->service_from_backlogged)
+		bfqq->first_IO_time = jiffies;
+
+	bfqq->service_from_backlogged += served;
 	for_each_entity(entity) {
 		st = bfq_entity_service_tree(entity);