From patchwork Sun Mar 10 18:11:34 2019
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Paolo Valente <paolo.valente@linaro.org>
X-Patchwork-Id: 160031
Delivered-To: patch@linaro.org
From: Paolo Valente <paolo.valente@linaro.org>
To: Jens Axboe <axboe@kernel.dk>
Cc: linux-block@vger.kernel.org, linux-kernel@vger.kernel.org,
 ulf.hansson@linaro.org, linus.walleij@linaro.org,
 broonie@kernel.org, bfq-iosched@googlegroups.com,
 oleksandr@natalenko.name, fra.fra.800@gmail.com,
 alessio.masola@gmail.com, Paolo Valente <paolo.valente@linaro.org>
Subject: [PATCH BUGFIX IMPROVEMENT V2 6/9] block,
 bfq: always protect newly-created queues from existing active queues
Date: Sun, 10 Mar 2019 19:11:34 +0100
Message-Id: <20190310181137.2604-7-paolo.valente@linaro.org>
X-Mailer: git-send-email 2.20.1
In-Reply-To: <20190310181137.2604-1-paolo.valente@linaro.org>
References: <20190310181137.2604-1-paolo.valente@linaro.org>
MIME-Version: 1.0
Sender: linux-kernel-owner@vger.kernel.org
Precedence: bulk
List-ID: <linux-kernel.vger.kernel.org>
X-Mailing-List: linux-kernel@vger.kernel.org

If many bfq_queues belonging to the same group are created shortly
after each other, then the processes associated with these queues
typically have a common goal. In particular, bursts of queue
creations are usually caused by services or applications that spawn
many parallel threads/processes. Examples are systemd during boot, or
git grep. If there are no other active queues, then the best way to
help these processes get their job done as soon as possible is to
reach a high throughput. To this end, it is usually better not to
grant either weight-raising or device idling to the queues
associated with these processes. And this is exactly what BFQ
currently does.

There is, however, a drawback: if some other queues are already
active, then the newly created queues must be protected from the I/O
flowing through the already existing queues. In this case, the best
thing to do is the opposite of the previous case: it is much better
to grant weight-raising and device idling to the newly created
queues, if they deserve it. This commit addresses this issue by doing
so if there are already other active queues.

This change also helps eliminate false positives, which occur when
the newly created queues do not belong to an actual large burst of
creations, but some background task (e.g., a service) happens to
trigger the creation of new queues in the middle, i.e., very close to
when the victim queues are created. These false positives may cause
total loss of control over process latencies.
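
The decision described above can be illustrated with a small,
stand-alone sketch. It is not the kernel code: struct sched_state and
the two helper names are hypothetical stand-ins, assumed here only to
model bfqd->burst_size and the value returned by
bfq_tot_busy_queues().

```c
#include <stdbool.h>

/* Hypothetical, simplified stand-in for the fields of struct bfq_data
 * that matter for this decision. */
struct sched_state {
	int burst_size;		/* queues in the burst being tracked, 0 if none */
	int busy_queues;	/* queues that currently have pending I/O */
};

/*
 * Models the condition added to bfq_init_rq(): a just-created queue is
 * treated as a burst candidate only if a burst is already being
 * tracked, or if no queue is currently active.
 */
static bool burst_candidate(const struct sched_state *s, bool just_created)
{
	return just_created && (s->burst_size > 0 || s->busy_queues == 0);
}

/*
 * Models the updated bfq_reset_burst_list(): a new burst of size 1 is
 * started only when no queue is active; otherwise no burst is tracked.
 */
static void reset_burst(struct sched_state *s)
{
	s->burst_size = (s->busy_queues == 0) ? 1 : 0;
}
```

With this model, a queue created while three other queues are busy and
no burst is in progress is not a burst candidate, so it remains
eligible for weight-raising and device idling. The same queue created
on an otherwise idle device is a candidate, as before this change.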

Signed-off-by: Paolo Valente <paolo.valente@linaro.org>
---
 block/bfq-iosched.c | 64 ++++++++++++++++++++++++++++++++++++---------
 1 file changed, 51 insertions(+), 13 deletions(-)

-- 
2.20.1

diff --git a/block/bfq-iosched.c b/block/bfq-iosched.c
index d34b80e5c47d..500b04df9efa 100644
--- a/block/bfq-iosched.c
+++ b/block/bfq-iosched.c
@@ -1075,8 +1075,18 @@ static void bfq_reset_burst_list(struct bfq_data *bfqd, struct bfq_queue *bfqq)
 
 	hlist_for_each_entry_safe(item, n, &bfqd->burst_list, burst_list_node)
 		hlist_del_init(&item->burst_list_node);
-	hlist_add_head(&bfqq->burst_list_node, &bfqd->burst_list);
-	bfqd->burst_size = 1;
+
+	/*
+	 * Start the creation of a new burst list only if there is no
+	 * active queue. See comments on the conditional invocation of
+	 * bfq_handle_burst().
+	 */
+	if (bfq_tot_busy_queues(bfqd) == 0) {
+		hlist_add_head(&bfqq->burst_list_node, &bfqd->burst_list);
+		bfqd->burst_size = 1;
+	} else
+		bfqd->burst_size = 0;
+
 	bfqd->burst_parent_entity = bfqq->entity.parent;
 }
 
@@ -1132,7 +1142,8 @@ static void bfq_add_to_burst(struct bfq_data *bfqd, struct bfq_queue *bfqq)
  * many parallel threads/processes. Examples are systemd during boot,
  * or git grep. To help these processes get their job done as soon as
  * possible, it is usually better to not grant either weight-raising
- * or device idling to their queues.
+ * or device idling to their queues, unless these queues must be
+ * protected from the I/O flowing through other active queues.
  *
  * In this comment we describe, firstly, the reasons why this fact
  * holds, and, secondly, the next function, which implements the main
@@ -1144,7 +1155,10 @@ static void bfq_add_to_burst(struct bfq_data *bfqd, struct bfq_queue *bfqq)
  * cumulatively served, the sooner the target job of these queues gets
  * completed. As a consequence, weight-raising any of these queues,
  * which also implies idling the device for it, is almost always
- * counterproductive. In most cases it just lowers throughput.
+ * counterproductive, unless there are other active queues to isolate
+ * these new queues from. If there are no other active queues, then
+ * weight-raising these new queues just lowers throughput in most
+ * cases.
  *
  * On the other hand, a burst of queue creations may be caused also by
  * the start of an application that does not consist of a lot of
@@ -1178,14 +1192,16 @@ static void bfq_add_to_burst(struct bfq_data *bfqd, struct bfq_queue *bfqq)
  * are very rare. They typically occur if some service happens to
  * start doing I/O exactly when the interactive task starts.
  *
- * Turning back to the next function, it implements all the steps
- * needed to detect the occurrence of a large burst and to properly
- * mark all the queues belonging to it (so that they can then be
- * treated in a different way). This goal is achieved by maintaining a
- * "burst list" that holds, temporarily, the queues that belong to the
- * burst in progress. The list is then used to mark these queues as
- * belonging to a large burst if the burst does become large. The main
- * steps are the following.
+ * Turning back to the next function, it is invoked only if there are
+ * no active queues (apart from active queues that would belong to the
+ * same, possible burst bfqq would belong to), and it implements all
+ * the steps needed to detect the occurrence of a large burst and to
+ * properly mark all the queues belonging to it (so that they can then
+ * be treated in a different way). This goal is achieved by
+ * maintaining a "burst list" that holds, temporarily, the queues that
+ * belong to the burst in progress. The list is then used to mark
+ * these queues as belonging to a large burst if the burst does become
+ * large. The main steps are the following.
  *
  * . when the very first queue is created, the queue is inserted into the
  *   list (as it could be the first queue in a possible burst)
@@ -5695,7 +5711,29 @@ static struct bfq_queue *bfq_init_rq(struct request *rq)
 		}
 	}
 
-	if (unlikely(bfq_bfqq_just_created(bfqq)))
+	/*
+	 * Consider bfqq as possibly belonging to a burst of newly
+	 * created queues only if:
+	 * 1) A burst is actually happening (bfqd->burst_size > 0)
+	 * or
+	 * 2) There is no other active queue. In fact, if, in
+	 *    contrast, there are active queues not belonging to the
+	 *    possible burst bfqq may belong to, then there is no gain
+	 *    in considering bfqq as belonging to a burst, and
+	 *    therefore in not weight-raising bfqq. See comments on
+	 *    bfq_handle_burst().
+	 *
+	 * This filtering also helps eliminate false positives,
+	 * occurring when bfqq does not belong to an actual large
+	 * burst, but some background task (e.g., a service) happens
+	 * to trigger the creation of new queues very close to when
+	 * bfqq and its possible companion queues are created. See
+	 * comments on bfq_handle_burst() for further details also on
+	 * this issue.
+	 */
+	if (unlikely(bfq_bfqq_just_created(bfqq) &&
+		     (bfqd->burst_size > 0 ||
+		      bfq_tot_busy_queues(bfqd) == 0)))
 		bfq_handle_burst(bfqd, bfqq);
 
 	return bfqq;