From patchwork Mon Nov 13 06:34:07 2017
X-Patchwork-Submitter: Paolo Valente
X-Patchwork-Id: 118713
From: Paolo Valente
To: Jens Axboe
Cc: linux-block@vger.kernel.org, linux-kernel@vger.kernel.org,
    ulf.hansson@linaro.org, broonie@kernel.org, linus.walleij@linaro.org,
    lee.tibbert@gmail.com, oleksandr@natalenko.name, lucmiccio@gmail.com,
    bfq-iosched@googlegroups.com, Paolo Valente
Subject: [PATCH BUGFIX/IMPROVEMENT 1/4] doc, block, bfq: update max IOPS sustainable with BFQ
Date: Mon, 13 Nov 2017 07:34:07 +0100
Message-Id: <20171113063410.3029-2-paolo.valente@linaro.org>
In-Reply-To: <20171113063410.3029-1-paolo.valente@linaro.org>
References: <20171113063410.3029-1-paolo.valente@linaro.org>

We have investigated more deeply the performance of BFQ, in terms of
the number of IOPS that can be processed by the CPU when BFQ is used
as the I/O scheduler.
In more detail, using the script [1], we have measured the number of
IOPS reached on top of a null block device configured with zero
latency, as a function of the workload (sequential read, sequential
write, random read, random write) and of the system (we considered
desktops, laptops and embedded systems).

Based on the resulting figures, with this commit we update the
current, conservative IOPS range reported in the BFQ documentation.
In particular, the documentation now reports, for each of three
different systems, the lowest number of IOPS obtained for that system
with the above test (namely, the value obtained with the workload
leading to the lowest IOPS).

[1] https://github.com/Algodev-github/IOSpeed

Reviewed-by: Lee Tibbert
Signed-off-by: Paolo Valente
Signed-off-by: Luca Miccio
---
 Documentation/block/bfq-iosched.txt | 17 +++++++++++------
 1 file changed, 11 insertions(+), 6 deletions(-)

--
2.10.0

diff --git a/Documentation/block/bfq-iosched.txt b/Documentation/block/bfq-iosched.txt
index 3d6951d..7a93615 100644
--- a/Documentation/block/bfq-iosched.txt
+++ b/Documentation/block/bfq-iosched.txt
@@ -20,12 +20,17 @@ for that device, by setting low_latency to 0. See Section 3 for
 details on how to configure BFQ for the desired tradeoff between
 latency and throughput, or on how to maximize throughput.
 
-On average CPUs, the current version of BFQ can handle devices
-performing at most ~30K IOPS; at most ~50 KIOPS on faster CPUs. As a
-reference, 30-50 KIOPS correspond to very high bandwidths with
-sequential I/O (e.g., 8-12 GB/s if I/O requests are 256 KB large), and
-to 120-200 MB/s with 4KB random I/O. BFQ is currently being tested on
-multi-queue devices too.
+BFQ has a non-null overhead, which limits the maximum IOPS that the
+CPU can process for a device scheduled with BFQ. To give an idea of
+the limits on slow or average CPUs, here are BFQ limits for three
+different CPUs, on, respectively, an average laptop, an old desktop,
+and a cheap embedded system, in case full hierarchical support is
+enabled (i.e., CONFIG_BFQ_GROUP_IOSCHED is set):
+- Intel i7-4850HQ: 250 KIOPS
+- AMD A8-3850: 170 KIOPS
+- ARM CortexTM-A53 Octa-core: 45 KIOPS
+
+BFQ works for multi-queue devices too.
 
 The table of contents follow. Impatients can just jump to Section 3.
From patchwork Mon Nov 13 06:34:08 2017
X-Patchwork-Submitter: Paolo Valente
X-Patchwork-Id: 118716
From: Paolo Valente
To: Jens Axboe
Cc: linux-block@vger.kernel.org, linux-kernel@vger.kernel.org,
    ulf.hansson@linaro.org, broonie@kernel.org, linus.walleij@linaro.org,
    lee.tibbert@gmail.com, oleksandr@natalenko.name, lucmiccio@gmail.com,
    bfq-iosched@googlegroups.com, Paolo Valente
Subject: [PATCH BUGFIX/IMPROVEMENT 2/4] block, bfq: add missing invocations of bfqg_stats_update_io_add/remove
Date: Mon, 13 Nov 2017 07:34:08 +0100
Message-Id: <20171113063410.3029-3-paolo.valente@linaro.org>
In-Reply-To: <20171113063410.3029-1-paolo.valente@linaro.org>
References: <20171113063410.3029-1-paolo.valente@linaro.org>

From: Luca Miccio

bfqg_stats_update_io_add and bfqg_stats_update_io_remove are to be
invoked, respectively, when an I/O request enters and when an I/O
request exits the scheduler.
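To illustrate the scheme just described, here is a hedged,
self-contained sketch in plain userspace C; all names below are
invented stand-ins for illustration only, not bfq symbols. The point
is simply that every path that inserts a request must be paired with a
path that removes it, so that the "queued" statistic stays balanced:

#include <assert.h>
#include <stdio.h>

/* Toy stand-in for the per-group "queued" statistic. */
struct toy_group_stats {
	int queued;
};

static void stats_update_io_add(struct toy_group_stats *s)    { s->queued++; }
static void stats_update_io_remove(struct toy_group_stats *s) { s->queued--; }

/* Every insertion path must account for the request... */
static void insert_request(struct toy_group_stats *s)
{
	stats_update_io_add(s);
}

/* ...and every extraction path, including an internal dispatch list,
 * must account for its removal, otherwise the statistic drifts. */
static void dispatch_request(struct toy_group_stats *s)
{
	stats_update_io_remove(s);
}

int main(void)
{
	struct toy_group_stats s = { 0 };

	insert_request(&s);
	dispatch_request(&s);
	assert(s.queued == 0); /* balanced add/remove */
	printf("queued = %d\n", s.queued);
	return 0;
}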
Unfortunately, bfq does not fully comply with this scheme, because it
does not invoke these functions for requests that are inserted into or
extracted from its priority dispatch list. This commit fixes this
mistake.

Tested-by: Lee Tibbert
Tested-by: Oleksandr Natalenko
Signed-off-by: Paolo Valente
Signed-off-by: Luca Miccio
---
 block/bfq-iosched.c | 21 ++++++++++++++++++---
 1 file changed, 18 insertions(+), 3 deletions(-)

--
2.10.0

diff --git a/block/bfq-iosched.c b/block/bfq-iosched.c
index 889a854..91703eb 100644
--- a/block/bfq-iosched.c
+++ b/block/bfq-iosched.c
@@ -1359,7 +1359,6 @@ static void bfq_bfqq_handle_idle_busy_switch(struct bfq_data *bfqd,
 		bfqq->ttime.last_end_request +
 		bfqd->bfq_slice_idle * 3;
 
-	bfqg_stats_update_io_add(bfqq_group(RQ_BFQQ(rq)), bfqq, rq->cmd_flags);
 
 	/*
 	 * bfqq deserves to be weight-raised if:
@@ -1633,7 +1632,6 @@ static void bfq_remove_request(struct request_queue *q,
 	if (rq->cmd_flags & REQ_META)
 		bfqq->meta_pending--;
 
-	bfqg_stats_update_io_remove(bfqq_group(bfqq), rq->cmd_flags);
 }
 
 static bool bfq_bio_merge(struct blk_mq_hw_ctx *hctx, struct bio *bio)
@@ -1746,6 +1744,7 @@ static void bfq_requests_merged(struct request_queue *q, struct request *rq,
 		bfqq->next_rq = rq;
 
 	bfq_remove_request(q, next);
+	bfqg_stats_update_io_remove(bfqq_group(bfqq), next->cmd_flags);
 
 	spin_unlock_irq(&bfqq->bfqd->lock);
 end:
@@ -3700,6 +3699,9 @@ static struct request *bfq_dispatch_request(struct blk_mq_hw_ctx *hctx)
 	spin_lock_irq(&bfqd->lock);
 
 	rq = __bfq_dispatch_request(hctx);
+	if (rq && RQ_BFQQ(rq))
+		bfqg_stats_update_io_remove(bfqq_group(RQ_BFQQ(rq)),
+					    rq->cmd_flags);
 	spin_unlock_irq(&bfqd->lock);
 
 	return rq;
@@ -4224,6 +4226,7 @@ static void bfq_insert_request(struct blk_mq_hw_ctx *hctx, struct request *rq,
 {
 	struct request_queue *q = hctx->queue;
 	struct bfq_data *bfqd = q->elevator->elevator_data;
+	struct bfq_queue *bfqq = RQ_BFQQ(rq);
 
 	spin_lock_irq(&bfqd->lock);
 	if (blk_mq_sched_try_insert_merge(q, rq)) {
@@ -4243,6 +4246,12 @@ static void bfq_insert_request(struct blk_mq_hw_ctx *hctx, struct request *rq,
 			list_add_tail(&rq->queuelist, &bfqd->dispatch);
 	} else {
 		__bfq_insert_request(bfqd, rq);
+		/*
+		 * Update bfqq, because, if a queue merge has occurred
+		 * in __bfq_insert_request, then rq has been
+		 * redirected into a new queue.
+		 */
+		bfqq = RQ_BFQQ(rq);
 
 		if (rq_mergeable(rq)) {
 			elv_rqhash_add(q, rq);
@@ -4251,6 +4260,9 @@ static void bfq_insert_request(struct blk_mq_hw_ctx *hctx, struct request *rq,
 		}
 	}
 
+	if (bfqq)
+		bfqg_stats_update_io_add(bfqq_group(bfqq), bfqq, rq->cmd_flags);
+
 	spin_unlock_irq(&bfqd->lock);
 }
 
@@ -4428,8 +4440,11 @@ static void bfq_finish_request(struct request *rq)
 		 * lock is held.
 		 */
 
-		if (!RB_EMPTY_NODE(&rq->rb_node))
+		if (!RB_EMPTY_NODE(&rq->rb_node)) {
 			bfq_remove_request(rq->q, rq);
+			bfqg_stats_update_io_remove(bfqq_group(bfqq),
+						    rq->cmd_flags);
+		}
 		bfq_put_rq_priv_body(bfqq);
 	}

From patchwork Mon Nov 13 06:34:09 2017
X-Patchwork-Submitter: Paolo Valente
X-Patchwork-Id: 118714
From: Paolo Valente
To: Jens Axboe
Cc: linux-block@vger.kernel.org, linux-kernel@vger.kernel.org,
    ulf.hansson@linaro.org, broonie@kernel.org, linus.walleij@linaro.org,
    lee.tibbert@gmail.com, oleksandr@natalenko.name, lucmiccio@gmail.com,
    bfq-iosched@googlegroups.com, Paolo Valente
Subject: [PATCH BUGFIX/IMPROVEMENT 3/4] block, bfq: update blkio stats outside the scheduler lock
Date: Mon, 13 Nov 2017 07:34:09 +0100
Message-Id: <20171113063410.3029-4-paolo.valente@linaro.org>
In-Reply-To: <20171113063410.3029-1-paolo.valente@linaro.org>
References: <20171113063410.3029-1-paolo.valente@linaro.org>

bfq invokes various blkg_*stats_* functions to update the statistics
contained in the special files blkio.bfq.* in the blkio controller
groups, i.e., the I/O accounting related to the proportional-share
policy provided by bfq.
The execution of these functions takes a considerable percentage,
about 40%, of the total per-request execution time of bfq (i.e., of
the sum of the execution time of all the bfq functions that have to be
executed to process an I/O request from its creation to its
destruction). This noticeably reduces the request-processing rate
sustainable by bfq, even on a multicore CPU. In fact, the bfq
functions that invoke blkg_*stats_* functions cannot be executed in
parallel with the rest of the code of bfq, because both are executed
under the same per-device scheduler lock.

To reduce this slowdown, this commit moves, wherever possible, the
invocation of these functions (more precisely, of the bfq functions
that invoke blkg_*stats_* functions) outside the critical sections
protected by the scheduler lock.

With this change, and with all blkio.bfq.* statistics enabled, the
throughput grows, e.g., from 250 to 310 KIOPS (+25%) on an Intel
i7-4850HQ, in case of 8 threads doing random I/O in parallel on
null_blk, with the latter configured with 0 latency. We obtained the
same or higher throughput boosts, up to +30%, with other processors
(some figures are reported in the documentation). For our tests, we
used the script [1], with which our results can be easily reproduced.

NOTE. This commit still protects the invocation of blkg_*stats_*
functions with the request_queue lock, because the group these
functions are invoked on may otherwise disappear before or while these
functions are executed. Fortunately, tests without even this lock
show, by difference, that the serialization caused by this lock has
little impact (at most a ~5% reduction in throughput).

[1] https://github.com/Algodev-github/IOSpeed

Tested-by: Lee Tibbert
Tested-by: Oleksandr Natalenko
Signed-off-by: Paolo Valente
Signed-off-by: Luca Miccio
---
 Documentation/block/bfq-iosched.txt |   6 +-
 block/bfq-iosched.c                 | 110 ++++++++++++++++++++++++++++++++----
 block/bfq-wf2q.c                    |   1 -
 3 files changed, 102 insertions(+), 15 deletions(-)

--
2.10.0

diff --git a/Documentation/block/bfq-iosched.txt b/Documentation/block/bfq-iosched.txt
index 7a93615..7fad6c0 100644
--- a/Documentation/block/bfq-iosched.txt
+++ b/Documentation/block/bfq-iosched.txt
@@ -26,9 +26,9 @@ the limits on slow or average CPUs, here are BFQ limits for three
 different CPUs, on, respectively, an average laptop, an old desktop,
 and a cheap embedded system, in case full hierarchical support is
 enabled (i.e., CONFIG_BFQ_GROUP_IOSCHED is set):
-- Intel i7-4850HQ: 250 KIOPS
-- AMD A8-3850: 170 KIOPS
-- ARM CortexTM-A53 Octa-core: 45 KIOPS
+- Intel i7-4850HQ: 310 KIOPS
+- AMD A8-3850: 200 KIOPS
+- ARM CortexTM-A53 Octa-core: 56 KIOPS
 
 BFQ works for multi-queue devices too.
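To make the locking change in the code below easier to follow, here is
a hedged, self-contained sketch of the general pattern, in plain
userspace C with pthreads; the locks and names are invented stand-ins,
not the actual bfq or blk-mq symbols. The scheduling decision is taken
under one lock, the outcome is remembered, and the (costly) statistics
update happens afterwards under a different lock:

#include <pthread.h>
#include <stdbool.h>
#include <stdio.h>

/* Invented stand-ins: sched_lock plays the role of the per-device
 * scheduler lock, queue_lock the role of the request_queue lock. */
static pthread_mutex_t sched_lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_mutex_t queue_lock = PTHREAD_MUTEX_INITIALIZER;

static long idle_time_stat; /* toy stand-in for a blkg statistic */

static int pick_request(bool *idle_timer_disabled)
{
	/* Placeholder scheduling decision. */
	*idle_timer_disabled = true;
	return 42; /* pretend this is the dispatched request */
}

static int dispatch(void)
{
	bool idle_timer_disabled;
	int rq;

	/* Step 1: take only the scheduling decision under the scheduler
	 * lock, and remember what happened. */
	pthread_mutex_lock(&sched_lock);
	rq = pick_request(&idle_timer_disabled);
	pthread_mutex_unlock(&sched_lock);

	/* Step 2: update the statistics afterwards, under a different
	 * lock, so the scheduler lock is held for less time. */
	pthread_mutex_lock(&queue_lock);
	if (idle_timer_disabled)
		idle_time_stat++;
	pthread_mutex_unlock(&queue_lock);

	return rq;
}

int main(void)
{
	int rq = dispatch();

	printf("dispatched %d, idle_time_stat=%ld\n", rq, idle_time_stat);
	return 0;
}

The shorter critical section under the first lock is what allows the
rest of the request-processing code to run in parallel.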
diff --git a/block/bfq-iosched.c b/block/bfq-iosched.c index 91703eb..69e05f8 100644 --- a/block/bfq-iosched.c +++ b/block/bfq-iosched.c @@ -2228,7 +2228,6 @@ static void __bfq_set_in_service_queue(struct bfq_data *bfqd, struct bfq_queue *bfqq) { if (bfqq) { - bfqg_stats_update_avg_queue_size(bfqq_group(bfqq)); bfq_clear_bfqq_fifo_expire(bfqq); bfqd->budgets_assigned = (bfqd->budgets_assigned * 7 + 256) / 8; @@ -3469,7 +3468,6 @@ static struct bfq_queue *bfq_select_queue(struct bfq_data *bfqd) */ bfq_clear_bfqq_wait_request(bfqq); hrtimer_try_to_cancel(&bfqd->idle_slice_timer); - bfqg_stats_update_idle_time(bfqq_group(bfqq)); } goto keep_queue; } @@ -3695,15 +3693,67 @@ static struct request *bfq_dispatch_request(struct blk_mq_hw_ctx *hctx) { struct bfq_data *bfqd = hctx->queue->elevator->elevator_data; struct request *rq; +#ifdef CONFIG_BFQ_GROUP_IOSCHED + struct bfq_queue *in_serv_queue, *bfqq; + bool waiting_rq, idle_timer_disabled; +#endif spin_lock_irq(&bfqd->lock); +#ifdef CONFIG_BFQ_GROUP_IOSCHED + in_serv_queue = bfqd->in_service_queue; + waiting_rq = in_serv_queue && bfq_bfqq_wait_request(in_serv_queue); + + rq = __bfq_dispatch_request(hctx); + + idle_timer_disabled = + waiting_rq && !bfq_bfqq_wait_request(in_serv_queue); + +#else rq = __bfq_dispatch_request(hctx); - if (rq && RQ_BFQQ(rq)) - bfqg_stats_update_io_remove(bfqq_group(RQ_BFQQ(rq)), - rq->cmd_flags); +#endif spin_unlock_irq(&bfqd->lock); +#ifdef CONFIG_BFQ_GROUP_IOSCHED + bfqq = rq ? RQ_BFQQ(rq) : NULL; + if (!idle_timer_disabled && !bfqq) + return rq; + + /* + * rq and bfqq are guaranteed to exist until this function + * ends, for the following reasons. First, rq can be + * dispatched to the device, and then can be completed and + * freed, only after this function ends. Second, rq cannot be + * merged (and thus freed because of a merge) any longer, + * because it has already started. Thus rq cannot be freed + * before this function ends, and, since rq has a reference to + * bfqq, the same guarantee holds for bfqq too. + * + * In addition, the following queue lock guarantees that + * bfqq_group(bfqq) exists as well. + */ + spin_lock_irq(hctx->queue->queue_lock); + if (idle_timer_disabled) + /* + * Since the idle timer has been disabled, + * in_serv_queue contained some request when + * __bfq_dispatch_request was invoked above, which + * implies that rq was picked exactly from + * in_serv_queue. Thus in_serv_queue == bfqq, and is + * therefore guaranteed to exist because of the above + * arguments. 
+ */ + bfqg_stats_update_idle_time(bfqq_group(in_serv_queue)); + if (bfqq) { + struct bfq_group *bfqg = bfqq_group(bfqq); + + bfqg_stats_update_avg_queue_size(bfqg); + bfqg_stats_set_start_empty_time(bfqg); + bfqg_stats_update_io_remove(bfqg, rq->cmd_flags); + } + spin_unlock_irq(hctx->queue->queue_lock); +#endif + return rq; } @@ -4161,7 +4211,6 @@ static void bfq_rq_enqueued(struct bfq_data *bfqd, struct bfq_queue *bfqq, */ bfq_clear_bfqq_wait_request(bfqq); hrtimer_try_to_cancel(&bfqd->idle_slice_timer); - bfqg_stats_update_idle_time(bfqq_group(bfqq)); /* * The queue is not empty, because a new request just @@ -4176,10 +4225,12 @@ static void bfq_rq_enqueued(struct bfq_data *bfqd, struct bfq_queue *bfqq, } } -static void __bfq_insert_request(struct bfq_data *bfqd, struct request *rq) +/* returns true if it causes the idle timer to be disabled */ +static bool __bfq_insert_request(struct bfq_data *bfqd, struct request *rq) { struct bfq_queue *bfqq = RQ_BFQQ(rq), *new_bfqq = bfq_setup_cooperator(bfqd, bfqq, rq, true); + bool waiting, idle_timer_disabled = false; if (new_bfqq) { if (bic_to_bfqq(RQ_BIC(rq), 1) != bfqq) @@ -4213,12 +4264,16 @@ static void __bfq_insert_request(struct bfq_data *bfqd, struct request *rq) bfqq = new_bfqq; } + waiting = bfqq && bfq_bfqq_wait_request(bfqq); bfq_add_request(rq); + idle_timer_disabled = waiting && !bfq_bfqq_wait_request(bfqq); rq->fifo_time = ktime_get_ns() + bfqd->bfq_fifo_expire[rq_is_sync(rq)]; list_add_tail(&rq->queuelist, &bfqq->fifo); bfq_rq_enqueued(bfqd, bfqq, rq); + + return idle_timer_disabled; } static void bfq_insert_request(struct blk_mq_hw_ctx *hctx, struct request *rq, @@ -4226,7 +4281,11 @@ static void bfq_insert_request(struct blk_mq_hw_ctx *hctx, struct request *rq, { struct request_queue *q = hctx->queue; struct bfq_data *bfqd = q->elevator->elevator_data; +#ifdef CONFIG_BFQ_GROUP_IOSCHED struct bfq_queue *bfqq = RQ_BFQQ(rq); + bool idle_timer_disabled = false; + unsigned int cmd_flags; +#endif spin_lock_irq(&bfqd->lock); if (blk_mq_sched_try_insert_merge(q, rq)) { @@ -4245,13 +4304,17 @@ static void bfq_insert_request(struct blk_mq_hw_ctx *hctx, struct request *rq, else list_add_tail(&rq->queuelist, &bfqd->dispatch); } else { - __bfq_insert_request(bfqd, rq); +#ifdef CONFIG_BFQ_GROUP_IOSCHED + idle_timer_disabled = __bfq_insert_request(bfqd, rq); /* * Update bfqq, because, if a queue merge has occurred * in __bfq_insert_request, then rq has been * redirected into a new queue. */ bfqq = RQ_BFQQ(rq); +#else + __bfq_insert_request(bfqd, rq); +#endif if (rq_mergeable(rq)) { elv_rqhash_add(q, rq); @@ -4260,10 +4323,35 @@ static void bfq_insert_request(struct blk_mq_hw_ctx *hctx, struct request *rq, } } - if (bfqq) - bfqg_stats_update_io_add(bfqq_group(bfqq), bfqq, rq->cmd_flags); - +#ifdef CONFIG_BFQ_GROUP_IOSCHED + /* + * Cache cmd_flags before releasing scheduler lock, because rq + * may disappear afterwards (for example, because of a request + * merge). + */ + cmd_flags = rq->cmd_flags; +#endif spin_unlock_irq(&bfqd->lock); + +#ifdef CONFIG_BFQ_GROUP_IOSCHED + if (!bfqq) + return; + /* + * bfqq still exists, because it can disappear only after + * either it is merged with another queue, or the process it + * is associated with exits. But both actions must be taken by + * the same process currently executing this flow of + * instruction. + * + * In addition, the following queue lock guarantees that + * bfqq_group(bfqq) exists as well. 
+	 */
+	spin_lock_irq(q->queue_lock);
+	bfqg_stats_update_io_add(bfqq_group(bfqq), bfqq, cmd_flags);
+	if (idle_timer_disabled)
+		bfqg_stats_update_idle_time(bfqq_group(bfqq));
+	spin_unlock_irq(q->queue_lock);
+#endif
 }
 
 static void bfq_insert_requests(struct blk_mq_hw_ctx *hctx,
diff --git a/block/bfq-wf2q.c b/block/bfq-wf2q.c
index 414ba68..e495d3f 100644
--- a/block/bfq-wf2q.c
+++ b/block/bfq-wf2q.c
@@ -843,7 +843,6 @@ void bfq_bfqq_served(struct bfq_queue *bfqq, int served)
 		st->vtime += bfq_delta(served, st->wsum);
 		bfq_forget_idle(st);
 	}
-	bfqg_stats_set_start_empty_time(bfqq_group(bfqq));
 
 	bfq_log_bfqq(bfqq->bfqd, bfqq, "bfqq_served %d secs", served);
 }

From patchwork Mon Nov 13 06:34:10 2017
X-Patchwork-Submitter: Paolo Valente
X-Patchwork-Id: 118715
From: Paolo Valente
To: Jens Axboe
Cc: linux-block@vger.kernel.org, linux-kernel@vger.kernel.org,
    ulf.hansson@linaro.org, broonie@kernel.org, linus.walleij@linaro.org,
    lee.tibbert@gmail.com, oleksandr@natalenko.name, lucmiccio@gmail.com,
    bfq-iosched@googlegroups.com, Paolo Valente
Subject: [PATCH BUGFIX/IMPROVEMENT 4/4] block, bfq: move debug blkio stats behind CONFIG_DEBUG_BLK_CGROUP
Date: Mon, 13 Nov 2017 07:34:10 +0100
Message-Id: <20171113063410.3029-5-paolo.valente@linaro.org>
In-Reply-To: <20171113063410.3029-1-paolo.valente@linaro.org>
References: <20171113063410.3029-1-paolo.valente@linaro.org>

From: Luca Miccio

BFQ currently creates, and updates, its own instance of the whole set
of blkio statistics that cfq creates. Yet, from the comments of Tejun
Heo in [1], it turned out that most of these statistics are
meant/useful only for debugging.
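The approach taken by this patch, detailed below, can be pictured with
the following hedged, stand-alone sketch in ordinary C, where the
invented TOY_DEBUG_STATS macro plays the role of
CONFIG_DEBUG_BLK_CGROUP: when the option is off, the update helpers
compile down to empty stubs, so the hot path pays essentially nothing
for the debug statistics.

#include <stdio.h>

/* Build with -DTOY_DEBUG_STATS to enable the debug counter; without
 * it, the helpers below compile down to empty stubs. */
#ifdef TOY_DEBUG_STATS
static long debug_queued;
static void stats_update_io_add(void)    { debug_queued++; }
static void stats_update_io_remove(void) { debug_queued--; }
static long stats_read(void)             { return debug_queued; }
#else
static void stats_update_io_add(void)    { }
static void stats_update_io_remove(void) { }
static long stats_read(void)             { return 0; }
#endif

int main(void)
{
	/* The hot path calls the helpers unconditionally; whether they
	 * do any work is decided at compile time. */
	stats_update_io_add();
	stats_update_io_remove();
	printf("debug stat = %ld\n", stats_read());
	return 0;
}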
This commit makes BFQ create these debugging statistics only if the
option CONFIG_DEBUG_BLK_CGROUP is set. By doing so, this commit also
enables BFQ to enjoy a high performance boost. The reason is that, if
CONFIG_DEBUG_BLK_CGROUP is not set, then BFQ has to update far fewer
statistics, and, in particular, not the heaviest ones to update. To
give an idea of the benefits, if CONFIG_DEBUG_BLK_CGROUP is not set,
then, on an Intel i7-4850HQ, and with 8 threads doing random I/O in
parallel on null_blk (configured with 0 latency), the throughput of
BFQ grows from 310 to 400 KIOPS (+30%). We have measured similar or
even much higher boosts with other CPUs: e.g., +45% with an ARM
CortexTM-A53 Octa-core. Our results have been obtained and can be
reproduced very easily with the script in [1].

[1] https://www.spinics.net/lists/linux-block/msg18943.html

Suggested-by: Tejun Heo
Suggested-by: Ulf Hansson
Tested-by: Lee Tibbert
Tested-by: Oleksandr Natalenko
Signed-off-by: Luca Miccio
Signed-off-by: Paolo Valente
---
 Documentation/block/bfq-iosched.txt |  38 +++++++--
 block/bfq-cgroup.c                  | 148 ++++++++++++++++++++----------------
 block/bfq-iosched.c                 |  14 ++--
 block/bfq-iosched.h                 |   4 +-
 4 files changed, 125 insertions(+), 79 deletions(-)

--
2.10.0

diff --git a/Documentation/block/bfq-iosched.txt b/Documentation/block/bfq-iosched.txt
index 7fad6c0..8d8d8f0 100644
--- a/Documentation/block/bfq-iosched.txt
+++ b/Documentation/block/bfq-iosched.txt
@@ -20,12 +20,22 @@ for that device, by setting low_latency to 0. See Section 3 for
 details on how to configure BFQ for the desired tradeoff between
 latency and throughput, or on how to maximize throughput.
 
-BFQ has a non-null overhead, which limits the maximum IOPS that the
-CPU can process for a device scheduled with BFQ. To give an idea of
-the limits on slow or average CPUs, here are BFQ limits for three
-different CPUs, on, respectively, an average laptop, an old desktop,
-and a cheap embedded system, in case full hierarchical support is
-enabled (i.e., CONFIG_BFQ_GROUP_IOSCHED is set):
+BFQ has a non-null overhead, which limits the maximum IOPS that a CPU
+can process for a device scheduled with BFQ. To give an idea of the
+limits on slow or average CPUs, here are, first, the limits of BFQ for
+three different CPUs, on, respectively, an average laptop, an old
+desktop, and a cheap embedded system, in case full hierarchical
+support is enabled (i.e., CONFIG_BFQ_GROUP_IOSCHED is set), but
+CONFIG_DEBUG_BLK_CGROUP is not set (Section 4-2):
+- Intel i7-4850HQ: 400 KIOPS
+- AMD A8-3850: 250 KIOPS
+- ARM CortexTM-A53 Octa-core: 80 KIOPS
+
+If CONFIG_DEBUG_BLK_CGROUP is set (and of course full hierarchical
+support is enabled), then the sustainable throughput with BFQ
+decreases, because all blkio.bfq* statistics are created and updated
+(Section 4-2). For BFQ, this leads to the following maximum
+sustainable throughputs, on the same systems as above:
 - Intel i7-4850HQ: 310 KIOPS
 - AMD A8-3850: 200 KIOPS
 - ARM CortexTM-A53 Octa-core: 56 KIOPS
@@ -505,6 +515,22 @@ BFQ-specific files is "blkio.bfq." or "io.bfq." For example, the group
 parameter to set the weight of a group with BFQ is blkio.bfq.weight
 or io.bfq.weight.
 
+As for cgroups-v1 (blkio controller), the exact set of stat files
+created, and kept up-to-date by bfq, depends on whether
+CONFIG_DEBUG_BLK_CGROUP is set. If it is set, then bfq creates all
+the stat files documented in
+Documentation/cgroup-v1/blkio-controller.txt.
If, instead, +CONFIG_DEBUG_BLK_CGROUP is not set, then bfq creates only the files +blkio.bfq.io_service_bytes +blkio.bfq.io_service_bytes_recursive +blkio.bfq.io_serviced +blkio.bfq.io_serviced_recursive + +The value of CONFIG_DEBUG_BLK_CGROUP greatly influences the maximum +throughput sustainable with bfq, because updating the blkio.bfq.* +stats is rather costly, especially for some of the stats enabled by +CONFIG_DEBUG_BLK_CGROUP. + Parameters to set ----------------- diff --git a/block/bfq-cgroup.c b/block/bfq-cgroup.c index ceefb9a..da1525e 100644 --- a/block/bfq-cgroup.c +++ b/block/bfq-cgroup.c @@ -24,7 +24,7 @@ #include "bfq-iosched.h" -#ifdef CONFIG_BFQ_GROUP_IOSCHED +#if defined(CONFIG_BFQ_GROUP_IOSCHED) && defined(CONFIG_DEBUG_BLK_CGROUP) /* bfqg stats flags */ enum bfqg_stats_flags { @@ -152,6 +152,57 @@ void bfqg_stats_update_avg_queue_size(struct bfq_group *bfqg) bfqg_stats_update_group_wait_time(stats); } +void bfqg_stats_update_io_add(struct bfq_group *bfqg, struct bfq_queue *bfqq, + unsigned int op) +{ + blkg_rwstat_add(&bfqg->stats.queued, op, 1); + bfqg_stats_end_empty_time(&bfqg->stats); + if (!(bfqq == ((struct bfq_data *)bfqg->bfqd)->in_service_queue)) + bfqg_stats_set_start_group_wait_time(bfqg, bfqq_group(bfqq)); +} + +void bfqg_stats_update_io_remove(struct bfq_group *bfqg, unsigned int op) +{ + blkg_rwstat_add(&bfqg->stats.queued, op, -1); +} + +void bfqg_stats_update_io_merged(struct bfq_group *bfqg, unsigned int op) +{ + blkg_rwstat_add(&bfqg->stats.merged, op, 1); +} + +void bfqg_stats_update_completion(struct bfq_group *bfqg, uint64_t start_time, + uint64_t io_start_time, unsigned int op) +{ + struct bfqg_stats *stats = &bfqg->stats; + unsigned long long now = sched_clock(); + + if (time_after64(now, io_start_time)) + blkg_rwstat_add(&stats->service_time, op, + now - io_start_time); + if (time_after64(io_start_time, start_time)) + blkg_rwstat_add(&stats->wait_time, op, + io_start_time - start_time); +} + +#else /* CONFIG_BFQ_GROUP_IOSCHED && CONFIG_DEBUG_BLK_CGROUP */ + +void bfqg_stats_update_io_add(struct bfq_group *bfqg, struct bfq_queue *bfqq, + unsigned int op) { } +void bfqg_stats_update_io_remove(struct bfq_group *bfqg, unsigned int op) { } +void bfqg_stats_update_io_merged(struct bfq_group *bfqg, unsigned int op) { } +void bfqg_stats_update_completion(struct bfq_group *bfqg, uint64_t start_time, + uint64_t io_start_time, unsigned int op) { } +void bfqg_stats_update_dequeue(struct bfq_group *bfqg) { } +void bfqg_stats_set_start_empty_time(struct bfq_group *bfqg) { } +void bfqg_stats_update_idle_time(struct bfq_group *bfqg) { } +void bfqg_stats_set_start_idle_time(struct bfq_group *bfqg) { } +void bfqg_stats_update_avg_queue_size(struct bfq_group *bfqg) { } + +#endif /* CONFIG_BFQ_GROUP_IOSCHED && CONFIG_DEBUG_BLK_CGROUP */ + +#ifdef CONFIG_BFQ_GROUP_IOSCHED + /* * blk-cgroup policy-related handlers * The following functions help in converting between blk-cgroup @@ -229,42 +280,10 @@ void bfqg_and_blkg_put(struct bfq_group *bfqg) blkg_put(bfqg_to_blkg(bfqg)); } -void bfqg_stats_update_io_add(struct bfq_group *bfqg, struct bfq_queue *bfqq, - unsigned int op) -{ - blkg_rwstat_add(&bfqg->stats.queued, op, 1); - bfqg_stats_end_empty_time(&bfqg->stats); - if (!(bfqq == ((struct bfq_data *)bfqg->bfqd)->in_service_queue)) - bfqg_stats_set_start_group_wait_time(bfqg, bfqq_group(bfqq)); -} - -void bfqg_stats_update_io_remove(struct bfq_group *bfqg, unsigned int op) -{ - blkg_rwstat_add(&bfqg->stats.queued, op, -1); -} - -void bfqg_stats_update_io_merged(struct 
bfq_group *bfqg, unsigned int op) -{ - blkg_rwstat_add(&bfqg->stats.merged, op, 1); -} - -void bfqg_stats_update_completion(struct bfq_group *bfqg, uint64_t start_time, - uint64_t io_start_time, unsigned int op) -{ - struct bfqg_stats *stats = &bfqg->stats; - unsigned long long now = sched_clock(); - - if (time_after64(now, io_start_time)) - blkg_rwstat_add(&stats->service_time, op, - now - io_start_time); - if (time_after64(io_start_time, start_time)) - blkg_rwstat_add(&stats->wait_time, op, - io_start_time - start_time); -} - /* @stats = 0 */ static void bfqg_stats_reset(struct bfqg_stats *stats) { +#ifdef CONFIG_DEBUG_BLK_CGROUP /* queued stats shouldn't be cleared */ blkg_rwstat_reset(&stats->merged); blkg_rwstat_reset(&stats->service_time); @@ -276,6 +295,7 @@ static void bfqg_stats_reset(struct bfqg_stats *stats) blkg_stat_reset(&stats->group_wait_time); blkg_stat_reset(&stats->idle_time); blkg_stat_reset(&stats->empty_time); +#endif } /* @to += @from */ @@ -284,6 +304,7 @@ static void bfqg_stats_add_aux(struct bfqg_stats *to, struct bfqg_stats *from) if (!to || !from) return; +#ifdef CONFIG_DEBUG_BLK_CGROUP /* queued stats shouldn't be cleared */ blkg_rwstat_add_aux(&to->merged, &from->merged); blkg_rwstat_add_aux(&to->service_time, &from->service_time); @@ -296,6 +317,7 @@ static void bfqg_stats_add_aux(struct bfqg_stats *to, struct bfqg_stats *from) blkg_stat_add_aux(&to->group_wait_time, &from->group_wait_time); blkg_stat_add_aux(&to->idle_time, &from->idle_time); blkg_stat_add_aux(&to->empty_time, &from->empty_time); +#endif } /* @@ -342,6 +364,7 @@ void bfq_init_entity(struct bfq_entity *entity, struct bfq_group *bfqg) static void bfqg_stats_exit(struct bfqg_stats *stats) { +#ifdef CONFIG_DEBUG_BLK_CGROUP blkg_rwstat_exit(&stats->merged); blkg_rwstat_exit(&stats->service_time); blkg_rwstat_exit(&stats->wait_time); @@ -353,10 +376,12 @@ static void bfqg_stats_exit(struct bfqg_stats *stats) blkg_stat_exit(&stats->group_wait_time); blkg_stat_exit(&stats->idle_time); blkg_stat_exit(&stats->empty_time); +#endif } static int bfqg_stats_init(struct bfqg_stats *stats, gfp_t gfp) { +#ifdef CONFIG_DEBUG_BLK_CGROUP if (blkg_rwstat_init(&stats->merged, gfp) || blkg_rwstat_init(&stats->service_time, gfp) || blkg_rwstat_init(&stats->wait_time, gfp) || @@ -371,6 +396,7 @@ static int bfqg_stats_init(struct bfqg_stats *stats, gfp_t gfp) bfqg_stats_exit(stats); return -ENOMEM; } +#endif return 0; } @@ -887,6 +913,7 @@ static ssize_t bfq_io_set_weight(struct kernfs_open_file *of, return bfq_io_set_weight_legacy(of_css(of), NULL, weight); } +#ifdef CONFIG_DEBUG_BLK_CGROUP static int bfqg_print_stat(struct seq_file *sf, void *v) { blkcg_print_blkgs(sf, css_to_blkcg(seq_css(sf)), blkg_prfill_stat, @@ -991,6 +1018,7 @@ static int bfqg_print_avg_queue_size(struct seq_file *sf, void *v) 0, false); return 0; } +#endif /* CONFIG_DEBUG_BLK_CGROUP */ struct bfq_group *bfq_create_group_hierarchy(struct bfq_data *bfqd, int node) { @@ -1029,15 +1057,6 @@ struct cftype bfq_blkcg_legacy_files[] = { /* statistics, covers only the tasks in the bfqg */ { - .name = "bfq.time", - .private = offsetof(struct bfq_group, stats.time), - .seq_show = bfqg_print_stat, - }, - { - .name = "bfq.sectors", - .seq_show = bfqg_print_stat_sectors, - }, - { .name = "bfq.io_service_bytes", .private = (unsigned long)&blkcg_policy_bfq, .seq_show = blkg_print_stat_bytes, @@ -1047,6 +1066,16 @@ struct cftype bfq_blkcg_legacy_files[] = { .private = (unsigned long)&blkcg_policy_bfq, .seq_show = blkg_print_stat_ios, }, +#ifdef 
CONFIG_DEBUG_BLK_CGROUP + { + .name = "bfq.time", + .private = offsetof(struct bfq_group, stats.time), + .seq_show = bfqg_print_stat, + }, + { + .name = "bfq.sectors", + .seq_show = bfqg_print_stat_sectors, + }, { .name = "bfq.io_service_time", .private = offsetof(struct bfq_group, stats.service_time), @@ -1067,18 +1096,10 @@ struct cftype bfq_blkcg_legacy_files[] = { .private = offsetof(struct bfq_group, stats.queued), .seq_show = bfqg_print_rwstat, }, +#endif /* CONFIG_DEBUG_BLK_CGROUP */ /* the same statictics which cover the bfqg and its descendants */ { - .name = "bfq.time_recursive", - .private = offsetof(struct bfq_group, stats.time), - .seq_show = bfqg_print_stat_recursive, - }, - { - .name = "bfq.sectors_recursive", - .seq_show = bfqg_print_stat_sectors_recursive, - }, - { .name = "bfq.io_service_bytes_recursive", .private = (unsigned long)&blkcg_policy_bfq, .seq_show = blkg_print_stat_bytes_recursive, @@ -1088,6 +1109,16 @@ struct cftype bfq_blkcg_legacy_files[] = { .private = (unsigned long)&blkcg_policy_bfq, .seq_show = blkg_print_stat_ios_recursive, }, +#ifdef CONFIG_DEBUG_BLK_CGROUP + { + .name = "bfq.time_recursive", + .private = offsetof(struct bfq_group, stats.time), + .seq_show = bfqg_print_stat_recursive, + }, + { + .name = "bfq.sectors_recursive", + .seq_show = bfqg_print_stat_sectors_recursive, + }, { .name = "bfq.io_service_time_recursive", .private = offsetof(struct bfq_group, stats.service_time), @@ -1132,6 +1163,7 @@ struct cftype bfq_blkcg_legacy_files[] = { .private = offsetof(struct bfq_group, stats.dequeue), .seq_show = bfqg_print_stat, }, +#endif /* CONFIG_DEBUG_BLK_CGROUP */ { } /* terminate */ }; @@ -1147,18 +1179,6 @@ struct cftype bfq_blkg_files[] = { #else /* CONFIG_BFQ_GROUP_IOSCHED */ -void bfqg_stats_update_io_add(struct bfq_group *bfqg, struct bfq_queue *bfqq, - unsigned int op) { } -void bfqg_stats_update_io_remove(struct bfq_group *bfqg, unsigned int op) { } -void bfqg_stats_update_io_merged(struct bfq_group *bfqg, unsigned int op) { } -void bfqg_stats_update_completion(struct bfq_group *bfqg, uint64_t start_time, - uint64_t io_start_time, unsigned int op) { } -void bfqg_stats_update_dequeue(struct bfq_group *bfqg) { } -void bfqg_stats_set_start_empty_time(struct bfq_group *bfqg) { } -void bfqg_stats_update_idle_time(struct bfq_group *bfqg) { } -void bfqg_stats_set_start_idle_time(struct bfq_group *bfqg) { } -void bfqg_stats_update_avg_queue_size(struct bfq_group *bfqg) { } - void bfq_bfqq_move(struct bfq_data *bfqd, struct bfq_queue *bfqq, struct bfq_group *bfqg) {} diff --git a/block/bfq-iosched.c b/block/bfq-iosched.c index 69e05f8..bcb6d21 100644 --- a/block/bfq-iosched.c +++ b/block/bfq-iosched.c @@ -3693,14 +3693,14 @@ static struct request *bfq_dispatch_request(struct blk_mq_hw_ctx *hctx) { struct bfq_data *bfqd = hctx->queue->elevator->elevator_data; struct request *rq; -#ifdef CONFIG_BFQ_GROUP_IOSCHED +#if defined(CONFIG_BFQ_GROUP_IOSCHED) && defined(CONFIG_DEBUG_BLK_CGROUP) struct bfq_queue *in_serv_queue, *bfqq; bool waiting_rq, idle_timer_disabled; #endif spin_lock_irq(&bfqd->lock); -#ifdef CONFIG_BFQ_GROUP_IOSCHED +#if defined(CONFIG_BFQ_GROUP_IOSCHED) && defined(CONFIG_DEBUG_BLK_CGROUP) in_serv_queue = bfqd->in_service_queue; waiting_rq = in_serv_queue && bfq_bfqq_wait_request(in_serv_queue); @@ -3714,7 +3714,7 @@ static struct request *bfq_dispatch_request(struct blk_mq_hw_ctx *hctx) #endif spin_unlock_irq(&bfqd->lock); -#ifdef CONFIG_BFQ_GROUP_IOSCHED +#if defined(CONFIG_BFQ_GROUP_IOSCHED) && defined(CONFIG_DEBUG_BLK_CGROUP) bfqq = 
rq ? RQ_BFQQ(rq) : NULL; if (!idle_timer_disabled && !bfqq) return rq; @@ -4281,7 +4281,7 @@ static void bfq_insert_request(struct blk_mq_hw_ctx *hctx, struct request *rq, { struct request_queue *q = hctx->queue; struct bfq_data *bfqd = q->elevator->elevator_data; -#ifdef CONFIG_BFQ_GROUP_IOSCHED +#if defined(CONFIG_BFQ_GROUP_IOSCHED) && defined(CONFIG_DEBUG_BLK_CGROUP) struct bfq_queue *bfqq = RQ_BFQQ(rq); bool idle_timer_disabled = false; unsigned int cmd_flags; @@ -4304,7 +4304,7 @@ static void bfq_insert_request(struct blk_mq_hw_ctx *hctx, struct request *rq, else list_add_tail(&rq->queuelist, &bfqd->dispatch); } else { -#ifdef CONFIG_BFQ_GROUP_IOSCHED +#if defined(CONFIG_BFQ_GROUP_IOSCHED) && defined(CONFIG_DEBUG_BLK_CGROUP) idle_timer_disabled = __bfq_insert_request(bfqd, rq); /* * Update bfqq, because, if a queue merge has occurred @@ -4323,7 +4323,7 @@ static void bfq_insert_request(struct blk_mq_hw_ctx *hctx, struct request *rq, } } -#ifdef CONFIG_BFQ_GROUP_IOSCHED +#if defined(CONFIG_BFQ_GROUP_IOSCHED) && defined(CONFIG_DEBUG_BLK_CGROUP) /* * Cache cmd_flags before releasing scheduler lock, because rq * may disappear afterwards (for example, because of a request @@ -4333,7 +4333,7 @@ static void bfq_insert_request(struct blk_mq_hw_ctx *hctx, struct request *rq, #endif spin_unlock_irq(&bfqd->lock); -#ifdef CONFIG_BFQ_GROUP_IOSCHED +#if defined(CONFIG_BFQ_GROUP_IOSCHED) && defined(CONFIG_DEBUG_BLK_CGROUP) if (!bfqq) return; /* diff --git a/block/bfq-iosched.h b/block/bfq-iosched.h index ac0809c..91c4390 100644 --- a/block/bfq-iosched.h +++ b/block/bfq-iosched.h @@ -689,7 +689,7 @@ enum bfqq_expiration { }; struct bfqg_stats { -#ifdef CONFIG_BFQ_GROUP_IOSCHED +#if defined(CONFIG_BFQ_GROUP_IOSCHED) && defined(CONFIG_DEBUG_BLK_CGROUP) /* number of ios merged */ struct blkg_rwstat merged; /* total time spent on device in ns, may not be accurate w/ queueing */ @@ -717,7 +717,7 @@ struct bfqg_stats { uint64_t start_idle_time; uint64_t start_empty_time; uint16_t flags; -#endif /* CONFIG_BFQ_GROUP_IOSCHED */ +#endif /* CONFIG_BFQ_GROUP_IOSCHED && CONFIG_DEBUG_BLK_CGROUP */ }; #ifdef CONFIG_BFQ_GROUP_IOSCHED