From patchwork Mon Feb 8 15:04:20 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Jinpu Wang X-Patchwork-Id: 379264 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-18.8 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, HEADER_FROM_DIFFERENT_DOMAINS, INCLUDES_CR_TRAILER, INCLUDES_PATCH, MAILING_LIST_MULTI, SPF_HELO_NONE, SPF_PASS, URIBL_BLOCKED, USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 87AC7C433DB for ; Mon, 8 Feb 2021 15:07:13 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 441B664ED4 for ; Mon, 8 Feb 2021 15:07:13 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S231754AbhBHPGw (ORCPT ); Mon, 8 Feb 2021 10:06:52 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:50420 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S232753AbhBHPFJ (ORCPT ); Mon, 8 Feb 2021 10:05:09 -0500 Received: from mail-ed1-x52e.google.com (mail-ed1-x52e.google.com [IPv6:2a00:1450:4864:20::52e]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 7B18AC061788 for ; Mon, 8 Feb 2021 07:04:29 -0800 (PST) Received: by mail-ed1-x52e.google.com with SMTP id c6so18527511ede.0 for ; Mon, 08 Feb 2021 07:04:29 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=cloud.ionos.com; s=google; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=Od4EK2yVIIkurXebLowbqftTipY7/kgNJZJogZrgQvQ=; b=gkWsleFufxbYGB31Xpuu3Ew0nmDJL2dPT1dQQHXNVjmAkTGyrTUsWEHNSK5wfRM0Mh o3nJ5HrA+GWWMRG2sNpRNYzhlqrOgnFub2loCMYIxgIutz5mUKnWLbcn+AwQhQJO/oHz ibrkjJzcY+ljr4aT7cid5A5ds1gDEwlYvDnhWCnwG5elg8BtD/C6G5ng+UuoJz7pvysc BxkmcmqUbveDsv5YKnWlq0IxW5gGWJ4aF0/3a4YFzgJKI+BwhFsvZ4vFyq7IGe93FEby RVjkIotPjo5moHFMo49X4sUUmmsRtDCD16HRhNv7zrIVddbr7BfQbIToMk/F8uGBq9T5 W58A== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=Od4EK2yVIIkurXebLowbqftTipY7/kgNJZJogZrgQvQ=; b=Eq+hOWXYeePTdui0CnigGQZCfutXXNdR/p+ybD94CgW47JBv5kOXlhkZXdYU4kuJB8 sNEG1qF0n8gbp2793hmZZFVzCepzql1ENqM1DaeCSYo8WPkrrtYuOYzHjzrBEIg70JKy 6Ra1qcNvBV4MwPOOkwV0LrjFhMlThvQvWyVxwVNJqTDDHl01MuaoTN4zeod98O3FtaL3 Rp5n9mKwa6XIRecKdjbBhxMY1T5P5EUy276S3nD6b4Uj1N0pUtVlL+FOO8D9TiGgfhVo U/QBWEQlOjaX1fdvlHVCIaZhtwTbBZEBSODPwL8piWcqd8Xz/fg0nsbgNyLHqxqDvpGQ k84w== X-Gm-Message-State: AOAM533TcFy26iOUZv5awY1V2iThke5vpdf0rJo+6M+ChPCjXzCbsp4m 88t1RQcfQCW8rdg/ydNNZiAZLg== X-Google-Smtp-Source: ABdhPJwWlt27fiCNOFgDFZNVVmnLRQYaoTdbsi115S0r+YDG7ugxftMDPK/WsB4QYCu2e0rMPdHn6A== X-Received: by 2002:a05:6402:b27:: with SMTP id bo7mr17358079edb.372.1612796668173; Mon, 08 Feb 2021 07:04:28 -0800 (PST) Received: from jwang-Latitude-5491.fkb.profitbricks.net ([2001:16b8:4980:d900:bc0f:acd:c20a:c261]) by smtp.gmail.com with ESMTPSA id kb25sm4359106ejc.19.2021.02.08.07.04.27 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Mon, 08 Feb 2021 07:04:27 -0800 (PST) From: Jack Wang To: gregkh@linuxfoundation.org, sashal@kernel.org, axboe@kernel.dk, stable@vger.kernel.org Cc: zhengbin Subject: [stable-4.19 Resend 1/7] block: fix NULL pointer dereference in register_disk Date: Mon, 8 Feb 2021 16:04:20 +0100 Message-Id: <20210208150426.62755-2-jinpu.wang@cloud.ionos.com> X-Mailer: git-send-email 2.25.1 In-Reply-To: <20210208150426.62755-1-jinpu.wang@cloud.ionos.com> References: <20210208150426.62755-1-jinpu.wang@cloud.ionos.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: stable@vger.kernel.org From: zhengbin If __device_add_disk-->bdi_register_owner-->bdi_register--> bdi_register_va-->device_create_vargs fails, bdi->dev is still NULL, __device_add_disk-->register_disk will visit bdi->dev->kobj. This patch fixes that. Signed-off-by: zhengbin Signed-off-by: Jens Axboe (cherry picked from commit 4d7c1d3fd7c7eda7dea351f071945e843a46c145) Signed-off-by: Jack Wang --- block/genhd.c | 10 ++++++---- 1 file changed, 6 insertions(+), 4 deletions(-) diff --git a/block/genhd.c b/block/genhd.c index 2b536ab570ac..6965dde96373 100644 --- a/block/genhd.c +++ b/block/genhd.c @@ -652,10 +652,12 @@ static void register_disk(struct device *parent, struct gendisk *disk) kobject_uevent(&part_to_dev(part)->kobj, KOBJ_ADD); disk_part_iter_exit(&piter); - err = sysfs_create_link(&ddev->kobj, - &disk->queue->backing_dev_info->dev->kobj, - "bdi"); - WARN_ON(err); + if (disk->queue->backing_dev_info->dev) { + err = sysfs_create_link(&ddev->kobj, + &disk->queue->backing_dev_info->dev->kobj, + "bdi"); + WARN_ON(err); + } } /** From patchwork Mon Feb 8 15:04:21 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Jinpu Wang X-Patchwork-Id: 379263 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-18.8 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, HEADER_FROM_DIFFERENT_DOMAINS, INCLUDES_CR_TRAILER, INCLUDES_PATCH, MAILING_LIST_MULTI, SPF_HELO_NONE, SPF_PASS, URIBL_BLOCKED, USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 06747C433E0 for ; Mon, 8 Feb 2021 15:07:20 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id B9F8264EDB for ; Mon, 8 Feb 2021 15:07:19 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S232132AbhBHPHI (ORCPT ); Mon, 8 Feb 2021 10:07:08 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:50432 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S232733AbhBHPFL (ORCPT ); Mon, 8 Feb 2021 10:05:11 -0500 Received: from mail-ed1-x534.google.com (mail-ed1-x534.google.com [IPv6:2a00:1450:4864:20::534]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 9175AC06178C for ; Mon, 8 Feb 2021 07:04:31 -0800 (PST) Received: by mail-ed1-x534.google.com with SMTP id c6so18527704ede.0 for ; Mon, 08 Feb 2021 07:04:31 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=cloud.ionos.com; s=google; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=BfO/909mdUaRpv6ekeQ5pLoyeo6G7O5g5UYJLsEOL80=; b=MmGtK51mZ/nb4m6J+tegtGj9dEOWyEsw4IM6ibVo9nz/IxSChfLMDRAudfCxGzPbym z/m21H2HxzTt7EV42jQnB0JS0Xz9D2a2frYh6lxYb0R8nEo1G8eEtGdslD0PFWNx4ilS MgqnnxC+/zuUgv5U2iI63iPTzW1la1s5KyqWuAB63nQIJ3gA94SKgR7zmOSAQvmXJfqx 3N9vYqQlXqschblGlUvYIlJIpL4HFOxB1Eh2pyxuw7imB2rrC86UAOFeS/UZFhmrFPMd cEbYPMY28VwyIjikW9mCuDCB8EdLScD/XVzRP1P8G19218/lqOxbYfzms1TuOOQmVc2D i2sw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=BfO/909mdUaRpv6ekeQ5pLoyeo6G7O5g5UYJLsEOL80=; b=NJI1UB5Rwzjwo41RQNthmeZ5MkVCQF2j0H4CTEJS8alfQL09Hlh4IznIxjBIOmSDnd 3kHfjfbDb6Hpw+UR+l01O79pfWL+FUC2VJQ2vA/OhzNMwoku3Zgibw9qJKZwcaYnUMLZ xhGKxzaP0ILvC/Ld1R/EJsGBMO2gGUfux4yHH6aCf/ZCc1tYLhJC7IIju1xRxErBVzym NCyC4m8nlXJcVdoA5vp/SaGLSfMrwwCRPCwiPnCon/1B/5cy2NzPovRVlH2IqYR6dXkJ WPHv0OAtqaHeqxZOGBrxnuRt5jjpshcbMYT5XGeLUgXHOGtKNY0H5iiMhcqUzxbcp5vM xSxg== X-Gm-Message-State: AOAM531KuFnUxHi54r+WIXPqPUoke6myJ0PCIWW4Uj/hVPX4rREDKflk iyAPLwLWewcCVBKh/vLPJzMTAw== X-Google-Smtp-Source: ABdhPJySqwps4jJIuSEUwGbgLGSxUHSe/fFXecOuSpzJWz5pxtiUcs15gqWQxwl4+QnF1eLfkHcEWw== X-Received: by 2002:a05:6402:2690:: with SMTP id w16mr17703671edd.304.1612796669170; Mon, 08 Feb 2021 07:04:29 -0800 (PST) Received: from jwang-Latitude-5491.fkb.profitbricks.net ([2001:16b8:4980:d900:bc0f:acd:c20a:c261]) by smtp.gmail.com with ESMTPSA id kb25sm4359106ejc.19.2021.02.08.07.04.28 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Mon, 08 Feb 2021 07:04:28 -0800 (PST) From: Jack Wang To: gregkh@linuxfoundation.org, sashal@kernel.org, axboe@kernel.dk, stable@vger.kernel.org Cc: Ming Lei , Christoph Hellwig , Hannes Reinecke , Mike Snitzer , Bart Van Assche , Damien Le Moal Subject: [stable-4.19 Resend 2/7] block: don't hold q->sysfs_lock in elevator_init_mq Date: Mon, 8 Feb 2021 16:04:21 +0100 Message-Id: <20210208150426.62755-3-jinpu.wang@cloud.ionos.com> X-Mailer: git-send-email 2.25.1 In-Reply-To: <20210208150426.62755-1-jinpu.wang@cloud.ionos.com> References: <20210208150426.62755-1-jinpu.wang@cloud.ionos.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: stable@vger.kernel.org From: Ming Lei The original comment says: q->sysfs_lock must be held to provide mutual exclusion between elevator_switch() and here. Which is simply wrong. elevator_init_mq() is only called from blk_mq_init_allocated_queue, which is always called before the request queue is registered via blk_register_queue(), for dm-rq or normal rq based driver. However, queue's kobject is only exposed and added to sysfs in blk_register_queue(). So there isn't such race between elevator_switch() and elevator_init_mq(). So avoid to hold q->sysfs_lock in elevator_init_mq(). Cc: Christoph Hellwig Cc: Hannes Reinecke Cc: Greg KH Cc: Mike Snitzer Cc: Bart Van Assche Cc: Damien Le Moal Reviewed-by: Bart Van Assche Signed-off-by: Ming Lei Signed-off-by: Jens Axboe (cherry picked from commit c48dac137a62a5d6fa1ef3fa445cbd9c43655a76) Signed-off-by: Jack Wang --- block/elevator.c | 14 +++++--------- 1 file changed, 5 insertions(+), 9 deletions(-) diff --git a/block/elevator.c b/block/elevator.c index fae58b2f906f..ddbcd36616a8 100644 --- a/block/elevator.c +++ b/block/elevator.c @@ -980,23 +980,19 @@ int elevator_init_mq(struct request_queue *q) if (q->nr_hw_queues != 1) return 0; - /* - * q->sysfs_lock must be held to provide mutual exclusion between - * elevator_switch() and here. - */ - mutex_lock(&q->sysfs_lock); + WARN_ON_ONCE(test_bit(QUEUE_FLAG_REGISTERED, &q->queue_flags)); + if (unlikely(q->elevator)) - goto out_unlock; + goto out; e = elevator_get(q, "mq-deadline", false); if (!e) - goto out_unlock; + goto out; err = blk_mq_init_sched(q, e); if (err) elevator_put(e); -out_unlock: - mutex_unlock(&q->sysfs_lock); +out: return err; } From patchwork Mon Feb 8 15:04:23 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Jinpu Wang X-Patchwork-Id: 379249 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-18.8 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, HEADER_FROM_DIFFERENT_DOMAINS, INCLUDES_CR_TRAILER, INCLUDES_PATCH, MAILING_LIST_MULTI, SPF_HELO_NONE, SPF_PASS, URIBL_BLOCKED, USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 5FAD9C433E9 for ; Mon, 8 Feb 2021 15:09:53 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 1B6E664EB6 for ; Mon, 8 Feb 2021 15:09:53 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S231882AbhBHPJd (ORCPT ); Mon, 8 Feb 2021 10:09:33 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:50750 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S233056AbhBHPGk (ORCPT ); Mon, 8 Feb 2021 10:06:40 -0500 Received: from mail-ed1-x533.google.com (mail-ed1-x533.google.com [IPv6:2a00:1450:4864:20::533]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id CAF90C061794 for ; Mon, 8 Feb 2021 07:04:32 -0800 (PST) Received: by mail-ed1-x533.google.com with SMTP id df22so18529403edb.1 for ; Mon, 08 Feb 2021 07:04:32 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=cloud.ionos.com; s=google; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=gw7klKD2RMcDufoZq/xD79IYKU/w+UW1t1ZRskK2cSw=; b=REtoqEvdi1bEY/DgIZ/t7Ft5fXF2dQ8pGquUWPDJ83Jli5NTr6uGQvnfQn2PsyZMlZ t3PmS2JIg2R6/BoMWbhu1IiwGjSst5kZ3huEB2GA0Jvoh2Z0vNOerhhT1Y1lyiqi3/x6 dXNa2vBjUUrIH8MgIIE2oBSKSnn/PwpSuU84jsE6dgW2/vBQRqLZ6s6RYvFRUuydNf78 TvI+rriQlQaUF7HETt/zr4VOedYeQy+FHTwoYRdV7z4AIWo6RYQ1f+HqsM460iTNgCFG 6eSZkOJJFCqrqOX+U71EEjZZ/vg53P2lGWChjM3Sd6blO80zapRJunfz0th1Low4yIky zN7A== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=gw7klKD2RMcDufoZq/xD79IYKU/w+UW1t1ZRskK2cSw=; b=jdK67rb2NMZebf+CRZ/3zeXNhmjUPDdWpZFFy6pJFA9fM4MVMiSxIZ+ZXKJgCLBh0z E5BD+gZyZeQ4NUhZGQC51xKSeqU08cM3V1ER4LpkhPr+kY95cG5qjGwSIXFGr2lHJjLU k/i7qc5iF3iWRPdUn8tacozf7/iCXp3+04jz3kKGTspJrM8D2Ot+pzpx5EhxaroZ0xFx fHSqnEUhoHvGQDPCyUeOXEBGX7vNlngnRPHglosWovtwzoFoBySAhWSJaWNFGrdhDxKb D0OUxfMb66X3jMFZSyg6tLpy8TWsdwarkrTUQkRVWuXa32bEm+ARCAqNfv/E75DClHHL mOow== X-Gm-Message-State: AOAM531H9EDkw18McDyFVaxd1icc4nbXP1YCz7yTopCi5pew/nzczTuo 1iRfN/aNpn9Ua9q2BxIaGJaHQg== X-Google-Smtp-Source: ABdhPJwEbsjDMNFV1BWZKNm4r5qnl1xCf2S9u+Hf9Sl2SRJa3dx12ccM6ws47GtebxeJ8F1hzwIKxw== X-Received: by 2002:aa7:cac6:: with SMTP id l6mr18005573edt.357.1612796671557; Mon, 08 Feb 2021 07:04:31 -0800 (PST) Received: from jwang-Latitude-5491.fkb.profitbricks.net ([2001:16b8:4980:d900:bc0f:acd:c20a:c261]) by smtp.gmail.com with ESMTPSA id kb25sm4359106ejc.19.2021.02.08.07.04.30 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Mon, 08 Feb 2021 07:04:31 -0800 (PST) From: Jack Wang To: gregkh@linuxfoundation.org, sashal@kernel.org, axboe@kernel.dk, stable@vger.kernel.org Cc: Ming Lei , Christoph Hellwig , Hannes Reinecke , Mike Snitzer , Bart Van Assche Subject: [stable-4.19 Resend 4/7] block: add helper for checking if queue is registered Date: Mon, 8 Feb 2021 16:04:23 +0100 Message-Id: <20210208150426.62755-5-jinpu.wang@cloud.ionos.com> X-Mailer: git-send-email 2.25.1 In-Reply-To: <20210208150426.62755-1-jinpu.wang@cloud.ionos.com> References: <20210208150426.62755-1-jinpu.wang@cloud.ionos.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: stable@vger.kernel.org From: Ming Lei There are 4 users which check if queue is registered, so add one helper to check it. Cc: Christoph Hellwig Cc: Hannes Reinecke Cc: Greg KH Cc: Mike Snitzer Cc: Bart Van Assche Reviewed-by: Bart Van Assche Signed-off-by: Ming Lei Signed-off-by: Jens Axboe (cherry picked from commit 58c898ba370e68d39470cd0d932b524682c1f9be) Signed-off-by: Jack Wang --- block/blk-sysfs.c | 4 ++-- block/blk-wbt.c | 2 +- block/elevator.c | 2 +- include/linux/blkdev.h | 1 + 4 files changed, 5 insertions(+), 4 deletions(-) diff --git a/block/blk-sysfs.c b/block/blk-sysfs.c index 8286640d4d66..0a7636d24563 100644 --- a/block/blk-sysfs.c +++ b/block/blk-sysfs.c @@ -896,7 +896,7 @@ int blk_register_queue(struct gendisk *disk) if (WARN_ON(!q)) return -ENXIO; - WARN_ONCE(test_bit(QUEUE_FLAG_REGISTERED, &q->queue_flags), + WARN_ONCE(blk_queue_registered(q), "%s is registering an already registered queue\n", kobject_name(&dev->kobj)); queue_flag_set_unlocked(QUEUE_FLAG_REGISTERED, q); @@ -973,7 +973,7 @@ void blk_unregister_queue(struct gendisk *disk) return; /* Return early if disk->queue was never registered. */ - if (!test_bit(QUEUE_FLAG_REGISTERED, &q->queue_flags)) + if (!blk_queue_registered(q)) return; /* diff --git a/block/blk-wbt.c b/block/blk-wbt.c index f1de8ba483a9..50f2abfa1a60 100644 --- a/block/blk-wbt.c +++ b/block/blk-wbt.c @@ -708,7 +708,7 @@ void wbt_enable_default(struct request_queue *q) return; /* Queue not registered? Maybe shutting down... */ - if (!test_bit(QUEUE_FLAG_REGISTERED, &q->queue_flags)) + if (!blk_queue_registered(q)) return; if ((q->mq_ops && IS_ENABLED(CONFIG_BLK_WBT_MQ)) || diff --git a/block/elevator.c b/block/elevator.c index ddbcd36616a8..9bffe4558929 100644 --- a/block/elevator.c +++ b/block/elevator.c @@ -1083,7 +1083,7 @@ static int __elevator_change(struct request_queue *q, const char *name) struct elevator_type *e; /* Make sure queue is not in the middle of being removed */ - if (!test_bit(QUEUE_FLAG_REGISTERED, &q->queue_flags)) + if (!blk_queue_registered(q)) return -ENOENT; /* diff --git a/include/linux/blkdev.h b/include/linux/blkdev.h index 745b2d0dcf78..3a2b34c2c82b 100644 --- a/include/linux/blkdev.h +++ b/include/linux/blkdev.h @@ -743,6 +743,7 @@ bool blk_queue_flag_test_and_clear(unsigned int flag, struct request_queue *q); #define blk_queue_quiesced(q) test_bit(QUEUE_FLAG_QUIESCED, &(q)->queue_flags) #define blk_queue_pm_only(q) atomic_read(&(q)->pm_only) #define blk_queue_fua(q) test_bit(QUEUE_FLAG_FUA, &(q)->queue_flags) +#define blk_queue_registered(q) test_bit(QUEUE_FLAG_REGISTERED, &(q)->queue_flags) extern void blk_set_pm_only(struct request_queue *q); extern void blk_clear_pm_only(struct request_queue *q); From patchwork Mon Feb 8 15:04:24 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Jinpu Wang X-Patchwork-Id: 379248 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-18.8 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, HEADER_FROM_DIFFERENT_DOMAINS, INCLUDES_CR_TRAILER, INCLUDES_PATCH, MAILING_LIST_MULTI, SPF_HELO_NONE, SPF_PASS, URIBL_BLOCKED, USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id A6F2AC433E6 for ; Mon, 8 Feb 2021 15:09:53 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 69FD364EFB for ; Mon, 8 Feb 2021 15:09:53 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S232118AbhBHPJk (ORCPT ); Mon, 8 Feb 2021 10:09:40 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:50756 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S232226AbhBHPGm (ORCPT ); Mon, 8 Feb 2021 10:06:42 -0500 Received: from mail-ed1-x52c.google.com (mail-ed1-x52c.google.com [IPv6:2a00:1450:4864:20::52c]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id C592AC061797 for ; Mon, 8 Feb 2021 07:04:33 -0800 (PST) Received: by mail-ed1-x52c.google.com with SMTP id y18so18448812edw.13 for ; Mon, 08 Feb 2021 07:04:33 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=cloud.ionos.com; s=google; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=btoceab6WAOABNrL1783NZ55mEzW4Zi4OKe5C+Dv/o0=; b=DqyNIunziOVh2um/6Iyu5wXk7O7ugdzStvWVjaE7pZIyMOKsXWSTEnVZwDxptZuBjg Y5PlW2j5lBgmCVmsHTmTxAJjaX6D3rKEMpK+ULDQaBeRE23RLSkOrUTF3xAICHDfS5JC v8OZB5+WRcwOs2M7LubYWgL7m3Rbfv7wbRrPuqmIIVHsy/KukzkQb58IEhRq785gaBxp Z3zmLq2xB7kusuZcTs6XnPrEdC7AtE8vcaZvXASs5Gd0PyaGWSYXcLybmkOXP+IcFhS+ 3mbmJTVRHzRKxyzKDReXJV1lz58NTRTwGmEQEeQ9EoxmJ6ghvtEdImOtU4Q4h6umOpW6 8wVg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=btoceab6WAOABNrL1783NZ55mEzW4Zi4OKe5C+Dv/o0=; b=n6z77pbpoFi6W8Tp3E81Nmib3/vWEDjJxhdbNan3XY13RjV90t7PuwguSSDceB3zD6 Nr+rWt4anFIHPccRLsSiiYWrQRaT2CIWcez5MsDtn1485ENiQmT8FObhtZU7tPb+Tedq vLZXKvjlckiNtpZWmzyUg1O6WyKyq0jA7BROmGzqn4x7upQFa2SqiTI9ht51pEphm8qL jAUnrAve4sz4BAG2mbRjYRT+Djilxgj75DUIwcIgmBeZWUa6UZzYErhOEy4RpKF/zGOu X/jp8/ORrYNGXnAh3EzJmX0Ca0LVf26evcHDKiS6ydEfHI4OrCEn5q6QAG++bmHRU9bf 75sQ== X-Gm-Message-State: AOAM5333gAMlTH2ApvLXs9XuhNjXwaWVnjOo0iCet4f+Sx1Fev9QNFpv ZapEajHw4kAbKoYPrTPwBCFwJg== X-Google-Smtp-Source: ABdhPJxj5sVlOZbFCJ9Bc5EWEJ2Qn3CxkHupsT1ulMJanWU5TGOx8GRxWu9Pw7keK91rPaV+TmkpWA== X-Received: by 2002:a05:6402:3114:: with SMTP id dc20mr17573659edb.197.1612796672464; Mon, 08 Feb 2021 07:04:32 -0800 (PST) Received: from jwang-Latitude-5491.fkb.profitbricks.net ([2001:16b8:4980:d900:bc0f:acd:c20a:c261]) by smtp.gmail.com with ESMTPSA id kb25sm4359106ejc.19.2021.02.08.07.04.31 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Mon, 08 Feb 2021 07:04:32 -0800 (PST) From: Jack Wang To: gregkh@linuxfoundation.org, sashal@kernel.org, axboe@kernel.dk, stable@vger.kernel.org Cc: Ming Lei , Christoph Hellwig , Hannes Reinecke , Mike Snitzer , Bart Van Assche Subject: [stable-4.19 Resend 5/7] block: split .sysfs_lock into two locks Date: Mon, 8 Feb 2021 16:04:24 +0100 Message-Id: <20210208150426.62755-6-jinpu.wang@cloud.ionos.com> X-Mailer: git-send-email 2.25.1 In-Reply-To: <20210208150426.62755-1-jinpu.wang@cloud.ionos.com> References: <20210208150426.62755-1-jinpu.wang@cloud.ionos.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: stable@vger.kernel.org From: Ming Lei The kernfs built-in lock of 'kn->count' is held in sysfs .show/.store path. Meantime, inside block's .show/.store callback, q->sysfs_lock is required. However, when mq & iosched kobjects are removed via blk_mq_unregister_dev() & elv_unregister_queue(), q->sysfs_lock is held too. This way causes AB-BA lock because the kernfs built-in lock of 'kn-count' is required inside kobject_del() too, see the lockdep warning[1]. On the other hand, it isn't necessary to acquire q->sysfs_lock for both blk_mq_unregister_dev() & elv_unregister_queue() because clearing REGISTERED flag prevents storing to 'queue/scheduler' from being happened. Also sysfs write(store) is exclusive, so no necessary to hold the lock for elv_unregister_queue() when it is called in switching elevator path. So split .sysfs_lock into two: one is still named as .sysfs_lock for covering sync .store, the other one is named as .sysfs_dir_lock for covering kobjects and related status change. sysfs itself can handle the race between add/remove kobjects and showing/storing attributes under kobjects. For switching scheduler via storing to 'queue/scheduler', we use the queue flag of QUEUE_FLAG_REGISTERED with .sysfs_lock for avoiding the race, then we can avoid to hold .sysfs_lock during removing/adding kobjects. [1] lockdep warning ====================================================== WARNING: possible circular locking dependency detected 5.3.0-rc3-00044-g73277fc75ea0 #1380 Not tainted ------------------------------------------------------ rmmod/777 is trying to acquire lock: 00000000ac50e981 (kn->count#202){++++}, at: kernfs_remove_by_name_ns+0x59/0x72 but task is already holding lock: 00000000fb16ae21 (&q->sysfs_lock){+.+.}, at: blk_unregister_queue+0x78/0x10b which lock already depends on the new lock. the existing dependency chain (in reverse order) is: -> #1 (&q->sysfs_lock){+.+.}: __lock_acquire+0x95f/0xa2f lock_acquire+0x1b4/0x1e8 __mutex_lock+0x14a/0xa9b blk_mq_hw_sysfs_show+0x63/0xb6 sysfs_kf_seq_show+0x11f/0x196 seq_read+0x2cd/0x5f2 vfs_read+0xc7/0x18c ksys_read+0xc4/0x13e do_syscall_64+0xa7/0x295 entry_SYSCALL_64_after_hwframe+0x49/0xbe -> #0 (kn->count#202){++++}: check_prev_add+0x5d2/0xc45 validate_chain+0xed3/0xf94 __lock_acquire+0x95f/0xa2f lock_acquire+0x1b4/0x1e8 __kernfs_remove+0x237/0x40b kernfs_remove_by_name_ns+0x59/0x72 remove_files+0x61/0x96 sysfs_remove_group+0x81/0xa4 sysfs_remove_groups+0x3b/0x44 kobject_del+0x44/0x94 blk_mq_unregister_dev+0x83/0xdd blk_unregister_queue+0xa0/0x10b del_gendisk+0x259/0x3fa null_del_dev+0x8b/0x1c3 [null_blk] null_exit+0x5c/0x95 [null_blk] __se_sys_delete_module+0x204/0x337 do_syscall_64+0xa7/0x295 entry_SYSCALL_64_after_hwframe+0x49/0xbe other info that might help us debug this: Possible unsafe locking scenario: CPU0 CPU1 ---- ---- lock(&q->sysfs_lock); lock(kn->count#202); lock(&q->sysfs_lock); lock(kn->count#202); *** DEADLOCK *** 2 locks held by rmmod/777: #0: 00000000e69bd9de (&lock){+.+.}, at: null_exit+0x2e/0x95 [null_blk] #1: 00000000fb16ae21 (&q->sysfs_lock){+.+.}, at: blk_unregister_queue+0x78/0x10b stack backtrace: CPU: 0 PID: 777 Comm: rmmod Not tainted 5.3.0-rc3-00044-g73277fc75ea0 #1380 Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS ?-20180724_192412-buildhw-07.phx4 Call Trace: dump_stack+0x9a/0xe6 check_noncircular+0x207/0x251 ? print_circular_bug+0x32a/0x32a ? find_usage_backwards+0x84/0xb0 check_prev_add+0x5d2/0xc45 validate_chain+0xed3/0xf94 ? check_prev_add+0xc45/0xc45 ? mark_lock+0x11b/0x804 ? check_usage_forwards+0x1ca/0x1ca __lock_acquire+0x95f/0xa2f lock_acquire+0x1b4/0x1e8 ? kernfs_remove_by_name_ns+0x59/0x72 __kernfs_remove+0x237/0x40b ? kernfs_remove_by_name_ns+0x59/0x72 ? kernfs_next_descendant_post+0x7d/0x7d ? strlen+0x10/0x23 ? strcmp+0x22/0x44 kernfs_remove_by_name_ns+0x59/0x72 remove_files+0x61/0x96 sysfs_remove_group+0x81/0xa4 sysfs_remove_groups+0x3b/0x44 kobject_del+0x44/0x94 blk_mq_unregister_dev+0x83/0xdd blk_unregister_queue+0xa0/0x10b del_gendisk+0x259/0x3fa ? disk_events_poll_msecs_store+0x12b/0x12b ? check_flags+0x1ea/0x204 ? mark_held_locks+0x1f/0x7a null_del_dev+0x8b/0x1c3 [null_blk] null_exit+0x5c/0x95 [null_blk] __se_sys_delete_module+0x204/0x337 ? free_module+0x39f/0x39f ? blkcg_maybe_throttle_current+0x8a/0x718 ? rwlock_bug+0x62/0x62 ? __blkcg_punt_bio_submit+0xd0/0xd0 ? trace_hardirqs_on_thunk+0x1a/0x20 ? mark_held_locks+0x1f/0x7a ? do_syscall_64+0x4c/0x295 do_syscall_64+0xa7/0x295 entry_SYSCALL_64_after_hwframe+0x49/0xbe RIP: 0033:0x7fb696cdbe6b Code: 73 01 c3 48 8b 0d 1d 20 0c 00 f7 d8 64 89 01 48 83 c8 ff c3 66 2e 0f 1f 84 00 00 008 RSP: 002b:00007ffec9588788 EFLAGS: 00000206 ORIG_RAX: 00000000000000b0 RAX: ffffffffffffffda RBX: 0000559e589137c0 RCX: 00007fb696cdbe6b RDX: 000000000000000a RSI: 0000000000000800 RDI: 0000559e58913828 RBP: 0000000000000000 R08: 00007ffec9587701 R09: 0000000000000000 R10: 00007fb696d4eae0 R11: 0000000000000206 R12: 00007ffec95889b0 R13: 00007ffec95896b3 R14: 0000559e58913260 R15: 0000559e589137c0 Cc: Christoph Hellwig Cc: Hannes Reinecke Cc: Greg KH Cc: Mike Snitzer Reviewed-by: Bart Van Assche Signed-off-by: Ming Lei Signed-off-by: Jens Axboe (jwang:cherry picked from commit cecf5d87ff2035127bb5a9ee054d0023a4a7cad3, adjust ctx for 4,19) Signed-off-by: Jack Wang --- block/blk-core.c | 1 + block/blk-mq-sysfs.c | 12 ++++----- block/blk-sysfs.c | 44 +++++++++++++++++++------------ block/blk.h | 2 +- block/elevator.c | 59 +++++++++++++++++++++++++++++++++++------- include/linux/blkdev.h | 1 + 6 files changed, 86 insertions(+), 33 deletions(-) diff --git a/block/blk-core.c b/block/blk-core.c index ce3710404544..80f3e729fdd4 100644 --- a/block/blk-core.c +++ b/block/blk-core.c @@ -1059,6 +1059,7 @@ struct request_queue *blk_alloc_queue_node(gfp_t gfp_mask, int node_id, mutex_init(&q->blk_trace_mutex); #endif mutex_init(&q->sysfs_lock); + mutex_init(&q->sysfs_dir_lock); spin_lock_init(&q->__queue_lock); if (!q->mq_ops) diff --git a/block/blk-mq-sysfs.c b/block/blk-mq-sysfs.c index 5006a0d00990..5e4b7ed1e897 100644 --- a/block/blk-mq-sysfs.c +++ b/block/blk-mq-sysfs.c @@ -264,7 +264,7 @@ void blk_mq_unregister_dev(struct device *dev, struct request_queue *q) struct blk_mq_hw_ctx *hctx; int i; - lockdep_assert_held(&q->sysfs_lock); + lockdep_assert_held(&q->sysfs_dir_lock); queue_for_each_hw_ctx(q, hctx, i) blk_mq_unregister_hctx(hctx); @@ -312,7 +312,7 @@ int __blk_mq_register_dev(struct device *dev, struct request_queue *q) int ret, i; WARN_ON_ONCE(!q->kobj.parent); - lockdep_assert_held(&q->sysfs_lock); + lockdep_assert_held(&q->sysfs_dir_lock); ret = kobject_add(&q->mq_kobj, kobject_get(&dev->kobj), "%s", "mq"); if (ret < 0) @@ -358,7 +358,7 @@ void blk_mq_sysfs_unregister(struct request_queue *q) struct blk_mq_hw_ctx *hctx; int i; - mutex_lock(&q->sysfs_lock); + mutex_lock(&q->sysfs_dir_lock); if (!q->mq_sysfs_init_done) goto unlock; @@ -366,7 +366,7 @@ void blk_mq_sysfs_unregister(struct request_queue *q) blk_mq_unregister_hctx(hctx); unlock: - mutex_unlock(&q->sysfs_lock); + mutex_unlock(&q->sysfs_dir_lock); } int blk_mq_sysfs_register(struct request_queue *q) @@ -374,7 +374,7 @@ int blk_mq_sysfs_register(struct request_queue *q) struct blk_mq_hw_ctx *hctx; int i, ret = 0; - mutex_lock(&q->sysfs_lock); + mutex_lock(&q->sysfs_dir_lock); if (!q->mq_sysfs_init_done) goto unlock; @@ -385,7 +385,7 @@ int blk_mq_sysfs_register(struct request_queue *q) } unlock: - mutex_unlock(&q->sysfs_lock); + mutex_unlock(&q->sysfs_dir_lock); return ret; } diff --git a/block/blk-sysfs.c b/block/blk-sysfs.c index 0a7636d24563..b2208b69f04a 100644 --- a/block/blk-sysfs.c +++ b/block/blk-sysfs.c @@ -892,6 +892,7 @@ int blk_register_queue(struct gendisk *disk) int ret; struct device *dev = disk_to_dev(disk); struct request_queue *q = disk->queue; + bool has_elevator = false; if (WARN_ON(!q)) return -ENXIO; @@ -899,7 +900,6 @@ int blk_register_queue(struct gendisk *disk) WARN_ONCE(blk_queue_registered(q), "%s is registering an already registered queue\n", kobject_name(&dev->kobj)); - queue_flag_set_unlocked(QUEUE_FLAG_REGISTERED, q); /* * SCSI probing may synchronously create and destroy a lot of @@ -920,8 +920,7 @@ int blk_register_queue(struct gendisk *disk) if (ret) return ret; - /* Prevent changes through sysfs until registration is completed. */ - mutex_lock(&q->sysfs_lock); + mutex_lock(&q->sysfs_dir_lock); ret = kobject_add(&q->kobj, kobject_get(&dev->kobj), "%s", "queue"); if (ret < 0) { @@ -934,26 +933,36 @@ int blk_register_queue(struct gendisk *disk) blk_mq_debugfs_register(q); } - kobject_uevent(&q->kobj, KOBJ_ADD); - - wbt_enable_default(q); - - blk_throtl_register_queue(q); - + /* + * The flag of QUEUE_FLAG_REGISTERED isn't set yet, so elevator + * switch won't happen at all. + */ if (q->request_fn || (q->mq_ops && q->elevator)) { - ret = elv_register_queue(q); + ret = elv_register_queue(q, false); if (ret) { - mutex_unlock(&q->sysfs_lock); - kobject_uevent(&q->kobj, KOBJ_REMOVE); + mutex_unlock(&q->sysfs_dir_lock); kobject_del(&q->kobj); blk_trace_remove_sysfs(dev); kobject_put(&dev->kobj); return ret; } + has_elevator = true; } + + mutex_lock(&q->sysfs_lock); + blk_queue_flag_set(QUEUE_FLAG_REGISTERED, q); + wbt_enable_default(q); + blk_throtl_register_queue(q); + + /* Now everything is ready and send out KOBJ_ADD uevent */ + kobject_uevent(&q->kobj, KOBJ_ADD); + if (has_elevator) + kobject_uevent(&q->elevator->kobj, KOBJ_ADD); + mutex_unlock(&q->sysfs_lock); + ret = 0; unlock: - mutex_unlock(&q->sysfs_lock); + mutex_unlock(&q->sysfs_dir_lock); return ret; } EXPORT_SYMBOL_GPL(blk_register_queue); @@ -968,6 +977,7 @@ EXPORT_SYMBOL_GPL(blk_register_queue); void blk_unregister_queue(struct gendisk *disk) { struct request_queue *q = disk->queue; + bool has_elevator; if (WARN_ON(!q)) return; @@ -982,25 +992,27 @@ void blk_unregister_queue(struct gendisk *disk) * concurrent elv_iosched_store() calls. */ mutex_lock(&q->sysfs_lock); - blk_queue_flag_clear(QUEUE_FLAG_REGISTERED, q); + has_elevator = !!q->elevator; + mutex_unlock(&q->sysfs_lock); + mutex_lock(&q->sysfs_dir_lock); /* * Remove the sysfs attributes before unregistering the queue data * structures that can be modified through sysfs. */ if (q->mq_ops) blk_mq_unregister_dev(disk_to_dev(disk), q); - mutex_unlock(&q->sysfs_lock); kobject_uevent(&q->kobj, KOBJ_REMOVE); kobject_del(&q->kobj); blk_trace_remove_sysfs(disk_to_dev(disk)); mutex_lock(&q->sysfs_lock); - if (q->request_fn || (q->mq_ops && q->elevator)) + if (q->request_fn || has_elevator) elv_unregister_queue(q); mutex_unlock(&q->sysfs_lock); + mutex_unlock(&q->sysfs_dir_lock); kobject_put(&disk_to_dev(disk)->kobj); } diff --git a/block/blk.h b/block/blk.h index 1a5b67b57e6b..ae87e2a5f2bd 100644 --- a/block/blk.h +++ b/block/blk.h @@ -244,7 +244,7 @@ int elevator_init_mq(struct request_queue *q); int elevator_switch_mq(struct request_queue *q, struct elevator_type *new_e); void elevator_exit(struct request_queue *, struct elevator_queue *); -int elv_register_queue(struct request_queue *q); +int elv_register_queue(struct request_queue *q, bool uevent); void elv_unregister_queue(struct request_queue *q); struct hd_struct *__disk_get_part(struct gendisk *disk, int partno); diff --git a/block/elevator.c b/block/elevator.c index 9bffe4558929..2ff0859e8b35 100644 --- a/block/elevator.c +++ b/block/elevator.c @@ -833,13 +833,16 @@ static struct kobj_type elv_ktype = { .release = elevator_release, }; -int elv_register_queue(struct request_queue *q) +/* + * elv_register_queue is called from either blk_register_queue or + * elevator_switch, elevator switch is prevented from being happen + * in the two paths, so it is safe to not hold q->sysfs_lock. + */ +int elv_register_queue(struct request_queue *q, bool uevent) { struct elevator_queue *e = q->elevator; int error; - lockdep_assert_held(&q->sysfs_lock); - error = kobject_add(&e->kobj, &q->kobj, "%s", "iosched"); if (!error) { struct elv_fs_entry *attr = e->type->elevator_attrs; @@ -850,26 +853,36 @@ int elv_register_queue(struct request_queue *q) attr++; } } - kobject_uevent(&e->kobj, KOBJ_ADD); + if (uevent) + kobject_uevent(&e->kobj, KOBJ_ADD); + + mutex_lock(&q->sysfs_lock); e->registered = 1; if (!e->uses_mq && e->type->ops.sq.elevator_registered_fn) e->type->ops.sq.elevator_registered_fn(q); + mutex_unlock(&q->sysfs_lock); } return error; } +/* + * elv_unregister_queue is called from either blk_unregister_queue or + * elevator_switch, elevator switch is prevented from being happen + * in the two paths, so it is safe to not hold q->sysfs_lock. + */ void elv_unregister_queue(struct request_queue *q) { - lockdep_assert_held(&q->sysfs_lock); - if (q) { struct elevator_queue *e = q->elevator; kobject_uevent(&e->kobj, KOBJ_REMOVE); kobject_del(&e->kobj); + + mutex_lock(&q->sysfs_lock); e->registered = 0; /* Re-enable throttling in case elevator disabled it */ wbt_enable_default(q); + mutex_unlock(&q->sysfs_lock); } } @@ -940,10 +953,32 @@ int elevator_switch_mq(struct request_queue *q, lockdep_assert_held(&q->sysfs_lock); if (q->elevator) { - if (q->elevator->registered) + if (q->elevator->registered) { + mutex_unlock(&q->sysfs_lock); + + /* + * Concurrent elevator switch can't happen becasue + * sysfs write is always exclusively on same file. + * + * Also the elevator queue won't be freed after + * sysfs_lock is released becasue kobject_del() in + * blk_unregister_queue() waits for completion of + * .store & .show on its attributes. + */ elv_unregister_queue(q); + + mutex_lock(&q->sysfs_lock); + } ioc_clear_queue(q); elevator_exit(q, q->elevator); + + /* + * sysfs_lock may be dropped, so re-check if queue is + * unregistered. If yes, don't switch to new elevator + * any more + */ + if (!blk_queue_registered(q)) + return 0; } ret = blk_mq_init_sched(q, new_e); @@ -951,7 +986,11 @@ int elevator_switch_mq(struct request_queue *q, goto out; if (new_e) { - ret = elv_register_queue(q); + mutex_unlock(&q->sysfs_lock); + + ret = elv_register_queue(q, true); + + mutex_lock(&q->sysfs_lock); if (ret) { elevator_exit(q, q->elevator); goto out; @@ -1047,7 +1086,7 @@ static int elevator_switch(struct request_queue *q, struct elevator_type *new_e) if (err) goto fail_init; - err = elv_register_queue(q); + err = elv_register_queue(q, true); if (err) goto fail_register; @@ -1067,7 +1106,7 @@ static int elevator_switch(struct request_queue *q, struct elevator_type *new_e) /* switch failed, restore and re-register old elevator */ if (old) { q->elevator = old; - elv_register_queue(q); + elv_register_queue(q, true); blk_queue_bypass_end(q); } diff --git a/include/linux/blkdev.h b/include/linux/blkdev.h index 3a2b34c2c82b..209ba8e7bd31 100644 --- a/include/linux/blkdev.h +++ b/include/linux/blkdev.h @@ -637,6 +637,7 @@ struct request_queue { struct delayed_work requeue_work; struct mutex sysfs_lock; + struct mutex sysfs_dir_lock; int bypass_depth; atomic_t mq_freeze_depth; From patchwork Mon Feb 8 15:04:26 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Jinpu Wang X-Patchwork-Id: 379241 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-18.8 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, HEADER_FROM_DIFFERENT_DOMAINS, INCLUDES_CR_TRAILER, INCLUDES_PATCH, MAILING_LIST_MULTI, SPF_HELO_NONE, SPF_PASS, URIBL_BLOCKED, USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 47067C433E9 for ; Mon, 8 Feb 2021 15:11:34 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 0E7F760238 for ; Mon, 8 Feb 2021 15:11:34 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S233139AbhBHPLJ (ORCPT ); Mon, 8 Feb 2021 10:11:09 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:50770 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S232051AbhBHPGo (ORCPT ); Mon, 8 Feb 2021 10:06:44 -0500 Received: from mail-ed1-x52c.google.com (mail-ed1-x52c.google.com [IPv6:2a00:1450:4864:20::52c]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id A02B3C0617A9 for ; Mon, 8 Feb 2021 07:04:35 -0800 (PST) Received: by mail-ed1-x52c.google.com with SMTP id df22so18529713edb.1 for ; Mon, 08 Feb 2021 07:04:35 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=cloud.ionos.com; s=google; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=/Q85vcvTRwqOQln7ONod8UaPXYEVKrd8O7nB76luOHM=; b=fJnjrshE9XOztNM7o5MKTyN34XuNo2MVg9amiPnz60l10Pis9nb1AZqsELg8T72LjA 2/xNl9fYpZa5UKHXCdR5SfET58rtBUh0TAbutqsulVg7ghQ98XxnfFWIXCTo6gx41Hpl G1d5ggeF4zmkzYm2Z6sQ1w4IhCdR/L5fzdRrh30oHFh9wdfncklQihbwH73/KSta/99T MOOTPOv8iLAjJJYt4yvZfUt7FSDOC348+RB3REcUobrXdDypMl4xcYicWtjI4YssmQdE Z9Ub5GRXEnUiu35d61y2u3WzG8BSOwK1GEdpfjzH2tLvVVXyJHbT8el9dtIurFjfn0tn q6aQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=/Q85vcvTRwqOQln7ONod8UaPXYEVKrd8O7nB76luOHM=; b=Zu4unOmN8a291ZOdSByDcnIwd0yxAEP/UMUQuno/VLNh2rujzeH+fhGw1BorFdwEuf DrV1ZZvL4uP3M7UT4zocKo/tdHcH0jGOSSB1bY0uGjNYFSLKRsxbDsd3KeUPbg1+EGl2 gjag7okKgbCjY51Up4GIHtP8R9iJjWwXltc0X1IRKeKDa6ftxBFSVCIXX0AbdUzjZtrg Z7L/GSBkVaU0h7ROj4NRJqDWjHPO29kI6ALRt71TY1jn2wJDcffs5VLsl9MQFRrZ/vaJ 6134xQpBelWdHLvu5ryCKV4fmgw7UOdBQ1b66CKStBVEhl7kMoYdWT9a2255jKvpqMMe /W7w== X-Gm-Message-State: AOAM531WwYUdvxavQKh04UXIx0hGZEmKMV06RmqIoG/HoHIDZUCZ2PrU XvcbIofZkz+RGe6M0bqmyHr9DA== X-Google-Smtp-Source: ABdhPJyutxBZatPDwyoL9fU6mFQTvRlpS2QgLdffNFRcXx7U6CX+THfaQG4e3ndPsohc/+uTVI5jUQ== X-Received: by 2002:a05:6402:5107:: with SMTP id m7mr17469805edd.52.1612796674336; Mon, 08 Feb 2021 07:04:34 -0800 (PST) Received: from jwang-Latitude-5491.fkb.profitbricks.net ([2001:16b8:4980:d900:bc0f:acd:c20a:c261]) by smtp.gmail.com with ESMTPSA id kb25sm4359106ejc.19.2021.02.08.07.04.33 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Mon, 08 Feb 2021 07:04:34 -0800 (PST) From: Jack Wang To: gregkh@linuxfoundation.org, sashal@kernel.org, axboe@kernel.dk, stable@vger.kernel.org Cc: Ming Lei , Christoph Hellwig , Hannes Reinecke , Mike Snitzer , Bart Van Assche Subject: [stable-4.19 Resend 7/7] block: don't release queue's sysfs lock during switching elevator Date: Mon, 8 Feb 2021 16:04:26 +0100 Message-Id: <20210208150426.62755-8-jinpu.wang@cloud.ionos.com> X-Mailer: git-send-email 2.25.1 In-Reply-To: <20210208150426.62755-1-jinpu.wang@cloud.ionos.com> References: <20210208150426.62755-1-jinpu.wang@cloud.ionos.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: stable@vger.kernel.org From: Ming Lei cecf5d87ff20 ("block: split .sysfs_lock into two locks") starts to release & acquire sysfs_lock before registering/un-registering elevator queue during switching elevator for avoiding potential deadlock from showing & storing 'queue/iosched' attributes and removing elevator's kobject. Turns out there isn't such deadlock because 'q->sysfs_lock' isn't required in .show & .store of queue/iosched's attributes, and just elevator's sysfs lock is acquired in elv_iosched_store() and elv_iosched_show(). So it is safe to hold queue's sysfs lock when registering/un-registering elevator queue. The biggest issue is that commit cecf5d87ff20 assumes that concurrent write on 'queue/scheduler' can't happen. However, this assumption isn't true, because kernfs_fop_write() only guarantees that concurrent write aren't called on the same open file, but the write could be from different open on the file. So we can't release & re-acquire queue's sysfs lock during switching elevator, otherwise use-after-free on elevator could be triggered. Fixes the issue by not releasing queue's sysfs lock during switching elevator. Fixes: cecf5d87ff20 ("block: split .sysfs_lock into two locks") Cc: Christoph Hellwig Cc: Hannes Reinecke Cc: Greg KH Cc: Mike Snitzer Reviewed-by: Bart Van Assche Signed-off-by: Ming Lei Signed-off-by: Jens Axboe (jwang:cherry picked from commit b89f625e28d44552083f43752f62d8621ded0a04 adjust ctx for 4.19) Signed-off-by: Jack Wang --- block/blk-sysfs.c | 3 ++- block/elevator.c | 31 +------------------------------ 2 files changed, 3 insertions(+), 31 deletions(-) diff --git a/block/blk-sysfs.c b/block/blk-sysfs.c index 899987152701..07494deb1a26 100644 --- a/block/blk-sysfs.c +++ b/block/blk-sysfs.c @@ -933,6 +933,7 @@ int blk_register_queue(struct gendisk *disk) blk_mq_debugfs_register(q); } + mutex_lock(&q->sysfs_lock); /* * The flag of QUEUE_FLAG_REGISTERED isn't set yet, so elevator * switch won't happen at all. @@ -940,6 +941,7 @@ int blk_register_queue(struct gendisk *disk) if (q->request_fn || (q->mq_ops && q->elevator)) { ret = elv_register_queue(q, false); if (ret) { + mutex_unlock(&q->sysfs_lock); mutex_unlock(&q->sysfs_dir_lock); kobject_del(&q->kobj); blk_trace_remove_sysfs(dev); @@ -949,7 +951,6 @@ int blk_register_queue(struct gendisk *disk) has_elevator = true; } - mutex_lock(&q->sysfs_lock); blk_queue_flag_set(QUEUE_FLAG_REGISTERED, q); wbt_enable_default(q); blk_throtl_register_queue(q); diff --git a/block/elevator.c b/block/elevator.c index 2ff0859e8b35..5b51bc5fad9f 100644 --- a/block/elevator.c +++ b/block/elevator.c @@ -856,11 +856,9 @@ int elv_register_queue(struct request_queue *q, bool uevent) if (uevent) kobject_uevent(&e->kobj, KOBJ_ADD); - mutex_lock(&q->sysfs_lock); e->registered = 1; if (!e->uses_mq && e->type->ops.sq.elevator_registered_fn) e->type->ops.sq.elevator_registered_fn(q); - mutex_unlock(&q->sysfs_lock); } return error; } @@ -878,11 +876,9 @@ void elv_unregister_queue(struct request_queue *q) kobject_uevent(&e->kobj, KOBJ_REMOVE); kobject_del(&e->kobj); - mutex_lock(&q->sysfs_lock); e->registered = 0; /* Re-enable throttling in case elevator disabled it */ wbt_enable_default(q); - mutex_unlock(&q->sysfs_lock); } } @@ -953,32 +949,11 @@ int elevator_switch_mq(struct request_queue *q, lockdep_assert_held(&q->sysfs_lock); if (q->elevator) { - if (q->elevator->registered) { - mutex_unlock(&q->sysfs_lock); - - /* - * Concurrent elevator switch can't happen becasue - * sysfs write is always exclusively on same file. - * - * Also the elevator queue won't be freed after - * sysfs_lock is released becasue kobject_del() in - * blk_unregister_queue() waits for completion of - * .store & .show on its attributes. - */ + if (q->elevator->registered) elv_unregister_queue(q); - mutex_lock(&q->sysfs_lock); - } ioc_clear_queue(q); elevator_exit(q, q->elevator); - - /* - * sysfs_lock may be dropped, so re-check if queue is - * unregistered. If yes, don't switch to new elevator - * any more - */ - if (!blk_queue_registered(q)) - return 0; } ret = blk_mq_init_sched(q, new_e); @@ -986,11 +961,7 @@ int elevator_switch_mq(struct request_queue *q, goto out; if (new_e) { - mutex_unlock(&q->sysfs_lock); - ret = elv_register_queue(q, true); - - mutex_lock(&q->sysfs_lock); if (ret) { elevator_exit(q, q->elevator); goto out;