From patchwork Wed Feb 3 13:20:15 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Jinpu Wang X-Patchwork-Id: 375963 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-18.8 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, HEADER_FROM_DIFFERENT_DOMAINS, INCLUDES_CR_TRAILER, INCLUDES_PATCH, MAILING_LIST_MULTI, SPF_HELO_NONE, SPF_PASS, URIBL_BLOCKED, USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 9667FC433E9 for ; Wed, 3 Feb 2021 13:21:21 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 4E7BD64E38 for ; Wed, 3 Feb 2021 13:21:21 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S231493AbhBCNVH (ORCPT ); Wed, 3 Feb 2021 08:21:07 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:40410 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S231488AbhBCNVF (ORCPT ); Wed, 3 Feb 2021 08:21:05 -0500 Received: from mail-wr1-x434.google.com (mail-wr1-x434.google.com [IPv6:2a00:1450:4864:20::434]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 78DA1C0613D6 for ; Wed, 3 Feb 2021 05:20:25 -0800 (PST) Received: by mail-wr1-x434.google.com with SMTP id c4so21539027wru.9 for ; Wed, 03 Feb 2021 05:20:25 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=cloud.ionos.com; s=google; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=Od4EK2yVIIkurXebLowbqftTipY7/kgNJZJogZrgQvQ=; b=Sjf5GlQNY1QkFhYOckMecV3FmqsRIYAFGfeVgRKIG/OoivLdo1rCyF07/AJmmohZY+ MKbHyrDktKZlxo0ycYetMdy4/Aj6zeaOv+NkJJj8XfWy0CUP4rNTHbVUouX/xeIQx1lk 9A0MScMEbgJUrsgGM5uNoRWxqdQJ/G7jAhIMdyd5LNQ9HcO8EQB0kmm6Sc4H370RXw7r tx5vMtuT2bgrP4aZTppQ5/xwHOp62yihD8gh/sHOaxfcuclinm7jJ6N37xn7tXWC81cn Cajx7Xx6q6i9q0ocFoN/mbxtu/jMYukCuz66EHwTHNzujfyG7NeyvyAZWZOQIS/DO+ie /5RQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=Od4EK2yVIIkurXebLowbqftTipY7/kgNJZJogZrgQvQ=; b=eWcx8n3MpRl8UqMSYKO+WnZN2I0LwCpItHMQfIFgHdBYNPPK0unYqkSMTNoa+FL8x9 LcgGH2sxZQ/lNZgPYwX+ROQaJlU8zCny5xGUgY3P030MOTT82MUfx6YR+hnQU8zbo2ew m8UtAfY+ZKS7CNtXMKam0+ctiBE4LZqTfBpMj/VrtULaky++Fs1EuKtlc0nnfwYzq8Gf vPNgMaWix2N1rwyZPGJkiTFCvzjAk1uX5DR3aYs/rU5j/2EJhO/bVTr6+QRngReYkvxz 9CwtgQsKNw+W+x7Oc7QZz9d6qVi0w1ktWyY39yXqkKXZ3BJ6OrX6FLr/J2UepxALO0sY G3uQ== X-Gm-Message-State: AOAM531nXvDjLIYqcA/5S6hR0TpWeqHpB2e/J2FpsX0iO7kCrrNF+TZ3 E1pFufHvivXX2R+lvpsB9CNZ0up6bMQr7Q== X-Google-Smtp-Source: ABdhPJxEv8Hb+n7xGjimcxlgQepVL4lqiOf2Hz6h/HZ988L7BcMKjam5+5nDlfdlarLuhAhjj/I2Jw== X-Received: by 2002:adf:dd0d:: with SMTP id a13mr3514589wrm.143.1612358424285; Wed, 03 Feb 2021 05:20:24 -0800 (PST) Received: from jwang-Latitude-5491.fritz.box ([2001:16b8:4947:3c00:7cf4:7472:391:9d0d]) by smtp.gmail.com with ESMTPSA id d10sm3738430wrn.88.2021.02.03.05.20.23 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 03 Feb 2021 05:20:23 -0800 (PST) From: Jack Wang To: gregkh@linuxfoundation.org, sashal@kernel.org, stable@vger.kernel.org Cc: zhengbin , Jens Axboe Subject: [stable-4.19 1/8] block: fix NULL pointer dereference in register_disk Date: Wed, 3 Feb 2021 14:20:15 +0100 Message-Id: <20210203132022.92406-2-jinpu.wang@cloud.ionos.com> X-Mailer: git-send-email 2.25.1 In-Reply-To: <20210203132022.92406-1-jinpu.wang@cloud.ionos.com> References: <20210203132022.92406-1-jinpu.wang@cloud.ionos.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: stable@vger.kernel.org From: zhengbin If __device_add_disk-->bdi_register_owner-->bdi_register--> bdi_register_va-->device_create_vargs fails, bdi->dev is still NULL, __device_add_disk-->register_disk will visit bdi->dev->kobj. This patch fixes that. Signed-off-by: zhengbin Signed-off-by: Jens Axboe (cherry picked from commit 4d7c1d3fd7c7eda7dea351f071945e843a46c145) Signed-off-by: Jack Wang --- block/genhd.c | 10 ++++++---- 1 file changed, 6 insertions(+), 4 deletions(-) diff --git a/block/genhd.c b/block/genhd.c index 2b536ab570ac..6965dde96373 100644 --- a/block/genhd.c +++ b/block/genhd.c @@ -652,10 +652,12 @@ static void register_disk(struct device *parent, struct gendisk *disk) kobject_uevent(&part_to_dev(part)->kobj, KOBJ_ADD); disk_part_iter_exit(&piter); - err = sysfs_create_link(&ddev->kobj, - &disk->queue->backing_dev_info->dev->kobj, - "bdi"); - WARN_ON(err); + if (disk->queue->backing_dev_info->dev) { + err = sysfs_create_link(&ddev->kobj, + &disk->queue->backing_dev_info->dev->kobj, + "bdi"); + WARN_ON(err); + } } /** From patchwork Wed Feb 3 13:20:16 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Jinpu Wang X-Patchwork-Id: 376871 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-18.8 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, HEADER_FROM_DIFFERENT_DOMAINS, INCLUDES_CR_TRAILER, INCLUDES_PATCH, MAILING_LIST_MULTI, SPF_HELO_NONE, SPF_PASS, URIBL_BLOCKED, USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id A0B48C43381 for ; Wed, 3 Feb 2021 13:21:21 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 76DCB64F4C for ; Wed, 3 Feb 2021 13:21:21 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S231488AbhBCNVI (ORCPT ); Wed, 3 Feb 2021 08:21:08 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:40414 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S231535AbhBCNVH (ORCPT ); Wed, 3 Feb 2021 08:21:07 -0500 Received: from mail-wr1-x42e.google.com (mail-wr1-x42e.google.com [IPv6:2a00:1450:4864:20::42e]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 96BDFC0613ED for ; Wed, 3 Feb 2021 05:20:26 -0800 (PST) Received: by mail-wr1-x42e.google.com with SMTP id c12so24266105wrc.7 for ; Wed, 03 Feb 2021 05:20:26 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=cloud.ionos.com; s=google; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=BfO/909mdUaRpv6ekeQ5pLoyeo6G7O5g5UYJLsEOL80=; b=gsVKCsbnjeaznli0xpdvVmzAqPGuh4XLY8ewqDtW9H0ByfqTqUzh94eZc1D8KKY1hq nX1zLzSXvYGlO1euxs91gXIFAyrpG82Ng+uSgqKKYeAMyrb65lk/IhHq4fMt2N5BOUiH La9YTIjrJEUf0dG0qVXKD2SzSdJ+h+PYZhlciS0a0yhc27Vko3fmOY3HFHrfoa90khPG qgvyraEXz7tJwlOD2sDOTQHl9VldLPQkNWwE1wbaiDLNRAsHFWBN6xKSnek5OZDtew/N b+QlcNTvcAU1sGbwt4jx5aovfvSMQ54yUgTAGO6dbbkEHlxFilUcFEU60Jns5irp2IMB DxwQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=BfO/909mdUaRpv6ekeQ5pLoyeo6G7O5g5UYJLsEOL80=; b=T/l+rzA7CA4sP1dU9KPPGwnB828kRQAJetYYwNk32kcO1Sq/aYwD5CRN+tiQ24iAKg tCJhkHOdEbOoujHZFtc1usYihzTkRhen4QxwjNVlWVfTexvq9Sx7T24/z5SK13+k0FE/ FsAg/QwVd8zJ5jQVDjDHGDzpIjsOr4IT4YToeLPx7MT5pNfUTnTO18ltHjaqJW7iipbu Rpwlst3/TAFS3SFWnaiI7IUNeqCe6CTwi+yT1E/BXlFbbkcy52JP9r1XeemBuoVzfWSG wb6D9yy9SZKLwj7Te4CDOIWPQrT+qoNFOcytn0y6FA7biecUXRlJKX4x3INy9AGuTxuZ Sqqw== X-Gm-Message-State: AOAM5303WZpn6nZ6JAI71h/XwGYK7dB/1RKR0ixRuTZOPW5XaRqGevwZ I4Fn6bhMN7o0IUjEvZ5GDmdYYQ== X-Google-Smtp-Source: ABdhPJy0UOU3brOw2FrfNueaOntHlnbvuoV2ZynvOsSmdEWrTFHUMK2jqRW0vTVx8x4J4DzY39nPuQ== X-Received: by 2002:adf:c64f:: with SMTP id u15mr3391897wrg.270.1612358425343; Wed, 03 Feb 2021 05:20:25 -0800 (PST) Received: from jwang-Latitude-5491.fritz.box ([2001:16b8:4947:3c00:7cf4:7472:391:9d0d]) by smtp.gmail.com with ESMTPSA id d10sm3738430wrn.88.2021.02.03.05.20.24 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 03 Feb 2021 05:20:24 -0800 (PST) From: Jack Wang To: gregkh@linuxfoundation.org, sashal@kernel.org, stable@vger.kernel.org Cc: Ming Lei , Christoph Hellwig , Hannes Reinecke , Mike Snitzer , Bart Van Assche , Damien Le Moal , Jens Axboe Subject: [stable-4.19 2/8] block: don't hold q->sysfs_lock in elevator_init_mq Date: Wed, 3 Feb 2021 14:20:16 +0100 Message-Id: <20210203132022.92406-3-jinpu.wang@cloud.ionos.com> X-Mailer: git-send-email 2.25.1 In-Reply-To: <20210203132022.92406-1-jinpu.wang@cloud.ionos.com> References: <20210203132022.92406-1-jinpu.wang@cloud.ionos.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: stable@vger.kernel.org From: Ming Lei The original comment says: q->sysfs_lock must be held to provide mutual exclusion between elevator_switch() and here. Which is simply wrong. elevator_init_mq() is only called from blk_mq_init_allocated_queue, which is always called before the request queue is registered via blk_register_queue(), for dm-rq or normal rq based driver. However, queue's kobject is only exposed and added to sysfs in blk_register_queue(). So there isn't such race between elevator_switch() and elevator_init_mq(). So avoid to hold q->sysfs_lock in elevator_init_mq(). Cc: Christoph Hellwig Cc: Hannes Reinecke Cc: Greg KH Cc: Mike Snitzer Cc: Bart Van Assche Cc: Damien Le Moal Reviewed-by: Bart Van Assche Signed-off-by: Ming Lei Signed-off-by: Jens Axboe (cherry picked from commit c48dac137a62a5d6fa1ef3fa445cbd9c43655a76) Signed-off-by: Jack Wang --- block/elevator.c | 14 +++++--------- 1 file changed, 5 insertions(+), 9 deletions(-) diff --git a/block/elevator.c b/block/elevator.c index fae58b2f906f..ddbcd36616a8 100644 --- a/block/elevator.c +++ b/block/elevator.c @@ -980,23 +980,19 @@ int elevator_init_mq(struct request_queue *q) if (q->nr_hw_queues != 1) return 0; - /* - * q->sysfs_lock must be held to provide mutual exclusion between - * elevator_switch() and here. - */ - mutex_lock(&q->sysfs_lock); + WARN_ON_ONCE(test_bit(QUEUE_FLAG_REGISTERED, &q->queue_flags)); + if (unlikely(q->elevator)) - goto out_unlock; + goto out; e = elevator_get(q, "mq-deadline", false); if (!e) - goto out_unlock; + goto out; err = blk_mq_init_sched(q, e); if (err) elevator_put(e); -out_unlock: - mutex_unlock(&q->sysfs_lock); +out: return err; } From patchwork Wed Feb 3 13:20:17 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Jinpu Wang X-Patchwork-Id: 376870 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-18.7 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, HEADER_FROM_DIFFERENT_DOMAINS, INCLUDES_CR_TRAILER, INCLUDES_PATCH, MAILING_LIST_MULTI, SPF_HELO_NONE, SPF_PASS, URIBL_BLOCKED, URIBL_RED, USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id EF1E0C4332D for ; Wed, 3 Feb 2021 13:21:21 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id A14BD64DC4 for ; Wed, 3 Feb 2021 13:21:21 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S231543AbhBCNVJ (ORCPT ); Wed, 3 Feb 2021 08:21:09 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:40422 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S231479AbhBCNVI (ORCPT ); Wed, 3 Feb 2021 08:21:08 -0500 Received: from mail-wr1-x42c.google.com (mail-wr1-x42c.google.com [IPv6:2a00:1450:4864:20::42c]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 89864C061786 for ; Wed, 3 Feb 2021 05:20:27 -0800 (PST) Received: by mail-wr1-x42c.google.com with SMTP id l12so24268232wry.2 for ; Wed, 03 Feb 2021 05:20:27 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=cloud.ionos.com; s=google; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=ZUngNzKrNvH0ej3h0vECgxJ0aSlmB5p4rWMKzMxwWgM=; b=DHW8VtW+YB8CYLABpPzQqAjnPuGzoxuN0vng1Nyc1kp3NRi5ZS3PSHWkI5AF3QHCbj 924IBiJJoNRrGA21/iO18HB0Nwtf8y1osNo4obrdJ24QaOsEOLwmQc/vcFjHAYJ/cs1S LDVyaIKC3qwtY7TprmcnIfe3zSLwI/mPu1VSjdBYqp9Cz5def0HttNEy7HBnLizZNBax YuSStfbv6ewb7KO4w/Em16OO31N8RP97i8o9sM76hoU6BmxV4gRV9Yg4aT9WvJSDZUhF +qtlFW2cvA41xRYiCwBHa/pdXBkwVhuaKXA3AQZyfhU5h5epDPjDd9afukcYumwBqBLS OOAQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=ZUngNzKrNvH0ej3h0vECgxJ0aSlmB5p4rWMKzMxwWgM=; b=qckXkBXok3RdSpN6ACPp8A3bHQUEr+wTuLH9XOR1StiG5DcZ9oRZzOlPxI0hnv0xHc XLoKpepQ7V0Ha90AZ/VzDYhq4XAFXJca/PsZ69T8HMyMH9ECstM85gAYX8fIjYhScRs0 E0mqWxPtY5aW5XLiOkjI3P/Cs8Gsug/tzC+Kw8jqHp7YVv26XxN5awkRPBMlfwXxPylJ sT/rC6b+fv+cndoMmHSs7MPrxJe2i9Rd9TRTeqE3s//W8St8cFDDUE21XoR1T87M/J/z dphYfvjGBKHsp54+2/QJ5u5wGI3u+QkFfNr5gA+cv+yXLdppompiF+6PhIVhjmskayEV K+XA== X-Gm-Message-State: AOAM532+QdFAoLaPI0mPMzMfpd0ZHGoJrpS4ym+UMuhIqPpUBeSBz3UG HfqcGqT6y3fISR1jkP4a9AaZog== X-Google-Smtp-Source: ABdhPJyI2oCM+pH2OzKhCAVDEHx2xqUF0IIt/Nx30XHfl7WY2U+Edy7239IrGOqCzEnmy29oNGb4fQ== X-Received: by 2002:a5d:4a09:: with SMTP id m9mr3561871wrq.122.1612358426361; Wed, 03 Feb 2021 05:20:26 -0800 (PST) Received: from jwang-Latitude-5491.fritz.box ([2001:16b8:4947:3c00:7cf4:7472:391:9d0d]) by smtp.gmail.com with ESMTPSA id d10sm3738430wrn.88.2021.02.03.05.20.25 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 03 Feb 2021 05:20:25 -0800 (PST) From: Jack Wang To: gregkh@linuxfoundation.org, sashal@kernel.org, stable@vger.kernel.org Cc: Ming Lei , Christoph Hellwig , Hannes Reinecke , Mike Snitzer , Bart Van Assche , Jens Axboe Subject: [stable-4.19 3/8] blk-mq: don't hold q->sysfs_lock in blk_mq_map_swqueue Date: Wed, 3 Feb 2021 14:20:17 +0100 Message-Id: <20210203132022.92406-4-jinpu.wang@cloud.ionos.com> X-Mailer: git-send-email 2.25.1 In-Reply-To: <20210203132022.92406-1-jinpu.wang@cloud.ionos.com> References: <20210203132022.92406-1-jinpu.wang@cloud.ionos.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: stable@vger.kernel.org From: Ming Lei blk_mq_map_swqueue() is called from blk_mq_init_allocated_queue() and blk_mq_update_nr_hw_queues(). For the former caller, the kobject isn't exposed to userspace yet. For the latter caller, hctx sysfs entries and debugfs are un-registered before updating nr_hw_queues. On the other hand, commit 2f8f1336a48b ("blk-mq: always free hctx after request queue is freed") moves freeing hctx into queue's release handler, so there won't be race with queue release path too. So don't hold q->sysfs_lock in blk_mq_map_swqueue(). Cc: Christoph Hellwig Cc: Hannes Reinecke Cc: Greg KH Cc: Mike Snitzer Cc: Bart Van Assche Reviewed-by: Bart Van Assche Signed-off-by: Ming Lei Signed-off-by: Jens Axboe (cherry picked from commit c6ba933358f0d7a6a042b894dba20cc70396a6d3) Signed-off-by: Jack Wang --- block/blk-mq.c | 7 ------- 1 file changed, 7 deletions(-) diff --git a/block/blk-mq.c b/block/blk-mq.c index 0df43515ff94..195526b93895 100644 --- a/block/blk-mq.c +++ b/block/blk-mq.c @@ -2324,11 +2324,6 @@ static void blk_mq_map_swqueue(struct request_queue *q) struct blk_mq_ctx *ctx; struct blk_mq_tag_set *set = q->tag_set; - /* - * Avoid others reading imcomplete hctx->cpumask through sysfs - */ - mutex_lock(&q->sysfs_lock); - queue_for_each_hw_ctx(q, hctx, i) { cpumask_clear(hctx->cpumask); hctx->nr_ctx = 0; @@ -2362,8 +2357,6 @@ static void blk_mq_map_swqueue(struct request_queue *q) hctx->ctxs[hctx->nr_ctx++] = ctx; } - mutex_unlock(&q->sysfs_lock); - queue_for_each_hw_ctx(q, hctx, i) { /* * If no software queues are mapped to this hardware queue, From patchwork Wed Feb 3 13:20:18 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Jinpu Wang X-Patchwork-Id: 375962 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-18.7 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, HEADER_FROM_DIFFERENT_DOMAINS, INCLUDES_CR_TRAILER, INCLUDES_PATCH, MAILING_LIST_MULTI, SPF_HELO_NONE, SPF_PASS, URIBL_BLOCKED, URIBL_RED, USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 300ECC43331 for ; Wed, 3 Feb 2021 13:21:22 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 036F164F55 for ; Wed, 3 Feb 2021 13:21:21 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S231629AbhBCNVT (ORCPT ); Wed, 3 Feb 2021 08:21:19 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:40424 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S231551AbhBCNVO (ORCPT ); Wed, 3 Feb 2021 08:21:14 -0500 Received: from mail-wr1-x436.google.com (mail-wr1-x436.google.com [IPv6:2a00:1450:4864:20::436]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id AC554C061788 for ; Wed, 3 Feb 2021 05:20:28 -0800 (PST) Received: by mail-wr1-x436.google.com with SMTP id p15so24192591wrq.8 for ; Wed, 03 Feb 2021 05:20:28 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=cloud.ionos.com; s=google; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=gw7klKD2RMcDufoZq/xD79IYKU/w+UW1t1ZRskK2cSw=; b=Xo3T/6OzcmD0Mn8NxHLMT1cZZKeZDgrb4AfPb8tvGfYHvD4k5ZYQgRqKRctBUNHrnN sHcuF56MfnhOLXOJZJNTTPkTlXrD/Jo1/1fqFJ7/23f0OMOP+omKLmI+zEpbznQ8cPjH MZgqh0MBwxqLukvyS5+HpOrP26E7y2fplD8/1yQqIplpJ+09ZXsG/iT/v6hZ0We8UowC JTDGRFjBF4tMvoKdgZk+k92Wdr5V4OlYXkvTdAzkSYfeyNtTH/eKrYYfBH92LWwWGLsz 5OX4NL1msrXtTjEZ38Y6mx1qQzwSvXygL+joHmUTTVE9fAipqFGK3WKoegxuWJfcwS/c BmMQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=gw7klKD2RMcDufoZq/xD79IYKU/w+UW1t1ZRskK2cSw=; b=rOjJggQLSBSuarV8lESijjwR400G69rwG0GMTQFO9CAqPVnNke71dARjzaFVragM/s XbNsdW95swtWWiaMqQeSpu/3Ccb0j4G8QP3foaAiVJoOrm6ooQY0qT/3apWRRw77b4DB dDUAjWRy3bTclykSPbRm5gKvAg+3BTLr2iXg0CQ3tEhfy8wFfV4BU6OaFDOcI5D9eM5+ g2AqkctsAPMTa7J4BZ+g+DuyvrOWXLhr6pNaezLKW+aR4hoXiNu3iE8GW/29HWswBf80 xoTpcf2s7PjlCSl15ErRD2X7vrsjGjbbP7bzbA14T92kVMWdsTDsk6UG9SvPJpUoBkUY DYGA== X-Gm-Message-State: AOAM532WF5HHfrFdshcu3n8AdutBbzntGQjuSmG9s0NoxZxJ6ZlYJ4pU lZIFMrqNqDzlwXmRsPQhJpEX6g== X-Google-Smtp-Source: ABdhPJxuC+OxHUcITZry6CqRC0oVUyuilBYyozCM9c6bPp70yMMFDqQM9Muz+KO5/1Oi9ZUtOqkoCA== X-Received: by 2002:adf:e3c2:: with SMTP id k2mr3488677wrm.337.1612358427445; Wed, 03 Feb 2021 05:20:27 -0800 (PST) Received: from jwang-Latitude-5491.fritz.box ([2001:16b8:4947:3c00:7cf4:7472:391:9d0d]) by smtp.gmail.com with ESMTPSA id d10sm3738430wrn.88.2021.02.03.05.20.26 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 03 Feb 2021 05:20:27 -0800 (PST) From: Jack Wang To: gregkh@linuxfoundation.org, sashal@kernel.org, stable@vger.kernel.org Cc: Ming Lei , Christoph Hellwig , Hannes Reinecke , Mike Snitzer , Bart Van Assche , Jens Axboe Subject: [stable-4.19 4/8] block: add helper for checking if queue is registered Date: Wed, 3 Feb 2021 14:20:18 +0100 Message-Id: <20210203132022.92406-5-jinpu.wang@cloud.ionos.com> X-Mailer: git-send-email 2.25.1 In-Reply-To: <20210203132022.92406-1-jinpu.wang@cloud.ionos.com> References: <20210203132022.92406-1-jinpu.wang@cloud.ionos.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: stable@vger.kernel.org From: Ming Lei There are 4 users which check if queue is registered, so add one helper to check it. Cc: Christoph Hellwig Cc: Hannes Reinecke Cc: Greg KH Cc: Mike Snitzer Cc: Bart Van Assche Reviewed-by: Bart Van Assche Signed-off-by: Ming Lei Signed-off-by: Jens Axboe (cherry picked from commit 58c898ba370e68d39470cd0d932b524682c1f9be) Signed-off-by: Jack Wang --- block/blk-sysfs.c | 4 ++-- block/blk-wbt.c | 2 +- block/elevator.c | 2 +- include/linux/blkdev.h | 1 + 4 files changed, 5 insertions(+), 4 deletions(-) diff --git a/block/blk-sysfs.c b/block/blk-sysfs.c index 8286640d4d66..0a7636d24563 100644 --- a/block/blk-sysfs.c +++ b/block/blk-sysfs.c @@ -896,7 +896,7 @@ int blk_register_queue(struct gendisk *disk) if (WARN_ON(!q)) return -ENXIO; - WARN_ONCE(test_bit(QUEUE_FLAG_REGISTERED, &q->queue_flags), + WARN_ONCE(blk_queue_registered(q), "%s is registering an already registered queue\n", kobject_name(&dev->kobj)); queue_flag_set_unlocked(QUEUE_FLAG_REGISTERED, q); @@ -973,7 +973,7 @@ void blk_unregister_queue(struct gendisk *disk) return; /* Return early if disk->queue was never registered. */ - if (!test_bit(QUEUE_FLAG_REGISTERED, &q->queue_flags)) + if (!blk_queue_registered(q)) return; /* diff --git a/block/blk-wbt.c b/block/blk-wbt.c index f1de8ba483a9..50f2abfa1a60 100644 --- a/block/blk-wbt.c +++ b/block/blk-wbt.c @@ -708,7 +708,7 @@ void wbt_enable_default(struct request_queue *q) return; /* Queue not registered? Maybe shutting down... */ - if (!test_bit(QUEUE_FLAG_REGISTERED, &q->queue_flags)) + if (!blk_queue_registered(q)) return; if ((q->mq_ops && IS_ENABLED(CONFIG_BLK_WBT_MQ)) || diff --git a/block/elevator.c b/block/elevator.c index ddbcd36616a8..9bffe4558929 100644 --- a/block/elevator.c +++ b/block/elevator.c @@ -1083,7 +1083,7 @@ static int __elevator_change(struct request_queue *q, const char *name) struct elevator_type *e; /* Make sure queue is not in the middle of being removed */ - if (!test_bit(QUEUE_FLAG_REGISTERED, &q->queue_flags)) + if (!blk_queue_registered(q)) return -ENOENT; /* diff --git a/include/linux/blkdev.h b/include/linux/blkdev.h index 745b2d0dcf78..3a2b34c2c82b 100644 --- a/include/linux/blkdev.h +++ b/include/linux/blkdev.h @@ -743,6 +743,7 @@ bool blk_queue_flag_test_and_clear(unsigned int flag, struct request_queue *q); #define blk_queue_quiesced(q) test_bit(QUEUE_FLAG_QUIESCED, &(q)->queue_flags) #define blk_queue_pm_only(q) atomic_read(&(q)->pm_only) #define blk_queue_fua(q) test_bit(QUEUE_FLAG_FUA, &(q)->queue_flags) +#define blk_queue_registered(q) test_bit(QUEUE_FLAG_REGISTERED, &(q)->queue_flags) extern void blk_set_pm_only(struct request_queue *q); extern void blk_clear_pm_only(struct request_queue *q); From patchwork Wed Feb 3 13:20:19 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Jinpu Wang X-Patchwork-Id: 375961 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-18.8 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, HEADER_FROM_DIFFERENT_DOMAINS, INCLUDES_CR_TRAILER, INCLUDES_PATCH, MAILING_LIST_MULTI, SPF_HELO_NONE, SPF_PASS, URIBL_BLOCKED, USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 6C755C433E0 for ; Wed, 3 Feb 2021 13:21:51 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 111AF64F55 for ; Wed, 3 Feb 2021 13:21:51 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S230319AbhBCNVs (ORCPT ); Wed, 3 Feb 2021 08:21:48 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:40558 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S229774AbhBCNVq (ORCPT ); Wed, 3 Feb 2021 08:21:46 -0500 Received: from mail-wm1-x32e.google.com (mail-wm1-x32e.google.com [IPv6:2a00:1450:4864:20::32e]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id D46B6C06178A for ; Wed, 3 Feb 2021 05:20:29 -0800 (PST) Received: by mail-wm1-x32e.google.com with SMTP id m1so5210376wml.2 for ; Wed, 03 Feb 2021 05:20:29 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=cloud.ionos.com; s=google; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=btoceab6WAOABNrL1783NZ55mEzW4Zi4OKe5C+Dv/o0=; b=ZD+O/Kr25sNbmDLwe+A6aC9qIDBFhASh7W17pI+dCoLs8tD2qdIEwVO9Y3kHBSCTHS vajQyxJ7+bbdgQI9bBRjDyKlM2J0jYq9Kglt5sd9aSCYoyaMUxSW1aephs+Ww/vl9Ovi oghu798TMb6YdS0IDGsJpFl84z1WXhgWz9sJDbESdN3W1FRYbEzti/wGrswBxt/AZ4SN P8KdtX/MtfD3dOrujqxCIQO6HBLo8PUjYi+ivdOJ7QURB8UYp6U51Ljq+N+ZOKBJSh3K ejdd1jVLmuZkEO2H/ORdnCJml/13YxbthCD8jdzz8uhow0FYgw8yqhDi1I542/VEMMUV qNEQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=btoceab6WAOABNrL1783NZ55mEzW4Zi4OKe5C+Dv/o0=; b=GEl12BTq8WpyyZwFoHXMe3Qrm5lAYJyTM3GqpO/bNVC1gS2H3NxAcakYUsl8ZMEiur AOP3tN0jz3S4yVJFIIi/TgSwtLAnYTeOXqzxpy98gWHVs9LZQNV/twwDrIwuvRIQqWZr NlhazZg+UBBtJXUuT0v3eh4LyudYuNfUKFUme9lnQkvsyI/qnUPi/14GId0wrKVn5Gux h4MXBV2bqI2Shx0OaXv7J8HBgkPfcQSvVnnOo3lBw5Bkagjo3+EfCNAJzjzn1NoMyCYp M+9bZMYuK0eY9n08Yc2py2z+OZK4EhcpXiE9ATJw5tnXc0WNtL8LvcdKsBjWQvJMhjSV uOCQ== X-Gm-Message-State: AOAM533QoOzkRFQP7JVrQqFPpWu27wl5Bx7qSVX7RZD55YwhcInEJeBl O4hQT3iSU8ilp6BRn/+SblykyQ== X-Google-Smtp-Source: ABdhPJw8fKrR2EHttWaHVKOwx9L1BN0cvoi6CEvWWcjvAKbBdZkx3cXkar3N8omfaEefvwaWpFAtOA== X-Received: by 2002:a7b:cd8e:: with SMTP id y14mr2877635wmj.61.1612358428462; Wed, 03 Feb 2021 05:20:28 -0800 (PST) Received: from jwang-Latitude-5491.fritz.box ([2001:16b8:4947:3c00:7cf4:7472:391:9d0d]) by smtp.gmail.com with ESMTPSA id d10sm3738430wrn.88.2021.02.03.05.20.27 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 03 Feb 2021 05:20:28 -0800 (PST) From: Jack Wang To: gregkh@linuxfoundation.org, sashal@kernel.org, stable@vger.kernel.org Cc: Ming Lei , Christoph Hellwig , Hannes Reinecke , Mike Snitzer , Bart Van Assche , Jens Axboe Subject: [stable-4.19 5/8] block: split .sysfs_lock into two locks Date: Wed, 3 Feb 2021 14:20:19 +0100 Message-Id: <20210203132022.92406-6-jinpu.wang@cloud.ionos.com> X-Mailer: git-send-email 2.25.1 In-Reply-To: <20210203132022.92406-1-jinpu.wang@cloud.ionos.com> References: <20210203132022.92406-1-jinpu.wang@cloud.ionos.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: stable@vger.kernel.org From: Ming Lei The kernfs built-in lock of 'kn->count' is held in sysfs .show/.store path. Meantime, inside block's .show/.store callback, q->sysfs_lock is required. However, when mq & iosched kobjects are removed via blk_mq_unregister_dev() & elv_unregister_queue(), q->sysfs_lock is held too. This way causes AB-BA lock because the kernfs built-in lock of 'kn-count' is required inside kobject_del() too, see the lockdep warning[1]. On the other hand, it isn't necessary to acquire q->sysfs_lock for both blk_mq_unregister_dev() & elv_unregister_queue() because clearing REGISTERED flag prevents storing to 'queue/scheduler' from being happened. Also sysfs write(store) is exclusive, so no necessary to hold the lock for elv_unregister_queue() when it is called in switching elevator path. So split .sysfs_lock into two: one is still named as .sysfs_lock for covering sync .store, the other one is named as .sysfs_dir_lock for covering kobjects and related status change. sysfs itself can handle the race between add/remove kobjects and showing/storing attributes under kobjects. For switching scheduler via storing to 'queue/scheduler', we use the queue flag of QUEUE_FLAG_REGISTERED with .sysfs_lock for avoiding the race, then we can avoid to hold .sysfs_lock during removing/adding kobjects. [1] lockdep warning ====================================================== WARNING: possible circular locking dependency detected 5.3.0-rc3-00044-g73277fc75ea0 #1380 Not tainted ------------------------------------------------------ rmmod/777 is trying to acquire lock: 00000000ac50e981 (kn->count#202){++++}, at: kernfs_remove_by_name_ns+0x59/0x72 but task is already holding lock: 00000000fb16ae21 (&q->sysfs_lock){+.+.}, at: blk_unregister_queue+0x78/0x10b which lock already depends on the new lock. the existing dependency chain (in reverse order) is: -> #1 (&q->sysfs_lock){+.+.}: __lock_acquire+0x95f/0xa2f lock_acquire+0x1b4/0x1e8 __mutex_lock+0x14a/0xa9b blk_mq_hw_sysfs_show+0x63/0xb6 sysfs_kf_seq_show+0x11f/0x196 seq_read+0x2cd/0x5f2 vfs_read+0xc7/0x18c ksys_read+0xc4/0x13e do_syscall_64+0xa7/0x295 entry_SYSCALL_64_after_hwframe+0x49/0xbe -> #0 (kn->count#202){++++}: check_prev_add+0x5d2/0xc45 validate_chain+0xed3/0xf94 __lock_acquire+0x95f/0xa2f lock_acquire+0x1b4/0x1e8 __kernfs_remove+0x237/0x40b kernfs_remove_by_name_ns+0x59/0x72 remove_files+0x61/0x96 sysfs_remove_group+0x81/0xa4 sysfs_remove_groups+0x3b/0x44 kobject_del+0x44/0x94 blk_mq_unregister_dev+0x83/0xdd blk_unregister_queue+0xa0/0x10b del_gendisk+0x259/0x3fa null_del_dev+0x8b/0x1c3 [null_blk] null_exit+0x5c/0x95 [null_blk] __se_sys_delete_module+0x204/0x337 do_syscall_64+0xa7/0x295 entry_SYSCALL_64_after_hwframe+0x49/0xbe other info that might help us debug this: Possible unsafe locking scenario: CPU0 CPU1 ---- ---- lock(&q->sysfs_lock); lock(kn->count#202); lock(&q->sysfs_lock); lock(kn->count#202); *** DEADLOCK *** 2 locks held by rmmod/777: #0: 00000000e69bd9de (&lock){+.+.}, at: null_exit+0x2e/0x95 [null_blk] #1: 00000000fb16ae21 (&q->sysfs_lock){+.+.}, at: blk_unregister_queue+0x78/0x10b stack backtrace: CPU: 0 PID: 777 Comm: rmmod Not tainted 5.3.0-rc3-00044-g73277fc75ea0 #1380 Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS ?-20180724_192412-buildhw-07.phx4 Call Trace: dump_stack+0x9a/0xe6 check_noncircular+0x207/0x251 ? print_circular_bug+0x32a/0x32a ? find_usage_backwards+0x84/0xb0 check_prev_add+0x5d2/0xc45 validate_chain+0xed3/0xf94 ? check_prev_add+0xc45/0xc45 ? mark_lock+0x11b/0x804 ? check_usage_forwards+0x1ca/0x1ca __lock_acquire+0x95f/0xa2f lock_acquire+0x1b4/0x1e8 ? kernfs_remove_by_name_ns+0x59/0x72 __kernfs_remove+0x237/0x40b ? kernfs_remove_by_name_ns+0x59/0x72 ? kernfs_next_descendant_post+0x7d/0x7d ? strlen+0x10/0x23 ? strcmp+0x22/0x44 kernfs_remove_by_name_ns+0x59/0x72 remove_files+0x61/0x96 sysfs_remove_group+0x81/0xa4 sysfs_remove_groups+0x3b/0x44 kobject_del+0x44/0x94 blk_mq_unregister_dev+0x83/0xdd blk_unregister_queue+0xa0/0x10b del_gendisk+0x259/0x3fa ? disk_events_poll_msecs_store+0x12b/0x12b ? check_flags+0x1ea/0x204 ? mark_held_locks+0x1f/0x7a null_del_dev+0x8b/0x1c3 [null_blk] null_exit+0x5c/0x95 [null_blk] __se_sys_delete_module+0x204/0x337 ? free_module+0x39f/0x39f ? blkcg_maybe_throttle_current+0x8a/0x718 ? rwlock_bug+0x62/0x62 ? __blkcg_punt_bio_submit+0xd0/0xd0 ? trace_hardirqs_on_thunk+0x1a/0x20 ? mark_held_locks+0x1f/0x7a ? do_syscall_64+0x4c/0x295 do_syscall_64+0xa7/0x295 entry_SYSCALL_64_after_hwframe+0x49/0xbe RIP: 0033:0x7fb696cdbe6b Code: 73 01 c3 48 8b 0d 1d 20 0c 00 f7 d8 64 89 01 48 83 c8 ff c3 66 2e 0f 1f 84 00 00 008 RSP: 002b:00007ffec9588788 EFLAGS: 00000206 ORIG_RAX: 00000000000000b0 RAX: ffffffffffffffda RBX: 0000559e589137c0 RCX: 00007fb696cdbe6b RDX: 000000000000000a RSI: 0000000000000800 RDI: 0000559e58913828 RBP: 0000000000000000 R08: 00007ffec9587701 R09: 0000000000000000 R10: 00007fb696d4eae0 R11: 0000000000000206 R12: 00007ffec95889b0 R13: 00007ffec95896b3 R14: 0000559e58913260 R15: 0000559e589137c0 Cc: Christoph Hellwig Cc: Hannes Reinecke Cc: Greg KH Cc: Mike Snitzer Reviewed-by: Bart Van Assche Signed-off-by: Ming Lei Signed-off-by: Jens Axboe (jwang:cherry picked from commit cecf5d87ff2035127bb5a9ee054d0023a4a7cad3, adjust ctx for 4,19) Signed-off-by: Jack Wang --- block/blk-core.c | 1 + block/blk-mq-sysfs.c | 12 ++++----- block/blk-sysfs.c | 44 +++++++++++++++++++------------ block/blk.h | 2 +- block/elevator.c | 59 +++++++++++++++++++++++++++++++++++------- include/linux/blkdev.h | 1 + 6 files changed, 86 insertions(+), 33 deletions(-) diff --git a/block/blk-core.c b/block/blk-core.c index ce3710404544..80f3e729fdd4 100644 --- a/block/blk-core.c +++ b/block/blk-core.c @@ -1059,6 +1059,7 @@ struct request_queue *blk_alloc_queue_node(gfp_t gfp_mask, int node_id, mutex_init(&q->blk_trace_mutex); #endif mutex_init(&q->sysfs_lock); + mutex_init(&q->sysfs_dir_lock); spin_lock_init(&q->__queue_lock); if (!q->mq_ops) diff --git a/block/blk-mq-sysfs.c b/block/blk-mq-sysfs.c index 5006a0d00990..5e4b7ed1e897 100644 --- a/block/blk-mq-sysfs.c +++ b/block/blk-mq-sysfs.c @@ -264,7 +264,7 @@ void blk_mq_unregister_dev(struct device *dev, struct request_queue *q) struct blk_mq_hw_ctx *hctx; int i; - lockdep_assert_held(&q->sysfs_lock); + lockdep_assert_held(&q->sysfs_dir_lock); queue_for_each_hw_ctx(q, hctx, i) blk_mq_unregister_hctx(hctx); @@ -312,7 +312,7 @@ int __blk_mq_register_dev(struct device *dev, struct request_queue *q) int ret, i; WARN_ON_ONCE(!q->kobj.parent); - lockdep_assert_held(&q->sysfs_lock); + lockdep_assert_held(&q->sysfs_dir_lock); ret = kobject_add(&q->mq_kobj, kobject_get(&dev->kobj), "%s", "mq"); if (ret < 0) @@ -358,7 +358,7 @@ void blk_mq_sysfs_unregister(struct request_queue *q) struct blk_mq_hw_ctx *hctx; int i; - mutex_lock(&q->sysfs_lock); + mutex_lock(&q->sysfs_dir_lock); if (!q->mq_sysfs_init_done) goto unlock; @@ -366,7 +366,7 @@ void blk_mq_sysfs_unregister(struct request_queue *q) blk_mq_unregister_hctx(hctx); unlock: - mutex_unlock(&q->sysfs_lock); + mutex_unlock(&q->sysfs_dir_lock); } int blk_mq_sysfs_register(struct request_queue *q) @@ -374,7 +374,7 @@ int blk_mq_sysfs_register(struct request_queue *q) struct blk_mq_hw_ctx *hctx; int i, ret = 0; - mutex_lock(&q->sysfs_lock); + mutex_lock(&q->sysfs_dir_lock); if (!q->mq_sysfs_init_done) goto unlock; @@ -385,7 +385,7 @@ int blk_mq_sysfs_register(struct request_queue *q) } unlock: - mutex_unlock(&q->sysfs_lock); + mutex_unlock(&q->sysfs_dir_lock); return ret; } diff --git a/block/blk-sysfs.c b/block/blk-sysfs.c index 0a7636d24563..b2208b69f04a 100644 --- a/block/blk-sysfs.c +++ b/block/blk-sysfs.c @@ -892,6 +892,7 @@ int blk_register_queue(struct gendisk *disk) int ret; struct device *dev = disk_to_dev(disk); struct request_queue *q = disk->queue; + bool has_elevator = false; if (WARN_ON(!q)) return -ENXIO; @@ -899,7 +900,6 @@ int blk_register_queue(struct gendisk *disk) WARN_ONCE(blk_queue_registered(q), "%s is registering an already registered queue\n", kobject_name(&dev->kobj)); - queue_flag_set_unlocked(QUEUE_FLAG_REGISTERED, q); /* * SCSI probing may synchronously create and destroy a lot of @@ -920,8 +920,7 @@ int blk_register_queue(struct gendisk *disk) if (ret) return ret; - /* Prevent changes through sysfs until registration is completed. */ - mutex_lock(&q->sysfs_lock); + mutex_lock(&q->sysfs_dir_lock); ret = kobject_add(&q->kobj, kobject_get(&dev->kobj), "%s", "queue"); if (ret < 0) { @@ -934,26 +933,36 @@ int blk_register_queue(struct gendisk *disk) blk_mq_debugfs_register(q); } - kobject_uevent(&q->kobj, KOBJ_ADD); - - wbt_enable_default(q); - - blk_throtl_register_queue(q); - + /* + * The flag of QUEUE_FLAG_REGISTERED isn't set yet, so elevator + * switch won't happen at all. + */ if (q->request_fn || (q->mq_ops && q->elevator)) { - ret = elv_register_queue(q); + ret = elv_register_queue(q, false); if (ret) { - mutex_unlock(&q->sysfs_lock); - kobject_uevent(&q->kobj, KOBJ_REMOVE); + mutex_unlock(&q->sysfs_dir_lock); kobject_del(&q->kobj); blk_trace_remove_sysfs(dev); kobject_put(&dev->kobj); return ret; } + has_elevator = true; } + + mutex_lock(&q->sysfs_lock); + blk_queue_flag_set(QUEUE_FLAG_REGISTERED, q); + wbt_enable_default(q); + blk_throtl_register_queue(q); + + /* Now everything is ready and send out KOBJ_ADD uevent */ + kobject_uevent(&q->kobj, KOBJ_ADD); + if (has_elevator) + kobject_uevent(&q->elevator->kobj, KOBJ_ADD); + mutex_unlock(&q->sysfs_lock); + ret = 0; unlock: - mutex_unlock(&q->sysfs_lock); + mutex_unlock(&q->sysfs_dir_lock); return ret; } EXPORT_SYMBOL_GPL(blk_register_queue); @@ -968,6 +977,7 @@ EXPORT_SYMBOL_GPL(blk_register_queue); void blk_unregister_queue(struct gendisk *disk) { struct request_queue *q = disk->queue; + bool has_elevator; if (WARN_ON(!q)) return; @@ -982,25 +992,27 @@ void blk_unregister_queue(struct gendisk *disk) * concurrent elv_iosched_store() calls. */ mutex_lock(&q->sysfs_lock); - blk_queue_flag_clear(QUEUE_FLAG_REGISTERED, q); + has_elevator = !!q->elevator; + mutex_unlock(&q->sysfs_lock); + mutex_lock(&q->sysfs_dir_lock); /* * Remove the sysfs attributes before unregistering the queue data * structures that can be modified through sysfs. */ if (q->mq_ops) blk_mq_unregister_dev(disk_to_dev(disk), q); - mutex_unlock(&q->sysfs_lock); kobject_uevent(&q->kobj, KOBJ_REMOVE); kobject_del(&q->kobj); blk_trace_remove_sysfs(disk_to_dev(disk)); mutex_lock(&q->sysfs_lock); - if (q->request_fn || (q->mq_ops && q->elevator)) + if (q->request_fn || has_elevator) elv_unregister_queue(q); mutex_unlock(&q->sysfs_lock); + mutex_unlock(&q->sysfs_dir_lock); kobject_put(&disk_to_dev(disk)->kobj); } diff --git a/block/blk.h b/block/blk.h index 1a5b67b57e6b..ae87e2a5f2bd 100644 --- a/block/blk.h +++ b/block/blk.h @@ -244,7 +244,7 @@ int elevator_init_mq(struct request_queue *q); int elevator_switch_mq(struct request_queue *q, struct elevator_type *new_e); void elevator_exit(struct request_queue *, struct elevator_queue *); -int elv_register_queue(struct request_queue *q); +int elv_register_queue(struct request_queue *q, bool uevent); void elv_unregister_queue(struct request_queue *q); struct hd_struct *__disk_get_part(struct gendisk *disk, int partno); diff --git a/block/elevator.c b/block/elevator.c index 9bffe4558929..2ff0859e8b35 100644 --- a/block/elevator.c +++ b/block/elevator.c @@ -833,13 +833,16 @@ static struct kobj_type elv_ktype = { .release = elevator_release, }; -int elv_register_queue(struct request_queue *q) +/* + * elv_register_queue is called from either blk_register_queue or + * elevator_switch, elevator switch is prevented from being happen + * in the two paths, so it is safe to not hold q->sysfs_lock. + */ +int elv_register_queue(struct request_queue *q, bool uevent) { struct elevator_queue *e = q->elevator; int error; - lockdep_assert_held(&q->sysfs_lock); - error = kobject_add(&e->kobj, &q->kobj, "%s", "iosched"); if (!error) { struct elv_fs_entry *attr = e->type->elevator_attrs; @@ -850,26 +853,36 @@ int elv_register_queue(struct request_queue *q) attr++; } } - kobject_uevent(&e->kobj, KOBJ_ADD); + if (uevent) + kobject_uevent(&e->kobj, KOBJ_ADD); + + mutex_lock(&q->sysfs_lock); e->registered = 1; if (!e->uses_mq && e->type->ops.sq.elevator_registered_fn) e->type->ops.sq.elevator_registered_fn(q); + mutex_unlock(&q->sysfs_lock); } return error; } +/* + * elv_unregister_queue is called from either blk_unregister_queue or + * elevator_switch, elevator switch is prevented from being happen + * in the two paths, so it is safe to not hold q->sysfs_lock. + */ void elv_unregister_queue(struct request_queue *q) { - lockdep_assert_held(&q->sysfs_lock); - if (q) { struct elevator_queue *e = q->elevator; kobject_uevent(&e->kobj, KOBJ_REMOVE); kobject_del(&e->kobj); + + mutex_lock(&q->sysfs_lock); e->registered = 0; /* Re-enable throttling in case elevator disabled it */ wbt_enable_default(q); + mutex_unlock(&q->sysfs_lock); } } @@ -940,10 +953,32 @@ int elevator_switch_mq(struct request_queue *q, lockdep_assert_held(&q->sysfs_lock); if (q->elevator) { - if (q->elevator->registered) + if (q->elevator->registered) { + mutex_unlock(&q->sysfs_lock); + + /* + * Concurrent elevator switch can't happen becasue + * sysfs write is always exclusively on same file. + * + * Also the elevator queue won't be freed after + * sysfs_lock is released becasue kobject_del() in + * blk_unregister_queue() waits for completion of + * .store & .show on its attributes. + */ elv_unregister_queue(q); + + mutex_lock(&q->sysfs_lock); + } ioc_clear_queue(q); elevator_exit(q, q->elevator); + + /* + * sysfs_lock may be dropped, so re-check if queue is + * unregistered. If yes, don't switch to new elevator + * any more + */ + if (!blk_queue_registered(q)) + return 0; } ret = blk_mq_init_sched(q, new_e); @@ -951,7 +986,11 @@ int elevator_switch_mq(struct request_queue *q, goto out; if (new_e) { - ret = elv_register_queue(q); + mutex_unlock(&q->sysfs_lock); + + ret = elv_register_queue(q, true); + + mutex_lock(&q->sysfs_lock); if (ret) { elevator_exit(q, q->elevator); goto out; @@ -1047,7 +1086,7 @@ static int elevator_switch(struct request_queue *q, struct elevator_type *new_e) if (err) goto fail_init; - err = elv_register_queue(q); + err = elv_register_queue(q, true); if (err) goto fail_register; @@ -1067,7 +1106,7 @@ static int elevator_switch(struct request_queue *q, struct elevator_type *new_e) /* switch failed, restore and re-register old elevator */ if (old) { q->elevator = old; - elv_register_queue(q); + elv_register_queue(q, true); blk_queue_bypass_end(q); } diff --git a/include/linux/blkdev.h b/include/linux/blkdev.h index 3a2b34c2c82b..209ba8e7bd31 100644 --- a/include/linux/blkdev.h +++ b/include/linux/blkdev.h @@ -637,6 +637,7 @@ struct request_queue { struct delayed_work requeue_work; struct mutex sysfs_lock; + struct mutex sysfs_dir_lock; int bypass_depth; atomic_t mq_freeze_depth; From patchwork Wed Feb 3 13:20:20 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Jinpu Wang X-Patchwork-Id: 376868 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-18.8 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, HEADER_FROM_DIFFERENT_DOMAINS, INCLUDES_CR_TRAILER, INCLUDES_PATCH, MAILING_LIST_MULTI, SPF_HELO_NONE, SPF_PASS, URIBL_BLOCKED, USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 79EF9C433E6 for ; Wed, 3 Feb 2021 13:21:52 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 2178764F4C for ; Wed, 3 Feb 2021 13:21:52 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S229774AbhBCNVu (ORCPT ); Wed, 3 Feb 2021 08:21:50 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:40560 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S229979AbhBCNVq (ORCPT ); Wed, 3 Feb 2021 08:21:46 -0500 Received: from mail-wm1-x329.google.com (mail-wm1-x329.google.com [IPv6:2a00:1450:4864:20::329]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id B2BCDC06178B for ; Wed, 3 Feb 2021 05:20:30 -0800 (PST) Received: by mail-wm1-x329.google.com with SMTP id y187so5204708wmd.3 for ; Wed, 03 Feb 2021 05:20:30 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=cloud.ionos.com; s=google; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=6IdRX02f71VtSCOgLB7umSVkmw3Wa3ffAZk4G0tZrdA=; b=dlNTm1aqGFC81Rn4kmr+1KJkS4hA6/aCVr1rPI/p9RywE+IfFVudr5pGMBnzxs8DFW Pk8Uk8UXYC+yE7ffRyXcT8McVFJIvG7rOzfW/oUW0tj2LxGOIuzwsxkz7umL/5RcdT4D 2j32kmf6ryTLbOZ8Gy37QsOYAd21Ahil4JebHEjVsG2UaXH+LEWngyhgCCikdhg1KfDZ p8wmW5BT/HljPOJyZjr58OgHs/wm7sQk5TcS7P0K0wej45mF40XtAQSQNyruE0ePw+QX oT5Q5UFq6qemVJWVEqus4tuHSLB1VhqlisZWE8KI4UOamqBGu1G+i5fjGk8iX59Lb3BA daKw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=6IdRX02f71VtSCOgLB7umSVkmw3Wa3ffAZk4G0tZrdA=; b=AJdO5T+JiK1JzG9wJJz9EHAR6cAR3F5fI5iiPeSpTNAIfg61Dsp78TmG8jkCyE47nJ 0eXm5qBmgqkDREZ16ealoflrCvWqNwfV9rJrmenaWeIanbI9W7KF9hsWPQYZf+S7EGVl 9SvCfLG96oxav3RzGI/Gtwv0VNf9rasxCKdxiL4Fa8GWLEUJiBSL+70fuSb9cok41KzY MbcEWAxLvbX87Xffl5qCPnYnln3tkAAI1ubR1313mFPd0URFvhlHkkkclyVEVi3l5xnG yRCOyht3EMgEh4BA2vfSQO610crlcYxlYPYSUZkBL4TaiknKT6ZN1Afc6ot5skKjLVHl dWeA== X-Gm-Message-State: AOAM530Ijrosw8+ZApkTvhJrBdx7e/MILybtlGvwrdhhSQaa0M01KOE/ 4g0q7v+7aAg90Vcs0S2qn7mK0w== X-Google-Smtp-Source: ABdhPJwQu49P67ungbgd8YZkJy4L4TmjLDyJe5dNpVkywT4LPBFMefMunV/Cbn5V2uR8ttlw5P1bSg== X-Received: by 2002:a1c:99d0:: with SMTP id b199mr1261715wme.147.1612358429504; Wed, 03 Feb 2021 05:20:29 -0800 (PST) Received: from jwang-Latitude-5491.fritz.box ([2001:16b8:4947:3c00:7cf4:7472:391:9d0d]) by smtp.gmail.com with ESMTPSA id d10sm3738430wrn.88.2021.02.03.05.20.28 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 03 Feb 2021 05:20:29 -0800 (PST) From: Jack Wang To: gregkh@linuxfoundation.org, sashal@kernel.org, stable@vger.kernel.org Cc: Ming Lei , Christoph Hellwig , Hannes Reinecke , Mike Snitzer , Bart Van Assche , Jens Axboe Subject: [stable-4.19 6/8] block: fix race between switching elevator and removing queues Date: Wed, 3 Feb 2021 14:20:20 +0100 Message-Id: <20210203132022.92406-7-jinpu.wang@cloud.ionos.com> X-Mailer: git-send-email 2.25.1 In-Reply-To: <20210203132022.92406-1-jinpu.wang@cloud.ionos.com> References: <20210203132022.92406-1-jinpu.wang@cloud.ionos.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: stable@vger.kernel.org From: Ming Lei cecf5d87ff20 ("block: split .sysfs_lock into two locks") starts to release & actuire sysfs_lock again during switching elevator. So it isn't enough to prevent switching elevator from happening by simply clearing QUEUE_FLAG_REGISTERED with holding sysfs_lock, because in-progress switch still can move on after re-acquiring the lock, meantime the flag of QUEUE_FLAG_REGISTERED won't get checked. Fixes this issue by checking 'q->elevator' directly & locklessly after q->kobj is removed in blk_unregister_queue(), this way is safe because q->elevator can't be changed at that time. Fixes: cecf5d87ff20 ("block: split .sysfs_lock into two locks") Cc: Christoph Hellwig Cc: Hannes Reinecke Cc: Greg KH Cc: Mike Snitzer Cc: Bart Van Assche Signed-off-by: Ming Lei Signed-off-by: Jens Axboe (cherry picked from commit 0a67b5a926e63ff5492c3c675eab5900580d056d) Signed-off-by: Jack Wang --- block/blk-sysfs.c | 8 +++++--- 1 file changed, 5 insertions(+), 3 deletions(-) diff --git a/block/blk-sysfs.c b/block/blk-sysfs.c index b2208b69f04a..899987152701 100644 --- a/block/blk-sysfs.c +++ b/block/blk-sysfs.c @@ -977,7 +977,6 @@ EXPORT_SYMBOL_GPL(blk_register_queue); void blk_unregister_queue(struct gendisk *disk) { struct request_queue *q = disk->queue; - bool has_elevator; if (WARN_ON(!q)) return; @@ -993,7 +992,6 @@ void blk_unregister_queue(struct gendisk *disk) */ mutex_lock(&q->sysfs_lock); blk_queue_flag_clear(QUEUE_FLAG_REGISTERED, q); - has_elevator = !!q->elevator; mutex_unlock(&q->sysfs_lock); mutex_lock(&q->sysfs_dir_lock); @@ -1009,7 +1007,11 @@ void blk_unregister_queue(struct gendisk *disk) blk_trace_remove_sysfs(disk_to_dev(disk)); mutex_lock(&q->sysfs_lock); - if (q->request_fn || has_elevator) + /* + * q->kobj has been removed, so it is safe to check if elevator + * exists without holding q->sysfs_lock. + */ + if (q->request_fn || q->elevator) elv_unregister_queue(q); mutex_unlock(&q->sysfs_lock); mutex_unlock(&q->sysfs_dir_lock); From patchwork Wed Feb 3 13:20:21 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Jinpu Wang X-Patchwork-Id: 375960 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-18.8 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, HEADER_FROM_DIFFERENT_DOMAINS, INCLUDES_CR_TRAILER, INCLUDES_PATCH, MAILING_LIST_MULTI, SPF_HELO_NONE, SPF_PASS, URIBL_BLOCKED, USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 070AAC433DB for ; Wed, 3 Feb 2021 13:21:53 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 9E3A864F55 for ; Wed, 3 Feb 2021 13:21:52 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S229979AbhBCNVw (ORCPT ); Wed, 3 Feb 2021 08:21:52 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:40570 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S230248AbhBCNVr (ORCPT ); Wed, 3 Feb 2021 08:21:47 -0500 Received: from mail-wr1-x436.google.com (mail-wr1-x436.google.com [IPv6:2a00:1450:4864:20::436]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id BDF45C06178C for ; Wed, 3 Feb 2021 05:20:31 -0800 (PST) Received: by mail-wr1-x436.google.com with SMTP id q7so24229029wre.13 for ; Wed, 03 Feb 2021 05:20:31 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=cloud.ionos.com; s=google; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=/Q85vcvTRwqOQln7ONod8UaPXYEVKrd8O7nB76luOHM=; b=VSq/Tq18U/g3RagnsHbxoGrjQ8RXYNWfZd088OHvmvNTjj0C1OQzv2cg/hVjaMZhgI kzmPE3oYHs4sCpo8uXlqnO/+jtSFWZH7NVXZ7TaKuZa07s64d5dQ1akwp/eWK/RvVADk VVMuO+Ck8G9o0CjYv4hTsBvIQoxhVXI92NajpyqmPSlKg4cCyY/pOqXSIYLFxve1S5to +0vHk5YTf2YML+u+pFYYIeQGTW53bi4HD1cJRzYD3JoAH2Hj5Pb3OoJznSSjab1w7KNZ 1NZN2ilKxixJieb0ohM3Er5vy7HtD0MmnuHVrdDBHFz2xNZwEVOSNRJdsgQWv0b42xi4 YHlA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=/Q85vcvTRwqOQln7ONod8UaPXYEVKrd8O7nB76luOHM=; b=dRisQ7+EaX1TTs/u2J7frl4029eMS7FlHWZ09BRhc25eEI6uN/U1lHtGaUCiTIYAnb vyZlu57inb46H34oPl6jRaYF4vJfdzqgkU5PP1w/eN0GH+ZAsrrDL9RH5/yk2vaNquaP wS2ghx96BhIARaj7gyt28iTSftZYe6n6NlngojvYbypEFtuhepPr5SV7x95R0yuAAPno PPXmjhIbRC/PbajiaEQc1A5izW1jRFnm6v6kA6l3XJUtkmm5wVdNCtzLPvHanyZ07hc6 KBewUosmSjWtgshtEsevSgu2goTBq/FSCIAPaKVbPjQxBRN9j8zcb+xdP3Dh0FPc7tjV Gkfg== X-Gm-Message-State: AOAM532LMg/PpVY8HFr7MH23YZJ4fCjByY5HAokaotT34c/b6NrwTNEA xxz6W1vffp2e/EW25NK+vQOgQqMe7DpapA== X-Google-Smtp-Source: ABdhPJzalN3t5nMp4frtKnib2fiQWkX/JacxAxKNJO/UWKT+gUZWvw+by6KGDUFpR8xfpLgP7XEJeA== X-Received: by 2002:a5d:4046:: with SMTP id w6mr3425430wrp.369.1612358430535; Wed, 03 Feb 2021 05:20:30 -0800 (PST) Received: from jwang-Latitude-5491.fritz.box ([2001:16b8:4947:3c00:7cf4:7472:391:9d0d]) by smtp.gmail.com with ESMTPSA id d10sm3738430wrn.88.2021.02.03.05.20.29 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 03 Feb 2021 05:20:30 -0800 (PST) From: Jack Wang To: gregkh@linuxfoundation.org, sashal@kernel.org, stable@vger.kernel.org Cc: Ming Lei , Christoph Hellwig , Hannes Reinecke , Mike Snitzer , Bart Van Assche , Jens Axboe Subject: [stable-4.19 7/8] block: don't release queue's sysfs lock during switching elevator Date: Wed, 3 Feb 2021 14:20:21 +0100 Message-Id: <20210203132022.92406-8-jinpu.wang@cloud.ionos.com> X-Mailer: git-send-email 2.25.1 In-Reply-To: <20210203132022.92406-1-jinpu.wang@cloud.ionos.com> References: <20210203132022.92406-1-jinpu.wang@cloud.ionos.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: stable@vger.kernel.org From: Ming Lei cecf5d87ff20 ("block: split .sysfs_lock into two locks") starts to release & acquire sysfs_lock before registering/un-registering elevator queue during switching elevator for avoiding potential deadlock from showing & storing 'queue/iosched' attributes and removing elevator's kobject. Turns out there isn't such deadlock because 'q->sysfs_lock' isn't required in .show & .store of queue/iosched's attributes, and just elevator's sysfs lock is acquired in elv_iosched_store() and elv_iosched_show(). So it is safe to hold queue's sysfs lock when registering/un-registering elevator queue. The biggest issue is that commit cecf5d87ff20 assumes that concurrent write on 'queue/scheduler' can't happen. However, this assumption isn't true, because kernfs_fop_write() only guarantees that concurrent write aren't called on the same open file, but the write could be from different open on the file. So we can't release & re-acquire queue's sysfs lock during switching elevator, otherwise use-after-free on elevator could be triggered. Fixes the issue by not releasing queue's sysfs lock during switching elevator. Fixes: cecf5d87ff20 ("block: split .sysfs_lock into two locks") Cc: Christoph Hellwig Cc: Hannes Reinecke Cc: Greg KH Cc: Mike Snitzer Reviewed-by: Bart Van Assche Signed-off-by: Ming Lei Signed-off-by: Jens Axboe (jwang:cherry picked from commit b89f625e28d44552083f43752f62d8621ded0a04 adjust ctx for 4.19) Signed-off-by: Jack Wang --- block/blk-sysfs.c | 3 ++- block/elevator.c | 31 +------------------------------ 2 files changed, 3 insertions(+), 31 deletions(-) diff --git a/block/blk-sysfs.c b/block/blk-sysfs.c index 899987152701..07494deb1a26 100644 --- a/block/blk-sysfs.c +++ b/block/blk-sysfs.c @@ -933,6 +933,7 @@ int blk_register_queue(struct gendisk *disk) blk_mq_debugfs_register(q); } + mutex_lock(&q->sysfs_lock); /* * The flag of QUEUE_FLAG_REGISTERED isn't set yet, so elevator * switch won't happen at all. @@ -940,6 +941,7 @@ int blk_register_queue(struct gendisk *disk) if (q->request_fn || (q->mq_ops && q->elevator)) { ret = elv_register_queue(q, false); if (ret) { + mutex_unlock(&q->sysfs_lock); mutex_unlock(&q->sysfs_dir_lock); kobject_del(&q->kobj); blk_trace_remove_sysfs(dev); @@ -949,7 +951,6 @@ int blk_register_queue(struct gendisk *disk) has_elevator = true; } - mutex_lock(&q->sysfs_lock); blk_queue_flag_set(QUEUE_FLAG_REGISTERED, q); wbt_enable_default(q); blk_throtl_register_queue(q); diff --git a/block/elevator.c b/block/elevator.c index 2ff0859e8b35..5b51bc5fad9f 100644 --- a/block/elevator.c +++ b/block/elevator.c @@ -856,11 +856,9 @@ int elv_register_queue(struct request_queue *q, bool uevent) if (uevent) kobject_uevent(&e->kobj, KOBJ_ADD); - mutex_lock(&q->sysfs_lock); e->registered = 1; if (!e->uses_mq && e->type->ops.sq.elevator_registered_fn) e->type->ops.sq.elevator_registered_fn(q); - mutex_unlock(&q->sysfs_lock); } return error; } @@ -878,11 +876,9 @@ void elv_unregister_queue(struct request_queue *q) kobject_uevent(&e->kobj, KOBJ_REMOVE); kobject_del(&e->kobj); - mutex_lock(&q->sysfs_lock); e->registered = 0; /* Re-enable throttling in case elevator disabled it */ wbt_enable_default(q); - mutex_unlock(&q->sysfs_lock); } } @@ -953,32 +949,11 @@ int elevator_switch_mq(struct request_queue *q, lockdep_assert_held(&q->sysfs_lock); if (q->elevator) { - if (q->elevator->registered) { - mutex_unlock(&q->sysfs_lock); - - /* - * Concurrent elevator switch can't happen becasue - * sysfs write is always exclusively on same file. - * - * Also the elevator queue won't be freed after - * sysfs_lock is released becasue kobject_del() in - * blk_unregister_queue() waits for completion of - * .store & .show on its attributes. - */ + if (q->elevator->registered) elv_unregister_queue(q); - mutex_lock(&q->sysfs_lock); - } ioc_clear_queue(q); elevator_exit(q, q->elevator); - - /* - * sysfs_lock may be dropped, so re-check if queue is - * unregistered. If yes, don't switch to new elevator - * any more - */ - if (!blk_queue_registered(q)) - return 0; } ret = blk_mq_init_sched(q, new_e); @@ -986,11 +961,7 @@ int elevator_switch_mq(struct request_queue *q, goto out; if (new_e) { - mutex_unlock(&q->sysfs_lock); - ret = elv_register_queue(q, true); - - mutex_lock(&q->sysfs_lock); if (ret) { elevator_exit(q, q->elevator); goto out; From patchwork Wed Feb 3 13:20:22 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Jinpu Wang X-Patchwork-Id: 376867 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-18.8 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, HEADER_FROM_DIFFERENT_DOMAINS, INCLUDES_CR_TRAILER, INCLUDES_PATCH, MAILING_LIST_MULTI, SPF_HELO_NONE, SPF_PASS, USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 8816EC433E0 for ; Wed, 3 Feb 2021 13:21:55 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 3CD9E64F55 for ; Wed, 3 Feb 2021 13:21:55 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S231191AbhBCNVy (ORCPT ); Wed, 3 Feb 2021 08:21:54 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:40572 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S230248AbhBCNVx (ORCPT ); Wed, 3 Feb 2021 08:21:53 -0500 Received: from mail-wr1-x435.google.com (mail-wr1-x435.google.com [IPv6:2a00:1450:4864:20::435]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id C8BEAC061793 for ; Wed, 3 Feb 2021 05:20:32 -0800 (PST) Received: by mail-wr1-x435.google.com with SMTP id c12so24266486wrc.7 for ; Wed, 03 Feb 2021 05:20:32 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=cloud.ionos.com; s=google; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=EjmAutv15qjx2SSCkoRfIH3tCUPNl1e/xGS93BoqYcw=; b=TDUC8WIbHPI7UuTJJhAxsWWgUaMAvxRdo+om8WRkn3Sg9DFbiZldqsXGKOR6xovBkv ExLFvSMdYHLmPdHxiCEC9K3XNqBIRMfGGMQvSWPxEbptAn5gstpOvhUjMqDGzv/16vE2 FAeCcJhe5xCvWem+dW2bzzrZZiPIeAQbmnk/7ylKtmiHEK4aurCNjicBfEPKYkQzNKET Vk4AQa2BaY8hkX+IorxsUwKW/89faFNpTzW63tZzwuBWcymlIUmtEYkyDejGNG871v+M vZseIsdFxZFQWvcVE6BljhSNRCCDL0HOoZ0dHXHHEF7SEM124mCwWUXTx6y+bg4fjjbf 8S7g== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=EjmAutv15qjx2SSCkoRfIH3tCUPNl1e/xGS93BoqYcw=; b=APeGuA3hx9zF9ayNhF+qutnsKOkM+66XAbsjJkp1js8GqkDg6UrefU0a4aHiMVtmR0 /9MklEHHjC+SUMMSFGUrHnGkJsVGZ41o4akOTM43PFKQGbrVVWPO1jje0tmOH3LImECY BXKZwdZ6vXDnKonEZLEZCn3htXdT6tcGhuNK0pSd7Y/a6hSA5qqkcgW7Xt41h/3puMRg IId6baTr44oyronvQqHLHdzIlAsbwyDh/TWx9XwGr2LORN80ntz6x0KLg15pUtqSpMcG r/sw3gCkxergwbJ65N5Ax01BwJvHyIsHPhRJGjFgCTJj69bAluYj8I6sEktJEgCQV8YO I9YA== X-Gm-Message-State: AOAM530/41n2IzISO2yJ5xaJT292LNmESPF7kkaBlU0DC+JSqShAcNjE IxERaqvMKJ2+mJllbFOkijrjBw== X-Google-Smtp-Source: ABdhPJzKTnlsvFNop5UVZm8avHjVd9Y3pngfXtNQiKlz5VAYbEoMmUKHTAJuRnhQv1aY6JMOr7Iv2g== X-Received: by 2002:a05:6000:8c:: with SMTP id m12mr3518190wrx.101.1612358431487; Wed, 03 Feb 2021 05:20:31 -0800 (PST) Received: from jwang-Latitude-5491.fritz.box ([2001:16b8:4947:3c00:7cf4:7472:391:9d0d]) by smtp.gmail.com with ESMTPSA id d10sm3738430wrn.88.2021.02.03.05.20.30 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 03 Feb 2021 05:20:30 -0800 (PST) From: Jack Wang To: gregkh@linuxfoundation.org, sashal@kernel.org, stable@vger.kernel.org Cc: Xiao Ni , David Jeffery , Song Liu Subject: [stable-4.19 8/8] md: Set prev_flush_start and flush_bio in an atomic way Date: Wed, 3 Feb 2021 14:20:22 +0100 Message-Id: <20210203132022.92406-9-jinpu.wang@cloud.ionos.com> X-Mailer: git-send-email 2.25.1 In-Reply-To: <20210203132022.92406-1-jinpu.wang@cloud.ionos.com> References: <20210203132022.92406-1-jinpu.wang@cloud.ionos.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: stable@vger.kernel.org From: Xiao Ni One customer reports a crash problem which causes by flush request. It triggers a warning before crash. /* new request after previous flush is completed */ if (ktime_after(req_start, mddev->prev_flush_start)) { WARN_ON(mddev->flush_bio); mddev->flush_bio = bio; bio = NULL; } The WARN_ON is triggered. We use spin lock to protect prev_flush_start and flush_bio in md_flush_request. But there is no lock protection in md_submit_flush_data. It can set flush_bio to NULL first because of compiler reordering write instructions. For example, flush bio1 sets flush bio to NULL first in md_submit_flush_data. An interrupt or vmware causing an extended stall happen between updating flush_bio and prev_flush_start. Because flush_bio is NULL, flush bio2 can get the lock and submit to underlayer disks. Then flush bio1 updates prev_flush_start after the interrupt or extended stall. Then flush bio3 enters in md_flush_request. The start time req_start is behind prev_flush_start. The flush_bio is not NULL(flush bio2 hasn't finished). So it can trigger the WARN_ON now. Then it calls INIT_WORK again. INIT_WORK() will re-initialize the list pointers in the work_struct, which then can result in a corrupted work list and the work_struct queued a second time. With the work list corrupted, it can lead in invalid work items being used and cause a crash in process_one_work. We need to make sure only one flush bio can be handled at one same time. So add spin lock in md_submit_flush_data to protect prev_flush_start and flush_bio in an atomic way. Reviewed-by: David Jeffery Signed-off-by: Xiao Ni Signed-off-by: Song Liu [jwang: backport dc5d17a3c39b06aef866afca19245a9cfb533a79 to 4.19] Signed-off-by: Jack Wang --- drivers/md/md.c | 2 ++ 1 file changed, 2 insertions(+) diff --git a/drivers/md/md.c b/drivers/md/md.c index 80ca13594c18..09f0d8e70b70 100644 --- a/drivers/md/md.c +++ b/drivers/md/md.c @@ -474,8 +474,10 @@ static void md_submit_flush_data(struct work_struct *ws) * could wait for this and below md_handle_request could wait for those * bios because of suspend check */ + spin_lock_irq(&mddev->lock); mddev->last_flush = mddev->start_flush; mddev->flush_bio = NULL; + spin_unlock_irq(&mddev->lock); wake_up(&mddev->sb_wait); if (bio->bi_iter.bi_size == 0) {