From patchwork Wed Mar 10 11:30:37 2021
X-Patchwork-Submitter: Pavel Begunkov
X-Patchwork-Id: 397437
From: Pavel Begunkov
To: stable@vger.kernel.org
Cc: Jens Axboe, syzbot+81d17233a2b02eafba33@syzkaller.appspotmail.com
Subject: [PATCH 1/9] io_uring: fix inconsistent lock state
Date: Wed, 10 Mar 2021 11:30:37 +0000
Message-Id: <780db85414287452e1c4d208b2a1920760cad721.1615375332.git.asml.silence@gmail.com>
X-Mailing-List: stable@vger.kernel.org

commit 9ae1f8dd372e0e4c020b345cf9e09f519265e981 upstream

WARNING: inconsistent lock state
inconsistent {HARDIRQ-ON-W} -> {IN-HARDIRQ-W} usage.
syz-executor217/8450 [HC1[1]:SC0[0]:HE0:SE1] takes:
ffff888023d6e620 (&fs->lock){?.+.}-{2:2}, at: spin_lock include/linux/spinlock.h:354 [inline]
ffff888023d6e620 (&fs->lock){?.+.}-{2:2}, at: io_req_clean_work fs/io_uring.c:1398 [inline]
ffff888023d6e620 (&fs->lock){?.+.}-{2:2}, at: io_dismantle_req+0x66f/0xf60 fs/io_uring.c:2029

other info that might help us debug this:
 Possible unsafe locking scenario:

       CPU0
       ----
  lock(&fs->lock);
  lock(&fs->lock);

 *** DEADLOCK ***

1 lock held by syz-executor217/8450:
 #0: ffff88802417c3e8 (&ctx->uring_lock){+.+.}-{3:3}, at: __do_sys_io_uring_enter+0x1071/0x1f30 fs/io_uring.c:9442

stack backtrace:
CPU: 1 PID: 8450 Comm: syz-executor217 Not tainted 5.11.0-rc5-next-20210129-syzkaller #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
Call Trace:
 [...]
 _raw_spin_lock+0x2a/0x40 kernel/locking/spinlock.c:151
 spin_lock include/linux/spinlock.h:354 [inline]
 io_req_clean_work fs/io_uring.c:1398 [inline]
 io_dismantle_req+0x66f/0xf60 fs/io_uring.c:2029
 __io_free_req+0x3d/0x2e0 fs/io_uring.c:2046
 io_free_req fs/io_uring.c:2269 [inline]
 io_double_put_req fs/io_uring.c:2392 [inline]
 io_put_req+0xf9/0x570 fs/io_uring.c:2388
 io_link_timeout_fn+0x30c/0x480 fs/io_uring.c:6497
 __run_hrtimer kernel/time/hrtimer.c:1519 [inline]
 __hrtimer_run_queues+0x609/0xe40 kernel/time/hrtimer.c:1583
 hrtimer_interrupt+0x334/0x940 kernel/time/hrtimer.c:1645
 local_apic_timer_interrupt arch/x86/kernel/apic/apic.c:1085 [inline]
 __sysvec_apic_timer_interrupt+0x146/0x540 arch/x86/kernel/apic/apic.c:1102
 asm_call_irq_on_stack+0xf/0x20
 __run_sysvec_on_irqstack arch/x86/include/asm/irq_stack.h:37 [inline]
 run_sysvec_on_irqstack_cond arch/x86/include/asm/irq_stack.h:89 [inline]
 sysvec_apic_timer_interrupt+0xbd/0x100 arch/x86/kernel/apic/apic.c:1096
 asm_sysvec_apic_timer_interrupt+0x12/0x20 arch/x86/include/asm/idtentry.h:629
RIP: 0010:__raw_spin_unlock_irq
include/linux/spinlock_api_smp.h:169 [inline]
RIP: 0010:_raw_spin_unlock_irq+0x25/0x40 kernel/locking/spinlock.c:199
 spin_unlock_irq include/linux/spinlock.h:404 [inline]
 io_queue_linked_timeout+0x194/0x1f0 fs/io_uring.c:6525
 __io_queue_sqe+0x328/0x1290 fs/io_uring.c:6594
 io_queue_sqe+0x631/0x10d0 fs/io_uring.c:6639
 io_queue_link_head fs/io_uring.c:6650 [inline]
 io_submit_sqe fs/io_uring.c:6697 [inline]
 io_submit_sqes+0x19b5/0x2720 fs/io_uring.c:6960
 __do_sys_io_uring_enter+0x107d/0x1f30 fs/io_uring.c:9443
 do_syscall_64+0x2d/0x70 arch/x86/entry/common.c:46
 entry_SYSCALL_64_after_hwframe+0x44/0xa9

Don't free requests from under hrtimer context (softirq) as it may sleep or
take spinlocks improperly (e.g. non-irq versions).

Cc: stable@vger.kernel.org # 5.6+
Reported-by: syzbot+81d17233a2b02eafba33@syzkaller.appspotmail.com
Signed-off-by: Pavel Begunkov
Signed-off-by: Jens Axboe
---
 fs/io_uring.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/fs/io_uring.c b/fs/io_uring.c
index 38bfd168ad3b..a1d08b641d0f 100644
--- a/fs/io_uring.c
+++ b/fs/io_uring.c
@@ -6506,9 +6506,10 @@ static enum hrtimer_restart io_link_timeout_fn(struct hrtimer *timer)
 	if (prev) {
 		req_set_fail_links(prev);
 		io_async_find_and_cancel(ctx, req, prev->user_data, -ETIME);
-		io_put_req(prev);
+		io_put_req_deferred(prev, 1);
 	} else {
-		io_req_complete(req, -ETIME);
+		io_cqring_add_event(req, -ETIME, 0);
+		io_put_req_deferred(req, 1);
 	}
 	return HRTIMER_NORESTART;
 }

From patchwork Wed Mar 10 11:30:40 2021
X-Patchwork-Submitter: Pavel Begunkov
X-Patchwork-Id: 397434
From: Pavel Begunkov
To: stable@vger.kernel.org
Cc: Jens Axboe
Subject: [PATCH 4/9] io_uring: deduplicate failing task_work_add
Date: Wed, 10 Mar 2021 11:30:40 +0000
Message-Id: <5ad81cd57c41877a4667ea8dd5397987af6cce41.1615375332.git.asml.silence@gmail.com>
X-Mailing-List: stable@vger.kernel.org

commit eab30c4d20dc761d463445e5130421863ff81505 upstream

When io_req_task_work_add() fails, the request will be cancelled by
enqueueing via task_works of io-wq. Extract a function for that.
Signed-off-by: Pavel Begunkov
Signed-off-by: Jens Axboe
---
 fs/io_uring.c | 46 +++++++++++++++++-----------------------------
 1 file changed, 17 insertions(+), 29 deletions(-)

diff --git a/fs/io_uring.c b/fs/io_uring.c
index 842a7c017296..bc76929e0031 100644
--- a/fs/io_uring.c
+++ b/fs/io_uring.c
@@ -2172,6 +2172,16 @@ static int io_req_task_work_add(struct io_kiocb *req)
 	return ret;
 }
 
+static void io_req_task_work_add_fallback(struct io_kiocb *req,
+					  void (*cb)(struct callback_head *))
+{
+	struct task_struct *tsk = io_wq_get_task(req->ctx->io_wq);
+
+	init_task_work(&req->task_work, cb);
+	task_work_add(tsk, &req->task_work, TWA_NONE);
+	wake_up_process(tsk);
+}
+
 static void __io_req_task_cancel(struct io_kiocb *req, int error)
 {
 	struct io_ring_ctx *ctx = req->ctx;
@@ -2229,14 +2239,8 @@ static void io_req_task_queue(struct io_kiocb *req)
 	percpu_ref_get(&req->ctx->refs);
 
 	ret = io_req_task_work_add(req);
-	if (unlikely(ret)) {
-		struct task_struct *tsk;
-
-		init_task_work(&req->task_work, io_req_task_cancel);
-		tsk = io_wq_get_task(req->ctx->io_wq);
-		task_work_add(tsk, &req->task_work, TWA_NONE);
-		wake_up_process(tsk);
-	}
+	if (unlikely(ret))
+		io_req_task_work_add_fallback(req, io_req_task_cancel);
 }
 
 static inline void io_queue_next(struct io_kiocb *req)
@@ -2354,13 +2358,8 @@ static void io_free_req_deferred(struct io_kiocb *req)
 
 	init_task_work(&req->task_work, io_put_req_deferred_cb);
 	ret = io_req_task_work_add(req);
-	if (unlikely(ret)) {
-		struct task_struct *tsk;
-
-		tsk = io_wq_get_task(req->ctx->io_wq);
-		task_work_add(tsk, &req->task_work, TWA_NONE);
-		wake_up_process(tsk);
-	}
+	if (unlikely(ret))
+		io_req_task_work_add_fallback(req, io_put_req_deferred_cb);
 }
 
 static inline void io_put_req_deferred(struct io_kiocb *req, int refs)
@@ -3439,15 +3438,8 @@ static int io_async_buf_func(struct wait_queue_entry *wait, unsigned mode,
 	/* submit ref gets dropped, acquire a new one */
 	refcount_inc(&req->refs);
 	ret = io_req_task_work_add(req);
-	if (unlikely(ret)) {
-		struct task_struct *tsk;
-
-		/* queue just for cancelation */
-		init_task_work(&req->task_work, io_req_task_cancel);
-		tsk = io_wq_get_task(req->ctx->io_wq);
-		task_work_add(tsk, &req->task_work, TWA_NONE);
-		wake_up_process(tsk);
-	}
+	if (unlikely(ret))
+		io_req_task_work_add_fallback(req, io_req_task_cancel);
 	return 1;
 }
 
@@ -5159,12 +5151,8 @@ static int __io_async_wake(struct io_kiocb *req, struct io_poll_iocb *poll,
 	 */
 	ret = io_req_task_work_add(req);
 	if (unlikely(ret)) {
-		struct task_struct *tsk;
-
 		WRITE_ONCE(poll->canceled, true);
-		tsk = io_wq_get_task(req->ctx->io_wq);
-		task_work_add(tsk, &req->task_work, TWA_NONE);
-		wake_up_process(tsk);
+		io_req_task_work_add_fallback(req, func);
 	}
 	return 1;
 }

From patchwork Wed Mar 10 11:30:42 2021
X-Patchwork-Submitter: Pavel Begunkov
X-Patchwork-Id: 397436
From: Pavel Begunkov
To: stable@vger.kernel.org
Cc: Jens Axboe
Subject: [PATCH 6/9] io_uring: get rid of intermediate IORING_OP_CLOSE stage
Date: Wed, 10 Mar 2021 11:30:42 +0000
Message-Id: <89cbacd2635e9e91db0139cf2d3906621afa399a.1615375332.git.asml.silence@gmail.com>
X-Mailing-List: stable@vger.kernel.org

From: Jens Axboe

commit 9eac1904d3364254d622bf2c771c4f85cd435fc2 upstream

We currently split the close into two, in case we have a ->flush op
that we can't safely handle from non-blocking context. This requires
us to flag the op as uncancelable if we do need to punt it async, and
that means special handling for just this op type.

Use __close_fd_get_file() and grab the files lock so we can get the file
and check if we need to go async in one atomic operation. That gets rid
of the need for splitting this into two steps, and hence the need for
IO_WQ_WORK_NO_CANCEL.
Signed-off-by: Jens Axboe
Signed-off-by: Pavel Begunkov
---
 fs/io_uring.c | 64 ++++++++++++++++++++++++++++-----------------------
 1 file changed, 35 insertions(+), 29 deletions(-)

diff --git a/fs/io_uring.c b/fs/io_uring.c
index bc76929e0031..7d03689d0e47 100644
--- a/fs/io_uring.c
+++ b/fs/io_uring.c
@@ -411,7 +411,6 @@ struct io_poll_remove {
 
 struct io_close {
 	struct file			*file;
-	struct file			*put_file;
 	int				fd;
 };
 
@@ -908,8 +907,6 @@ static const struct io_op_def io_op_defs[] = {
 						IO_WQ_WORK_FS | IO_WQ_WORK_MM,
 	},
 	[IORING_OP_CLOSE] = {
-		.needs_file		= 1,
-		.needs_file_no_error	= 1,
 		.work_flags		= IO_WQ_WORK_FILES | IO_WQ_WORK_BLKCG,
 	},
 	[IORING_OP_FILES_UPDATE] = {
@@ -4473,13 +4470,6 @@ static int io_statx(struct io_kiocb *req, bool force_nonblock)
 
 static int io_close_prep(struct io_kiocb *req, const struct io_uring_sqe *sqe)
 {
-	/*
-	 * If we queue this for async, it must not be cancellable. That would
-	 * leave the 'file' in an undeterminate state, and here need to modify
-	 * io_wq_work.flags, so initialize io_wq_work firstly.
-	 */
-	io_req_init_async(req);
-
 	if (unlikely(req->ctx->flags & IORING_SETUP_IOPOLL))
 		return -EINVAL;
 	if (sqe->ioprio || sqe->off || sqe->addr || sqe->len ||
@@ -4489,43 +4479,59 @@ static int io_close_prep(struct io_kiocb *req, const struct io_uring_sqe *sqe)
 		return -EBADF;
 
 	req->close.fd = READ_ONCE(sqe->fd);
-	if ((req->file && req->file->f_op == &io_uring_fops))
-		return -EBADF;
-
-	req->close.put_file = NULL;
 	return 0;
 }
 
 static int io_close(struct io_kiocb *req, bool force_nonblock,
		    struct io_comp_state *cs)
 {
+	struct files_struct *files = current->files;
 	struct io_close *close = &req->close;
+	struct fdtable *fdt;
+	struct file *file;
 	int ret;
 
-	/* might be already done during nonblock submission */
-	if (!close->put_file) {
-		ret = close_fd_get_file(close->fd, &close->put_file);
-		if (ret < 0)
-			return (ret == -ENOENT) ? -EBADF : ret;
+	file = NULL;
+	ret = -EBADF;
+	spin_lock(&files->file_lock);
+	fdt = files_fdtable(files);
+	if (close->fd >= fdt->max_fds) {
+		spin_unlock(&files->file_lock);
+		goto err;
+	}
+	file = fdt->fd[close->fd];
+	if (!file) {
+		spin_unlock(&files->file_lock);
+		goto err;
+	}
+
+	if (file->f_op == &io_uring_fops) {
+		spin_unlock(&files->file_lock);
+		file = NULL;
+		goto err;
 	}
 
 	/* if the file has a flush method, be safe and punt to async */
-	if (close->put_file->f_op->flush && force_nonblock) {
-		/* not safe to cancel at this point */
-		req->work.flags |= IO_WQ_WORK_NO_CANCEL;
-		/* was never set, but play safe */
-		req->flags &= ~REQ_F_NOWAIT;
-		/* avoid grabbing files - we don't need the files */
-		req->flags |= REQ_F_NO_FILE_TABLE;
+	if (file->f_op->flush && force_nonblock) {
+		spin_unlock(&files->file_lock);
 		return -EAGAIN;
 	}
 
+	ret = __close_fd_get_file(close->fd, &file);
+	spin_unlock(&files->file_lock);
+	if (ret < 0) {
+		if (ret == -ENOENT)
+			ret = -EBADF;
+		goto err;
+	}
+
 	/* No ->flush() or already async, safely close from here */
-	ret = filp_close(close->put_file, req->work.identity->files);
+	ret = filp_close(file, current->files);
+err:
 	if (ret < 0)
 		req_set_fail_links(req);
-	fput(close->put_file);
-	close->put_file = NULL;
+	if (file)
+		fput(file);
 	__io_req_complete(req, ret, 0, cs);
 	return 0;
 }

From patchwork Wed Mar 10 11:30:44 2021
X-Patchwork-Submitter: Pavel Begunkov
X-Patchwork-Id: 397433
From: Pavel Begunkov
To: stable@vger.kernel.org
Cc: Jens Axboe
Subject: [PATCH 8/9] io_uring/io-wq: return 2-step work swap scheme
Date: Wed, 10 Mar 2021 11:30:44 +0000
Message-Id: <506ec0ce0b991836bb5132840fd1889126c86c8e.1615375332.git.asml.silence@gmail.com>
X-Mailing-List: stable@vger.kernel.org

commit 5280f7e530f71ba85baf90169393196976ad0e52 upstream

Saving one lock/unlock for io-wq is not super important, but it adds
some ugliness to the code. More importantly, on archs where an atomic
dec that doesn't hit zero won't give the right ordering/barriers,
io_steal_work() may easily get subtly and completely broken.

Return back the 2-step io-wq work exchange and clean it up.
Signed-off-by: Pavel Begunkov
Signed-off-by: Jens Axboe
---
 fs/io-wq.c    | 16 ++++++----------
 fs/io-wq.h    |  4 ++--
 fs/io_uring.c | 26 ++++-----------------------
 3 files changed, 12 insertions(+), 34 deletions(-)

diff --git a/fs/io-wq.c b/fs/io-wq.c
index 2e2f14f42bf2..63ef195b1acb 100644
--- a/fs/io-wq.c
+++ b/fs/io-wq.c
@@ -555,23 +555,21 @@ static void io_worker_handle_work(struct io_worker *worker)
 
 		/* handle a whole dependent link */
 		do {
-			struct io_wq_work *old_work, *next_hashed, *linked;
+			struct io_wq_work *next_hashed, *linked;
 			unsigned int hash = io_get_work_hash(work);
 
 			next_hashed = wq_next_work(work);
 			io_impersonate_work(worker, work);
+			wq->do_work(work);
+			io_assign_current_work(worker, NULL);
 
-			old_work = work;
-			linked = wq->do_work(work);
-
+			linked = wq->free_work(work);
 			work = next_hashed;
 			if (!work && linked && !io_wq_is_hashed(linked)) {
 				work = linked;
 				linked = NULL;
 			}
 			io_assign_current_work(worker, work);
-			wq->free_work(old_work);
-
 			if (linked)
 				io_wqe_enqueue(wqe, linked);
 
@@ -850,11 +848,9 @@ static void io_run_cancel(struct io_wq_work *work, struct io_wqe *wqe)
 	struct io_wq *wq = wqe->wq;
 
 	do {
-		struct io_wq_work *old_work = work;
-
 		work->flags |= IO_WQ_WORK_CANCEL;
-		work = wq->do_work(work);
-		wq->free_work(old_work);
+		wq->do_work(work);
+		work = wq->free_work(work);
 	} while (work);
 }
 
diff --git a/fs/io-wq.h b/fs/io-wq.h
index e1ffb80a4a1d..e37a0f217cc8 100644
--- a/fs/io-wq.h
+++ b/fs/io-wq.h
@@ -106,8 +106,8 @@ static inline struct io_wq_work *wq_next_work(struct io_wq_work *work)
 	return container_of(work->list.next, struct io_wq_work, list);
 }
 
-typedef void (free_work_fn)(struct io_wq_work *);
-typedef struct io_wq_work *(io_wq_work_fn)(struct io_wq_work *);
+typedef struct io_wq_work *(free_work_fn)(struct io_wq_work *);
+typedef void (io_wq_work_fn)(struct io_wq_work *);
 
 struct io_wq_data {
 	struct user_struct *user;
 
diff --git a/fs/io_uring.c b/fs/io_uring.c
index 5ebc05f41c19..5e9bff1eeaa0 100644
--- a/fs/io_uring.c
+++ b/fs/io_uring.c
@@ -2365,22 +2365,6 @@ static inline void io_put_req_deferred(struct io_kiocb *req, int refs)
 		io_free_req_deferred(req);
 }
 
-static struct io_wq_work *io_steal_work(struct io_kiocb *req)
-{
-	struct io_kiocb *nxt;
-
-	/*
-	 * A ref is owned by io-wq in which context we're. So, if that's the
-	 * last one, it's safe to steal next work. False negatives are Ok,
-	 * it just will be re-punted async in io_put_work()
-	 */
-	if (refcount_read(&req->refs) != 1)
-		return NULL;
-
-	nxt = io_req_find_next(req);
-	return nxt ? &nxt->work : NULL;
-}
-
 static void io_double_put_req(struct io_kiocb *req)
 {
 	/* drop both submit and complete references */
@@ -6378,7 +6362,7 @@ static int io_issue_sqe(struct io_kiocb *req, bool force_nonblock,
 	return 0;
 }
 
-static struct io_wq_work *io_wq_submit_work(struct io_wq_work *work)
+static void io_wq_submit_work(struct io_wq_work *work)
 {
 	struct io_kiocb *req = container_of(work, struct io_kiocb, work);
 	struct io_kiocb *timeout;
@@ -6429,8 +6413,6 @@ static struct io_wq_work *io_wq_submit_work(struct io_wq_work *work)
 		if (lock_ctx)
 			mutex_unlock(&lock_ctx->uring_lock);
 	}
-
-	return io_steal_work(req);
 }
 
 static inline struct file *io_file_from_index(struct io_ring_ctx *ctx,
@@ -8062,12 +8044,12 @@ static int io_sqe_files_update(struct io_ring_ctx *ctx, void __user *arg,
 	return __io_sqe_files_update(ctx, &up, nr_args);
 }
 
-static void io_free_work(struct io_wq_work *work)
+static struct io_wq_work *io_free_work(struct io_wq_work *work)
 {
 	struct io_kiocb *req = container_of(work, struct io_kiocb, work);
 
-	/* Consider that io_steal_work() relies on this ref */
-	io_put_req(req);
+	req = io_put_req_find_next(req);
+	return req ? &req->work : NULL;
 }
 
 static int io_init_wq_offload(struct io_ring_ctx *ctx,