From patchwork Thu Apr 14 13:09:09 2022
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
X-Patchwork-Submitter: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
X-Patchwork-Id: 562593
Return-Path: <stable-owner@kernel.org>
X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on
 aws-us-west-2-korg-lkml-1.web.codeaurora.org
Received: from vger.kernel.org (vger.kernel.org [23.128.96.18])
 by smtp.lore.kernel.org (Postfix) with ESMTP id C0850C433F5
 for <stable@archiver.kernel.org>; Thu, 14 Apr 2022 13:18:09 +0000 (UTC)
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
 id S244010AbiDNNUb (ORCPT <rfc822;stable@archiver.kernel.org>);
 Thu, 14 Apr 2022 09:20:31 -0400
Received: from lindbergh.monkeyblade.net ([23.128.96.19]:39044 "EHLO
 lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
 with ESMTP id S243894AbiDNNTH (ORCPT
 <rfc822;stable@vger.kernel.org>); Thu, 14 Apr 2022 09:19:07 -0400
Received: from dfw.source.kernel.org (dfw.source.kernel.org [139.178.84.217])
 by lindbergh.monkeyblade.net (Postfix) with ESMTPS id BE60C92D20;
 Thu, 14 Apr 2022 06:16:29 -0700 (PDT)
Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140])
 (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits))
 (No client certificate requested)
 by dfw.source.kernel.org (Postfix) with ESMTPS id 585AD610A6;
 Thu, 14 Apr 2022 13:16:29 +0000 (UTC)
Received: by smtp.kernel.org (Postfix) with ESMTPSA id 5630AC385A5;
 Thu, 14 Apr 2022 13:16:28 +0000 (UTC)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=linuxfoundation.org;
 s=korg; t=1649942188;
 bh=ppHpru4VsQzJxITMdZEUGtbM4eqf231wGeMtdnZLn78=;
 h=From:To:Cc:Subject:Date:In-Reply-To:References:From;
 b=g64+seu6nn0fCpDQHFU4ryrIzKOnqie9553zscwy7EC5ruxqFWXFbL2qovqTZVU7C
 fbt6zUQ7v6jqiqAKKdj9MyFP6sZ6yN9BwnBrH8W2Bba20wLCvm45oIaUvfnifNUTtS
 vSrHoEsYhHbjt8PdgzuizKxO/ScKj4tNrvyWkFHQ=
From: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
To: linux-kernel@vger.kernel.org
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>, stable@vger.kernel.org,
 Lars Ellenberg <lars.ellenberg@linbit.com>, =?utf-8?q?Christoph_B=C3=B6hmwa?=
 =?utf-8?q?lder?=  <christoph.boehmwalder@linbit.com>,
 Jens Axboe <axboe@kernel.dk>
Subject: [PATCH 4.19 046/338] drbd: fix potential silent data corruption
Date: Thu, 14 Apr 2022 15:09:09 +0200
Message-Id: <20220414110840.207572972@linuxfoundation.org>
X-Mailer: git-send-email 2.35.2
In-Reply-To: <20220414110838.883074566@linuxfoundation.org>
References: <20220414110838.883074566@linuxfoundation.org>
User-Agent: quilt/0.66
Precedence: bulk
List-ID: <stable.vger.kernel.org>
X-Mailing-List: stable@vger.kernel.org

From: Lars Ellenberg <lars.ellenberg@linbit.com>

commit f4329d1f848ac35757d9cc5487669d19dfc5979c upstream.

Scenario:
---------

A bio chain is generated by blk_queue_split().
One of the split bios fails and propagates its error status to the "parent" bio.
But then the (last part of the) parent bio itself completes without error.

complete_master_bio() would then clobber the already recorded error status
with BLK_STS_OK, causing silent data corruption.
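
The scenario above can be modelled in user space. The following is only a
minimal sketch, not kernel code: struct model_bio, the MODEL_STS_* values,
model_errno_to_status() and the two complete_*() helpers are hypothetical
stand-ins for struct bio, blk_status_t, errno_to_blk_status() and
complete_master_bio(). It assumes the failed split has already stored its
error in the parent before the parent's own completion runs.

```c
/* Hypothetical user-space stand-ins; these are NOT the kernel definitions. */
typedef int model_status_t;
#define MODEL_STS_OK    0	/* plays the role of BLK_STS_OK */
#define MODEL_STS_IOERR 10	/* arbitrary non-zero error value */

struct model_bio {
	model_status_t bi_status;
};

/* Stand-in for errno_to_blk_status(): 0 maps to OK, anything else to IOERR. */
static model_status_t model_errno_to_status(int error)
{
	return error ? MODEL_STS_IOERR : MODEL_STS_OK;
}

/* Unconditional store: overwrites whatever status was recorded before. */
static void complete_unconditional(struct model_bio *bio, int error)
{
	bio->bi_status = model_errno_to_status(error);
}

/* Conditional store: only writes when this completion carries an error. */
static void complete_only_on_error(struct model_bio *bio, int error)
{
	if (error)
		bio->bi_status = model_errno_to_status(error);
}

/*
 * A split fails and records its error in the parent, then the last part
 * of the parent completes without error. Returns the final parent status.
 */
static model_status_t run_scenario(void (*complete)(struct model_bio *, int))
{
	struct model_bio parent = { .bi_status = MODEL_STS_OK };

	parent.bi_status = MODEL_STS_IOERR;	/* error propagated by failed split */
	complete(&parent, 0);			/* parent's own part succeeded */
	return parent.bi_status;
}
```

In this model, run_scenario(complete_unconditional) ends with MODEL_STS_OK
(the error is lost), while run_scenario(complete_only_on_error) keeps
MODEL_STS_IOERR.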

Reproducer:
-----------

How to trigger this in the real world within seconds:

DRBD on top of degraded parity raid,
small stripe_cache_size, large read_ahead setting.
Drop the page cache (any of: sysctl vm.drop_caches=1, fadvise "DONTNEED",
umount and mount again, "reboot").

Cause significant read ahead.

Large read ahead request is split by blk_queue_split().
Parts of the read ahead that are already in the stripe cache,
or find an available stripe cache to use, can be serviced.
Parts of the read ahead that would need "too much work",
that is, would have to wait for a "stripe_head" to become available,
are rejected immediately.

For larger read ahead requests that are split into many pieces, it is very
likely that some "splits" will be serviced, but then the stripe cache is
exhausted/busy, and the remaining ones will be rejected.

Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
Signed-off-by: Christoph Böhmwalder <christoph.boehmwalder@linbit.com>
Cc: <stable@vger.kernel.org> # 4.13.x
Link: https://lore.kernel.org/r/20220330185551.3553196-1-christoph.boehmwalder@linbit.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 drivers/block/drbd/drbd_req.c |    3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

--- a/drivers/block/drbd/drbd_req.c
+++ b/drivers/block/drbd/drbd_req.c
@@ -207,7 +207,8 @@ void start_new_tl_epoch(struct drbd_conn
 void complete_master_bio(struct drbd_device *device,
 		struct bio_and_error *m)
 {
-	m->bio->bi_status = errno_to_blk_status(m->error);
+	if (unlikely(m->error))
+		m->bio->bi_status = errno_to_blk_status(m->error);
 	bio_endio(m->bio);
 	dec_ap_bio(device);
 }