From patchwork Mon Jun 10 10:43:21 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: John Garry
X-Patchwork-Id: 804098
From: John Garry
To: axboe@kernel.dk, kbusch@kernel.org, hch@lst.de, sagi@grimberg.me,
 jejb@linux.ibm.com, martin.petersen@oracle.com, viro@zeniv.linux.org.uk,
 brauner@kernel.org, dchinner@redhat.com, jack@suse.cz
Cc: djwong@kernel.org, linux-block@vger.kernel.org,
 linux-kernel@vger.kernel.org, linux-nvme@lists.infradead.org,
 linux-fsdevel@vger.kernel.org, tytso@mit.edu, jbongio@google.com,
 linux-scsi@vger.kernel.org, ojaswin@linux.ibm.com, linux-aio@kvack.org,
 linux-btrfs@vger.kernel.org, io-uring@vger.kernel.org, nilay@linux.ibm.com,
 ritesh.list@gmail.com, willy@infradead.org, agk@redhat.com,
 snitzer@kernel.org, mpatocka@redhat.com, dm-devel@lists.linux.dev,
 hare@suse.de, John Garry
Subject: [PATCH v8 02/10] block: Generalize chunk_sectors support as boundary support
Date: Mon, 10 Jun 2024 10:43:21 +0000
Message-Id: <20240610104329.3555488-3-john.g.garry@oracle.com>
X-Mailer: git-send-email 2.31.1
In-Reply-To: <20240610104329.3555488-1-john.g.garry@oracle.com>
References: <20240610104329.3555488-1-john.g.garry@oracle.com>
Precedence: bulk
X-Mailing-List: linux-scsi@vger.kernel.org

The purpose of the chunk_sectors limit is to ensure that a mergeable request
fits within the boundary of the chunk_sectors value.
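
As a rough userspace sketch of that boundary arithmetic (not kernel code: the
names is_pow2() and boundary_sectors_left() and the fixed-width types are
illustrative assumptions), the number of sectors left before the next
boundary could be computed as below. It follows the logic of
blk_boundary_sectors_left() in this patch, with a plain modulo standing in
for sector_div():

```c
#include <assert.h>
#include <stdint.h>

/* Userspace stand-in for the kernel's is_power_of_2() (illustrative name). */
static int is_pow2(uint32_t n)
{
	return n != 0 && (n & (n - 1)) == 0;
}

/*
 * Sectors left before the next boundary, counted from a sector offset.
 * Hypothetical userspace mirror of blk_boundary_sectors_left().
 */
static uint32_t boundary_sectors_left(uint64_t offset, uint32_t boundary_sectors)
{
	/* A zero boundary means "no limit"; callers are assumed to check first. */
	assert(boundary_sectors != 0);

	/* General case: remainder by division (sector_div() in the kernel). */
	if (!is_pow2(boundary_sectors))
		return boundary_sectors - (uint32_t)(offset % boundary_sectors);

	/* Power-of-2 fast path: the remainder is just the low bits. */
	return boundary_sectors - (uint32_t)(offset & (boundary_sectors - 1));
}
```

For example, with a 128-sector boundary, an I/O starting at sector 100 has 28
sectors left before the boundary, so the merge code would cap it at
min(max_sectors, 28).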

Such a feature will be useful for other request_queue boundary limits, so
generalize the chunk_sectors merge code.

This idea was proposed by Hannes Reinecke.

Signed-off-by: John Garry
---
 block/blk-merge.c      | 20 ++++++++++++++------
 drivers/md/dm.c        |  2 +-
 include/linux/blkdev.h | 13 +++++++------
 3 files changed, 22 insertions(+), 13 deletions(-)

diff --git a/block/blk-merge.c b/block/blk-merge.c
index 8957e08e020c..68969e27c831 100644
--- a/block/blk-merge.c
+++ b/block/blk-merge.c
@@ -154,6 +154,11 @@ static struct bio *bio_split_write_zeroes(struct bio *bio,
 	return bio_split(bio, lim->max_write_zeroes_sectors, GFP_NOIO, bs);
 }
 
+static inline unsigned int blk_boundary_sectors(const struct queue_limits *lim)
+{
+	return lim->chunk_sectors;
+}
+
 /*
  * Return the maximum number of sectors from the start of a bio that may be
  * submitted as a single request to a block device. If enough sectors remain,
@@ -167,12 +172,13 @@ static inline unsigned get_max_io_size(struct bio *bio,
 {
 	unsigned pbs = lim->physical_block_size >> SECTOR_SHIFT;
 	unsigned lbs = lim->logical_block_size >> SECTOR_SHIFT;
+	unsigned boundary_sectors = blk_boundary_sectors(lim);
 	unsigned max_sectors = lim->max_sectors, start, end;
 
-	if (lim->chunk_sectors) {
+	if (boundary_sectors) {
 		max_sectors = min(max_sectors,
-			blk_chunk_sectors_left(bio->bi_iter.bi_sector,
-					       lim->chunk_sectors));
+			blk_boundary_sectors_left(bio->bi_iter.bi_sector,
+					      boundary_sectors));
 	}
 
 	start = bio->bi_iter.bi_sector & (pbs - 1);
@@ -588,19 +594,21 @@ static inline unsigned int blk_rq_get_max_sectors(struct request *rq,
 						  sector_t offset)
 {
 	struct request_queue *q = rq->q;
-	unsigned int max_sectors;
+	struct queue_limits *lim = &q->limits;
+	unsigned int max_sectors, boundary_sectors;
 
 	if (blk_rq_is_passthrough(rq))
 		return q->limits.max_hw_sectors;
 
+	boundary_sectors = blk_boundary_sectors(lim);
 	max_sectors = blk_queue_get_max_sectors(rq);
 
-	if (!q->limits.chunk_sectors ||
+	if (!boundary_sectors ||
 	    req_op(rq) == REQ_OP_DISCARD ||
 	    req_op(rq) == REQ_OP_SECURE_ERASE)
 		return max_sectors;
 	return min(max_sectors,
-		   blk_chunk_sectors_left(offset, q->limits.chunk_sectors));
+		   blk_boundary_sectors_left(offset, boundary_sectors));
 }
 
 static inline int ll_new_hw_segment(struct request *req, struct bio *bio,
diff --git a/drivers/md/dm.c b/drivers/md/dm.c
index 13037d6a6f62..b648253c2300 100644
--- a/drivers/md/dm.c
+++ b/drivers/md/dm.c
@@ -1188,7 +1188,7 @@ static sector_t __max_io_len(struct dm_target *ti, sector_t sector,
 		return len;
 	return min_t(sector_t, len,
 		min(max_sectors ? : queue_max_sectors(ti->table->md->queue),
-		    blk_chunk_sectors_left(target_offset, max_granularity)));
+		    blk_boundary_sectors_left(target_offset, max_granularity)));
 }
 
 static inline sector_t max_io_len(struct dm_target *ti, sector_t sector)
diff --git a/include/linux/blkdev.h b/include/linux/blkdev.h
index ac8e0cb2353a..ddff90766f9f 100644
--- a/include/linux/blkdev.h
+++ b/include/linux/blkdev.h
@@ -866,14 +866,15 @@ static inline bool bio_straddles_zones(struct bio *bio)
 }
 
 /*
- * Return how much of the chunk is left to be used for I/O at a given offset.
+ * Return how much within the boundary is left to be used for I/O at a given
+ * offset.
  */
-static inline unsigned int blk_chunk_sectors_left(sector_t offset,
-		unsigned int chunk_sectors)
+static inline unsigned int blk_boundary_sectors_left(sector_t offset,
+		unsigned int boundary_sectors)
 {
-	if (unlikely(!is_power_of_2(chunk_sectors)))
-		return chunk_sectors - sector_div(offset, chunk_sectors);
-	return chunk_sectors - (offset & (chunk_sectors - 1));
+	if (unlikely(!is_power_of_2(boundary_sectors)))
+		return boundary_sectors - sector_div(offset, boundary_sectors);
+	return boundary_sectors - (offset & (boundary_sectors - 1));
 }
 
 /**

From patchwork Mon Jun 10 10:43:23 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: John Garry
X-Patchwork-Id: 804096
From: John Garry
To: axboe@kernel.dk, kbusch@kernel.org, hch@lst.de, sagi@grimberg.me,
 jejb@linux.ibm.com, martin.petersen@oracle.com, viro@zeniv.linux.org.uk,
 brauner@kernel.org, dchinner@redhat.com, jack@suse.cz
Cc: djwong@kernel.org, linux-block@vger.kernel.org,
 linux-kernel@vger.kernel.org, linux-nvme@lists.infradead.org,
 linux-fsdevel@vger.kernel.org, tytso@mit.edu, jbongio@google.com,
 linux-scsi@vger.kernel.org, ojaswin@linux.ibm.com, linux-aio@kvack.org,
 linux-btrfs@vger.kernel.org, io-uring@vger.kernel.org, nilay@linux.ibm.com,
 ritesh.list@gmail.com, willy@infradead.org, agk@redhat.com,
 snitzer@kernel.org, mpatocka@redhat.com, dm-devel@lists.linux.dev,
 hare@suse.de, Prasad Singamsetty, John Garry
Subject: [PATCH v8 04/10] fs: Add initial atomic write support info to statx
Date: Mon, 10 Jun 2024 10:43:23 +0000
Message-Id: <20240610104329.3555488-5-john.g.garry@oracle.com>
X-Mailer: git-send-email 2.31.1
In-Reply-To: <20240610104329.3555488-1-john.g.garry@oracle.com>
References: <20240610104329.3555488-1-john.g.garry@oracle.com>
Precedence: bulk
X-Mailing-List: linux-scsi@vger.kernel.org

From: Prasad Singamsetty

Extend statx system call to return additional info for atomic write support
for a file.
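
From userspace, the new fields could be read roughly as sketched below. This
is only an illustration under stated assumptions: the EX_* flag values and
the atomic write field offsets are copied from the uapi changes in this
patch, the stx_mask and stx_attributes offsets come from the existing struct
statx layout, and struct atomic_write_info, load_u32(), load_u64() and
decode_atomic_write_info() are hypothetical names. The input is assumed to
be a raw 256-byte struct statx as filled in by statx(2) on a kernel carrying
this series:

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Flag values as defined by this patch (EX_ prefix is illustrative). */
#define EX_STATX_WRITE_ATOMIC      0x00010000U
#define EX_STATX_ATTR_WRITE_ATOMIC 0x00400000ULL

/* Byte offsets within struct statx. */
enum {
	EX_OFF_MASK         = 0x00, /* stx_mask */
	EX_OFF_ATTRIBUTES   = 0x08, /* stx_attributes */
	EX_OFF_UNIT_MIN     = 0xa8, /* stx_atomic_write_unit_min */
	EX_OFF_UNIT_MAX     = 0xac, /* stx_atomic_write_unit_max */
	EX_OFF_SEGMENTS_MAX = 0xb0, /* stx_atomic_write_segments_max */
};

/* Hypothetical container for the decoded values. */
struct atomic_write_info {
	int      reported;     /* STATX_WRITE_ATOMIC set in stx_mask */
	int      supported;    /* STATX_ATTR_WRITE_ATOMIC set in stx_attributes */
	uint32_t unit_min;     /* bytes */
	uint32_t unit_max;     /* bytes */
	uint32_t segments_max;
};

static uint32_t load_u32(const unsigned char *buf, size_t off)
{
	uint32_t v;

	memcpy(&v, buf + off, sizeof(v)); /* native byte order, no aliasing */
	return v;
}

static uint64_t load_u64(const unsigned char *buf, size_t off)
{
	uint64_t v;

	memcpy(&v, buf + off, sizeof(v));
	return v;
}

/* Decode the atomic write fields from a raw 256-byte struct statx image. */
static struct atomic_write_info decode_atomic_write_info(const unsigned char *stx)
{
	struct atomic_write_info info = { 0 };

	assert(stx != NULL);
	info.reported = (load_u32(stx, EX_OFF_MASK) & EX_STATX_WRITE_ATOMIC) != 0;
	info.supported = (load_u64(stx, EX_OFF_ATTRIBUTES) &
			  EX_STATX_ATTR_WRITE_ATOMIC) != 0;

	/* Only trust the fields if the kernel says it filled them in. */
	if (info.reported) {
		info.unit_min = load_u32(stx, EX_OFF_UNIT_MIN);
		info.unit_max = load_u32(stx, EX_OFF_UNIT_MAX);
		info.segments_max = load_u32(stx, EX_OFF_SEGMENTS_MAX);
	}
	return info;
}
```

A caller would request STATX_WRITE_ATOMIC in the statx mask and treat the
file as atomic-write capable only when both the mask bit and the attribute
bit come back set.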

Helper function generic_fill_statx_atomic_writes() can be used by FSes to
fill in the relevant statx fields. For now atomic_write_segments_max will
always be 1, otherwise some rules would need to be imposed on iovec length
and alignment, which we don't want now.

Signed-off-by: Prasad Singamsetty
jpg: relocate bdev support to another patch
Signed-off-by: John Garry
---
 fs/stat.c                 | 34 ++++++++++++++++++++++++++++++++++
 include/linux/fs.h        |  3 +++
 include/linux/stat.h      |  3 +++
 include/uapi/linux/stat.h | 12 ++++++++++--
 4 files changed, 50 insertions(+), 2 deletions(-)

diff --git a/fs/stat.c b/fs/stat.c
index 70bd3e888cfa..72d0e6357b91 100644
--- a/fs/stat.c
+++ b/fs/stat.c
@@ -89,6 +89,37 @@ void generic_fill_statx_attr(struct inode *inode, struct kstat *stat)
 }
 EXPORT_SYMBOL(generic_fill_statx_attr);
 
+/**
+ * generic_fill_statx_atomic_writes - Fill in atomic writes statx attributes
+ * @stat:	Where to fill in the attribute flags
+ * @unit_min:	Minimum supported atomic write length in bytes
+ * @unit_max:	Maximum supported atomic write length in bytes
+ *
+ * Fill in the STATX{_ATTR}_WRITE_ATOMIC flags in the kstat structure from
+ * atomic write unit_min and unit_max values.
+ */
+void generic_fill_statx_atomic_writes(struct kstat *stat,
+				      unsigned int unit_min,
+				      unsigned int unit_max)
+{
+	/* Confirm that the request type is known */
+	stat->result_mask |= STATX_WRITE_ATOMIC;
+
+	/* Confirm that the file attribute type is known */
+	stat->attributes_mask |= STATX_ATTR_WRITE_ATOMIC;
+
+	if (unit_min) {
+		stat->atomic_write_unit_min = unit_min;
+		stat->atomic_write_unit_max = unit_max;
+		/* Initially only allow 1x segment */
+		stat->atomic_write_segments_max = 1;
+
+		/* Confirm atomic writes are actually supported */
+		stat->attributes |= STATX_ATTR_WRITE_ATOMIC;
+	}
+}
+EXPORT_SYMBOL_GPL(generic_fill_statx_atomic_writes);
+
 /**
  * vfs_getattr_nosec - getattr without security checks
  * @path: file to get attributes from
@@ -659,6 +690,9 @@ cp_statx(const struct kstat *stat, struct statx __user *buffer)
 	tmp.stx_dio_mem_align = stat->dio_mem_align;
 	tmp.stx_dio_offset_align = stat->dio_offset_align;
 	tmp.stx_subvol = stat->subvol;
+	tmp.stx_atomic_write_unit_min = stat->atomic_write_unit_min;
+	tmp.stx_atomic_write_unit_max = stat->atomic_write_unit_max;
+	tmp.stx_atomic_write_segments_max = stat->atomic_write_segments_max;
 
 	return copy_to_user(buffer, &tmp, sizeof(tmp)) ? -EFAULT : 0;
 }
diff --git a/include/linux/fs.h b/include/linux/fs.h
index e049414bef7d..db26b4a70c62 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -3235,6 +3235,9 @@ extern const struct inode_operations page_symlink_inode_operations;
 extern void kfree_link(void *);
 void generic_fillattr(struct mnt_idmap *, u32, struct inode *, struct kstat *);
 void generic_fill_statx_attr(struct inode *inode, struct kstat *stat);
+void generic_fill_statx_atomic_writes(struct kstat *stat,
+				      unsigned int unit_min,
+				      unsigned int unit_max);
 extern int vfs_getattr_nosec(const struct path *, struct kstat *, u32, unsigned int);
 extern int vfs_getattr(const struct path *, struct kstat *, u32, unsigned int);
 void __inode_add_bytes(struct inode *inode, loff_t bytes);
diff --git a/include/linux/stat.h b/include/linux/stat.h
index bf92441dbad2..3d900c86981c 100644
--- a/include/linux/stat.h
+++ b/include/linux/stat.h
@@ -54,6 +54,9 @@ struct kstat {
 	u32		dio_offset_align;
 	u64		change_cookie;
 	u64		subvol;
+	u32		atomic_write_unit_min;
+	u32		atomic_write_unit_max;
+	u32		atomic_write_segments_max;
 };
 
 /* These definitions are internal to the kernel for now. Mainly used by nfsd. */
diff --git a/include/uapi/linux/stat.h b/include/uapi/linux/stat.h
index 67626d535316..887a25286441 100644
--- a/include/uapi/linux/stat.h
+++ b/include/uapi/linux/stat.h
@@ -126,9 +126,15 @@ struct statx {
 	__u64	stx_mnt_id;
 	__u32	stx_dio_mem_align;	/* Memory buffer alignment for direct I/O */
 	__u32	stx_dio_offset_align;	/* File offset alignment for direct I/O */
-	__u64	stx_subvol;	/* Subvolume identifier */
 	/* 0xa0 */
-	__u64	__spare3[11];	/* Spare space for future expansion */
+	__u64	stx_subvol;	/* Subvolume identifier */
+	__u32	stx_atomic_write_unit_min;	/* Min atomic write unit in bytes */
+	__u32	stx_atomic_write_unit_max;	/* Max atomic write unit in bytes */
+	/* 0xb0 */
+	__u32	stx_atomic_write_segments_max;	/* Max atomic write segment count */
+	__u32	__spare1[1];
+	/* 0xb8 */
+	__u64	__spare3[9];	/* Spare space for future expansion */
 	/* 0x100 */
 };
 
@@ -157,6 +163,7 @@ struct statx {
 #define STATX_DIOALIGN		0x00002000U	/* Want/got direct I/O alignment info */
 #define STATX_MNT_ID_UNIQUE	0x00004000U	/* Want/got extended stx_mount_id */
 #define STATX_SUBVOL		0x00008000U	/* Want/got stx_subvol */
+#define STATX_WRITE_ATOMIC	0x00010000U	/* Want/got atomic_write_* fields */
 
 #define STATX__RESERVED		0x80000000U	/* Reserved for future struct statx expansion */
 
@@ -192,6 +199,7 @@ struct statx {
 #define STATX_ATTR_MOUNT_ROOT		0x00002000 /* Root of a mount */
 #define STATX_ATTR_VERITY		0x00100000 /* [I] Verity protected file */
 #define STATX_ATTR_DAX			0x00200000 /* File is currently in DAX state */
+#define STATX_ATTR_WRITE_ATOMIC		0x00400000 /* File supports atomic write operations */
 
 
 #endif /* _UAPI_LINUX_STAT_H */

From patchwork Mon Jun 10 10:43:24 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: John Garry
X-Patchwork-Id: 804095
Received: from mx0b-00069f02.pphosted.com (mx0b-00069f02.pphosted.com [205.220.177.32]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No
	client certificate requested)
	by smtp.subspace.kernel.org (Postfix) with ESMTPS id B5E5C74050;
	Mon, 10 Jun 2024 10:45:48 +0000 (UTC)
From: John Garry
To: axboe@kernel.dk, kbusch@kernel.org, hch@lst.de, sagi@grimberg.me,
	jejb@linux.ibm.com, martin.petersen@oracle.com, viro@zeniv.linux.org.uk,
	brauner@kernel.org, dchinner@redhat.com, jack@suse.cz
Cc: djwong@kernel.org, linux-block@vger.kernel.org,
	linux-kernel@vger.kernel.org, linux-nvme@lists.infradead.org,
	linux-fsdevel@vger.kernel.org, tytso@mit.edu, jbongio@google.com,
	linux-scsi@vger.kernel.org, ojaswin@linux.ibm.com, linux-aio@kvack.org,
	linux-btrfs@vger.kernel.org, io-uring@vger.kernel.org,
	nilay@linux.ibm.com, ritesh.list@gmail.com, willy@infradead.org,
	agk@redhat.com, snitzer@kernel.org, mpatocka@redhat.com,
	dm-devel@lists.linux.dev, hare@suse.de, John Garry, Himanshu Madhani
Subject: [PATCH v8 05/10] block: Add core atomic write support
Date: Mon, 10 Jun 2024 10:43:24 +0000
Message-Id: <20240610104329.3555488-6-john.g.garry@oracle.com>
X-Mailer: git-send-email 2.31.1
In-Reply-To: <20240610104329.3555488-1-john.g.garry@oracle.com>
References: <20240610104329.3555488-1-john.g.garry@oracle.com>
Precedence: bulk
X-Mailing-List: linux-scsi@vger.kernel.org
MIME-Version: 1.0

Add atomic write support, as follows:
- add helper functions to get request_queue atomic write limits
- report request_queue atomic write support limits
  to sysfs and update Doc
- support to safely merge atomic writes
- deal with splitting atomic writes
- misc helper functions
- add a per-request atomic write flag

New request_queue limits are added, as follows:
- atomic_write_hw_max is set by the block driver and is the maximum length
  of an atomic write which the device may support. It is not
  necessarily a power-of-2.
- atomic_write_max_sectors is derived from atomic_write_hw_max and
  max_hw_sectors. It is always a power-of-2. Atomic writes may be merged,
  and atomic_write_max_sectors would be the limit on a merged atomic write
  request size. This value is not capped at max_sectors, as the value in
  max_sectors can be controlled from userspace, and it would only cause
  trouble if userspace could limit atomic_write_unit_max_bytes and the
  other atomic write limits.
- atomic_write_hw_unit_{min,max} are set by the block driver and are the
  min/max length of an atomic write unit which the device may support. They
  both must be a power-of-2. Typically atomic_write_hw_unit_max will hold
  the same value as atomic_write_hw_max.
- atomic_write_unit_{min,max} are derived from
  atomic_write_hw_unit_{min,max}, max_hw_sectors, and block core limits.
  Both min and max values must be a power-of-2.
- atomic_write_hw_boundary is set by the block driver. If non-zero, it
  indicates an LBA space boundary; an atomic write which straddles it is no
  longer executed atomically by the disk. The value must be a power-of-2.
  Note that it would be acceptable to enforce a rule that
  atomic_write_hw_boundary is a multiple of atomic_write_hw_unit_max, but
  the resultant code would be more complicated.

All atomic write limits are set to 0 by default to indicate no atomic write
support. Even though it is assumed by Linux that a logical block can always
be atomically written, we ignore this as it is not of particular interest.
Stacked devices are also not supported for now.
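
For readers who want to see how the derived limits relate to the hardware
limits listed above, here is a minimal userspace sketch of the arithmetic. It
is an illustration only, not the kernel code: struct atomic_limits_sketch,
derive_atomic_limits() and the helper names are hypothetical, the
"max guaranteed BIO size" term is simply passed in as a number, and the
boundary validation rules are left out.

```c
#include <stdint.h>

#define SECTOR_SHIFT 9	/* 512-byte sectors, as in the kernel */

/* Hypothetical stand-in for the queue_limits fields discussed above. */
struct atomic_limits_sketch {
	/* inputs: set by the block driver */
	uint32_t max_hw_sectors;
	uint32_t atomic_write_hw_max;		/* bytes */
	uint32_t atomic_write_hw_unit_min;	/* bytes, power-of-2 */
	uint32_t atomic_write_hw_unit_max;	/* bytes, power-of-2 */
	uint32_t atomic_write_hw_boundary;	/* bytes, 0 or power-of-2 */
	/* outputs: derived limits */
	uint32_t atomic_write_max_sectors;
	uint32_t atomic_write_unit_min;		/* bytes */
	uint32_t atomic_write_unit_max;		/* bytes */
	uint32_t atomic_write_boundary_sectors;
};

static uint64_t min_u64(uint64_t a, uint64_t b)
{
	return a < b ? a : b;
}

/* Largest power of 2 that is <= v; v must be non-zero. */
static uint64_t rounddown_pow2(uint64_t v)
{
	uint64_t p = 1;

	while (p <= v / 2)
		p <<= 1;
	return p;
}

static void derive_atomic_limits(struct atomic_limits_sketch *lim,
				 uint32_t max_guaranteed_bio_bytes)
{
	uint64_t unit_limit;

	/* No usable hardware limit: all derived limits are 0 (unsupported). */
	if (!lim->atomic_write_hw_max || !lim->max_hw_sectors ||
	    !max_guaranteed_bio_bytes) {
		lim->atomic_write_max_sectors = 0;
		lim->atomic_write_unit_min = 0;
		lim->atomic_write_unit_max = 0;
		lim->atomic_write_boundary_sectors = 0;
		return;
	}

	/* The unit limits are capped by what a single unsplit BIO can carry. */
	unit_limit = min_u64((uint64_t)lim->max_hw_sectors << SECTOR_SHIFT,
			     max_guaranteed_bio_bytes);
	unit_limit = rounddown_pow2(unit_limit);

	lim->atomic_write_max_sectors =
		(uint32_t)min_u64(lim->atomic_write_hw_max >> SECTOR_SHIFT,
				  lim->max_hw_sectors);
	lim->atomic_write_unit_min =
		(uint32_t)min_u64(lim->atomic_write_hw_unit_min, unit_limit);
	lim->atomic_write_unit_max =
		(uint32_t)min_u64(lim->atomic_write_hw_unit_max, unit_limit);
	lim->atomic_write_boundary_sectors =
		lim->atomic_write_hw_boundary >> SECTOR_SHIFT;
}
```

Note that only the unit limits are rounded down to a power-of-2 in this
sketch; the merged-request limit is a plain minimum of the hardware maximum
and max_hw_sectors.
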

An atomic write must always be submitted to the block driver as part of a
single request. As such, only a single BIO must be submitted to the block
layer for an atomic write. When a single atomic write BIO is submitted, it
cannot be split. As such, atomic_write_unit_{max, min}_bytes are limited by
the maximum guaranteed BIO size which will not be required to be split.
This max size is calculated from request_queue max segments and the number
of bvecs a BIO can fit, BIO_MAX_VECS. Currently we rely on userspace
issuing a write with iovcnt=1 for pwritev2() - as such, we can rely on each
segment containing PAGE_SIZE of data, apart from the first+last, each of
which can fit a logical block size of data. The first+last will be LBS
length/aligned as we rely on direct IO alignment rules also.

New sysfs files are added to report the following atomic write limits:
- atomic_write_unit_max_bytes - reports atomic_write_unit_max, in bytes
- atomic_write_unit_min_bytes - reports atomic_write_unit_min, in bytes
- atomic_write_boundary_bytes - same as atomic_write_boundary_sectors in
  bytes
- atomic_write_max_bytes - same as atomic_write_max_sectors in bytes

Atomic writes may only be merged with other atomic writes and only under
the following conditions:
- total resultant request length <= atomic_write_max_bytes
- the merged write does not straddle a boundary

Helper function bdev_can_atomic_write() is added to indicate whether
atomic writes may be issued to a bdev. If a bdev is a partition, the
partition start must be aligned with both atomic_write_unit_min and
atomic_write_hw_boundary.

FSes will rely on the block layer to validate that an atomic write BIO
submitted will be of valid size, so add blk_validate_atomic_write_op_size()
for this purpose. Userspace expects an atomic write which is of invalid
size to be rejected with -EINVAL, so add BLK_STS_INVAL for this.
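
Two of the rules described so far can be sketched in standalone C: the
maximum guaranteed BIO size and the size check applied to an atomic write.
This is a hedged illustration rather than the kernel implementation. The
function names and SKETCH_BIO_MAX_VECS are hypothetical (the latter is
assumed to match the kernel's BIO_MAX_VECS), and the guard against a zero
unit_min is an addition made for the sketch.

```c
#include <errno.h>
#include <stdint.h>

/* Assumed to match the kernel's BIO_MAX_VECS; hypothetical name. */
#define SKETCH_BIO_MAX_VECS 256u

/*
 * Bytes guaranteed to fit in one BIO: the first and last segments are only
 * assumed to hold one logical block each, every other segment a full page.
 */
static uint32_t max_guaranteed_bio_bytes(uint32_t max_segments,
					 uint32_t logical_block_size,
					 uint32_t page_size)
{
	uint32_t segs = max_segments < SKETCH_BIO_MAX_VECS ?
			max_segments : SKETCH_BIO_MAX_VECS;
	uint32_t edge_segs = segs < 2 ? segs : 2;
	uint32_t length = edge_segs * logical_block_size;

	if (segs > 2)
		length += (segs - 2) * page_size;

	return length;
}

/*
 * Size rule for an atomic write: no larger than unit_max and a multiple of
 * unit_min. Returns 0 if acceptable, -EINVAL otherwise.
 */
static int atomic_write_size_check(uint32_t len, uint32_t unit_min,
				   uint32_t unit_max)
{
	/* Sketch-only guard: a zero unit_min means no atomic write support. */
	if (!unit_min)
		return -EINVAL;

	if (len > unit_max)
		return -EINVAL;

	if (len % unit_min)
		return -EINVAL;

	return 0;
}
```

For example, with 128 segments, a 512-byte logical block size and 4096-byte
pages, the guaranteed size is 2 * 512 + 126 * 4096 bytes, which is then
rounded down to a power-of-2 before it caps the unit limits.
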
Also use BLK_STS_INVAL for when a BIO needs to be split, as this should mean an invalid size BIO. Flag REQ_ATOMIC is used for indicating an atomic write. Co-developed-by: Himanshu Madhani Signed-off-by: Himanshu Madhani Signed-off-by: John Garry --- Documentation/ABI/stable/sysfs-block | 53 ++++++++++++++++++++ block/blk-core.c | 19 +++++++ block/blk-merge.c | 50 +++++++++++++++++-- block/blk-settings.c | 75 ++++++++++++++++++++++++++++ block/blk-sysfs.c | 33 ++++++++++++ block/blk.h | 3 ++ include/linux/blk_types.h | 8 ++- include/linux/blkdev.h | 55 ++++++++++++++++++++ 8 files changed, 291 insertions(+), 5 deletions(-) diff --git a/Documentation/ABI/stable/sysfs-block b/Documentation/ABI/stable/sysfs-block index 831f19a32e08..cea8856f798d 100644 --- a/Documentation/ABI/stable/sysfs-block +++ b/Documentation/ABI/stable/sysfs-block @@ -21,6 +21,59 @@ Description: device is offset from the internal allocation unit's natural alignment. +What: /sys/block//atomic_write_max_bytes +Date: February 2024 +Contact: Himanshu Madhani +Description: + [RO] This parameter specifies the maximum atomic write + size reported by the device. This parameter is relevant + for merging of writes, where a merged atomic write + operation must not exceed this number of bytes. + This parameter may be greater than the value in + atomic_write_unit_max_bytes as + atomic_write_unit_max_bytes will be rounded down to a + power-of-two and atomic_write_unit_max_bytes may also be + limited by some other queue limits, such as max_segments. + This parameter - along with atomic_write_unit_min_bytes + and atomic_write_unit_max_bytes - will not be larger than + max_hw_sectors_kb, but may be larger than max_sectors_kb. + + +What: /sys/block//atomic_write_unit_min_bytes +Date: February 2024 +Contact: Himanshu Madhani +Description: + [RO] This parameter specifies the smallest block which can + be written atomically with an atomic write operation. 
All + atomic write operations must begin at a + atomic_write_unit_min boundary and must be multiples of + atomic_write_unit_min. This value must be a power-of-two. + + +What: /sys/block//atomic_write_unit_max_bytes +Date: February 2024 +Contact: Himanshu Madhani +Description: + [RO] This parameter defines the largest block which can be + written atomically with an atomic write operation. This + value must be a multiple of atomic_write_unit_min and must + be a power-of-two. This value will not be larger than + atomic_write_max_bytes. + + +What: /sys/block//atomic_write_boundary_bytes +Date: February 2024 +Contact: Himanshu Madhani +Description: + [RO] A device may need to internally split an atomic write I/O + which straddles a given logical block address boundary. This + parameter specifies the size in bytes of the atomic boundary if + one is reported by the device. This value must be a + power-of-two and at least the size as in + atomic_write_unit_max_bytes. + Any attempt to merge atomic write I/Os must not result in a + merged I/O which crosses this boundary (if any). 
+ What: /sys/block//diskseq Date: February 2021 diff --git a/block/blk-core.c b/block/blk-core.c index 82c3ae22d76d..d9f58fe71758 100644 --- a/block/blk-core.c +++ b/block/blk-core.c @@ -174,6 +174,8 @@ static const struct { /* Command duration limit device-side timeout */ [BLK_STS_DURATION_LIMIT] = { -ETIME, "duration limit exceeded" }, + [BLK_STS_INVAL] = { -EINVAL, "invalid" }, + /* everything else not covered above: */ [BLK_STS_IOERR] = { -EIO, "I/O" }, }; @@ -739,6 +741,18 @@ void submit_bio_noacct_nocheck(struct bio *bio) __submit_bio_noacct(bio); } +static blk_status_t blk_validate_atomic_write_op_size(struct request_queue *q, + struct bio *bio) +{ + if (bio->bi_iter.bi_size > queue_atomic_write_unit_max_bytes(q)) + return BLK_STS_INVAL; + + if (bio->bi_iter.bi_size % queue_atomic_write_unit_min_bytes(q)) + return BLK_STS_INVAL; + + return BLK_STS_OK; +} + /** * submit_bio_noacct - re-submit a bio to the block device layer for I/O * @bio: The bio describing the location in memory and on the device. @@ -797,6 +811,11 @@ void submit_bio_noacct(struct bio *bio) switch (bio_op(bio)) { case REQ_OP_READ: case REQ_OP_WRITE: + if (bio->bi_opf & REQ_ATOMIC) { + status = blk_validate_atomic_write_op_size(q, bio); + if (status != BLK_STS_OK) + goto end_io; + } break; case REQ_OP_FLUSH: /* diff --git a/block/blk-merge.c b/block/blk-merge.c index 68969e27c831..b158d31940d1 100644 --- a/block/blk-merge.c +++ b/block/blk-merge.c @@ -154,8 +154,16 @@ static struct bio *bio_split_write_zeroes(struct bio *bio, return bio_split(bio, lim->max_write_zeroes_sectors, GFP_NOIO, bs); } -static inline unsigned int blk_boundary_sectors(const struct queue_limits *lim) +static inline unsigned int blk_boundary_sectors(const struct queue_limits *lim, + bool is_atomic) { + /* + * If chunk_sectors and atomic_write_boundary_sectors are both set, + * then they must be equal. 
+ */ + if (is_atomic) + return lim->atomic_write_boundary_sectors; + return lim->chunk_sectors; } @@ -172,8 +180,18 @@ static inline unsigned get_max_io_size(struct bio *bio, { unsigned pbs = lim->physical_block_size >> SECTOR_SHIFT; unsigned lbs = lim->logical_block_size >> SECTOR_SHIFT; - unsigned boundary_sectors = blk_boundary_sectors(lim); - unsigned max_sectors = lim->max_sectors, start, end; + bool is_atomic = bio->bi_opf & REQ_ATOMIC; + unsigned boundary_sectors = blk_boundary_sectors(lim, is_atomic); + unsigned max_sectors, start, end; + + /* + * We ignore lim->max_sectors for atomic writes because it may less + * than the actual bio size, which we cannot tolerate. + */ + if (is_atomic) + max_sectors = lim->atomic_write_max_sectors; + else + max_sectors = lim->max_sectors; if (boundary_sectors) { max_sectors = min(max_sectors, @@ -311,6 +329,11 @@ struct bio *bio_split_rw(struct bio *bio, const struct queue_limits *lim, *segs = nsegs; return NULL; split: + if (bio->bi_opf & REQ_ATOMIC) { + bio->bi_status = BLK_STS_INVAL; + bio_endio(bio); + return ERR_PTR(-EINVAL); + } /* * We can't sanely support splitting for a REQ_NOWAIT bio. End it * with EAGAIN if splitting is required and return an error pointer. 
@@ -596,11 +619,12 @@ static inline unsigned int blk_rq_get_max_sectors(struct request *rq, struct request_queue *q = rq->q; struct queue_limits *lim = &q->limits; unsigned int max_sectors, boundary_sectors; + bool is_atomic = rq->cmd_flags & REQ_ATOMIC; if (blk_rq_is_passthrough(rq)) return q->limits.max_hw_sectors; - boundary_sectors = blk_boundary_sectors(lim); + boundary_sectors = blk_boundary_sectors(lim, is_atomic); max_sectors = blk_queue_get_max_sectors(rq); if (!boundary_sectors || @@ -806,6 +830,18 @@ static enum elv_merge blk_try_req_merge(struct request *req, return ELEVATOR_NO_MERGE; } +static bool blk_atomic_write_mergeable_rq_bio(struct request *rq, + struct bio *bio) +{ + return (rq->cmd_flags & REQ_ATOMIC) == (bio->bi_opf & REQ_ATOMIC); +} + +static bool blk_atomic_write_mergeable_rqs(struct request *rq, + struct request *next) +{ + return (rq->cmd_flags & REQ_ATOMIC) == (next->cmd_flags & REQ_ATOMIC); +} + /* * For non-mq, this has to be called with the request spinlock acquired. * For mq with scheduling, the appropriate queue wide lock should be held. @@ -829,6 +865,9 @@ static struct request *attempt_merge(struct request_queue *q, if (req->ioprio != next->ioprio) return NULL; + if (!blk_atomic_write_mergeable_rqs(req, next)) + return NULL; + /* * If we are allowed to merge, then append bio list * from next to rq and release next. merge_requests_fn @@ -960,6 +999,9 @@ bool blk_rq_merge_ok(struct request *rq, struct bio *bio) if (rq->ioprio != bio_prio(bio)) return false; + if (blk_atomic_write_mergeable_rq_bio(rq, bio) == false) + return false; + return true; } diff --git a/block/blk-settings.c b/block/blk-settings.c index 996f247fc98e..140e13616462 100644 --- a/block/blk-settings.c +++ b/block/blk-settings.c @@ -97,6 +97,79 @@ static int blk_validate_zoned_limits(struct queue_limits *lim) return 0; } +/* + * Returns max guaranteed bytes which we can fit in a bio. 
+ * + * We request that an atomic_write is ITER_UBUF iov_iter (so a single vector), + * so we assume that we can fit in at least PAGE_SIZE in a segment, apart from + * the first and last segments. + */ +static +unsigned int blk_queue_max_guaranteed_bio(struct queue_limits *lim) +{ + unsigned int max_segments = min(BIO_MAX_VECS, lim->max_segments); + unsigned int length; + + length = min(max_segments, 2) * lim->logical_block_size; + if (max_segments > 2) + length += (max_segments - 2) * PAGE_SIZE; + + return length; +} + +static void blk_atomic_writes_update_limits(struct queue_limits *lim) +{ + unsigned int unit_limit = min(lim->max_hw_sectors << SECTOR_SHIFT, + blk_queue_max_guaranteed_bio(lim)); + + unit_limit = rounddown_pow_of_two(unit_limit); + + lim->atomic_write_max_sectors = + min(lim->atomic_write_hw_max >> SECTOR_SHIFT, + lim->max_hw_sectors); + lim->atomic_write_unit_min = + min(lim->atomic_write_hw_unit_min, unit_limit); + lim->atomic_write_unit_max = + min(lim->atomic_write_hw_unit_max, unit_limit); + lim->atomic_write_boundary_sectors = + lim->atomic_write_hw_boundary >> SECTOR_SHIFT; +} + +static void blk_validate_atomic_write_limits(struct queue_limits *lim) +{ + unsigned int boundary_sectors_hw; + + if (!lim->atomic_write_hw_max) + goto unsupported; + + boundary_sectors_hw = lim->atomic_write_hw_boundary >> SECTOR_SHIFT; + + if (boundary_sectors_hw) { + /* It doesn't make sense to allow different non-zero values */ + if (lim->chunk_sectors && + lim->chunk_sectors != boundary_sectors_hw) + goto unsupported; + + /* The boundary size just needs to be a multiple of unit_max + * (and not necessarily a power-of-2), so this following check + * could be relaxed in future. + * Furthermore, if needed, unit_max could be reduced so that + * it is compliant with a !power-of-2 boundary. 
+ */ + if (!is_power_of_2(lim->atomic_write_hw_boundary)) + goto unsupported; + } + + blk_atomic_writes_update_limits(lim); + return; + +unsupported: + lim->atomic_write_max_sectors = 0; + lim->atomic_write_boundary_sectors = 0; + lim->atomic_write_unit_min = 0; + lim->atomic_write_unit_max = 0; +} + /* * Check that the limits in lim are valid, initialize defaults for unset * values, and cap values based on others where needed. @@ -230,6 +303,8 @@ static int blk_validate_limits(struct queue_limits *lim) lim->misaligned = 0; } + blk_validate_atomic_write_limits(lim); + return blk_validate_zoned_limits(lim); } diff --git a/block/blk-sysfs.c b/block/blk-sysfs.c index f0f9314ab65c..42fbbaa52ccf 100644 --- a/block/blk-sysfs.c +++ b/block/blk-sysfs.c @@ -118,6 +118,30 @@ static ssize_t queue_max_discard_segments_show(struct request_queue *q, return queue_var_show(queue_max_discard_segments(q), page); } +static ssize_t queue_atomic_write_max_bytes_show(struct request_queue *q, + char *page) +{ + return queue_var_show(queue_atomic_write_max_bytes(q), page); +} + +static ssize_t queue_atomic_write_boundary_show(struct request_queue *q, + char *page) +{ + return queue_var_show(queue_atomic_write_boundary_bytes(q), page); +} + +static ssize_t queue_atomic_write_unit_min_show(struct request_queue *q, + char *page) +{ + return queue_var_show(queue_atomic_write_unit_min_bytes(q), page); +} + +static ssize_t queue_atomic_write_unit_max_show(struct request_queue *q, + char *page) +{ + return queue_var_show(queue_atomic_write_unit_max_bytes(q), page); +} + static ssize_t queue_max_integrity_segments_show(struct request_queue *q, char *page) { return queue_var_show(q->limits.max_integrity_segments, page); @@ -495,6 +519,11 @@ QUEUE_RO_ENTRY(queue_discard_max_hw, "discard_max_hw_bytes"); QUEUE_RW_ENTRY(queue_discard_max, "discard_max_bytes"); QUEUE_RO_ENTRY(queue_discard_zeroes_data, "discard_zeroes_data"); +QUEUE_RO_ENTRY(queue_atomic_write_max_bytes, "atomic_write_max_bytes"); 
+QUEUE_RO_ENTRY(queue_atomic_write_boundary, "atomic_write_boundary_bytes"); +QUEUE_RO_ENTRY(queue_atomic_write_unit_max, "atomic_write_unit_max_bytes"); +QUEUE_RO_ENTRY(queue_atomic_write_unit_min, "atomic_write_unit_min_bytes"); + QUEUE_RO_ENTRY(queue_write_same_max, "write_same_max_bytes"); QUEUE_RO_ENTRY(queue_write_zeroes_max, "write_zeroes_max_bytes"); QUEUE_RO_ENTRY(queue_zone_append_max, "zone_append_max_bytes"); @@ -618,6 +647,10 @@ static struct attribute *queue_attrs[] = { &queue_discard_max_entry.attr, &queue_discard_max_hw_entry.attr, &queue_discard_zeroes_data_entry.attr, + &queue_atomic_write_max_bytes_entry.attr, + &queue_atomic_write_boundary_entry.attr, + &queue_atomic_write_unit_min_entry.attr, + &queue_atomic_write_unit_max_entry.attr, &queue_write_same_max_entry.attr, &queue_write_zeroes_max_entry.attr, &queue_zone_append_max_entry.attr, diff --git a/block/blk.h b/block/blk.h index 75c1683fc320..b2fa42657f62 100644 --- a/block/blk.h +++ b/block/blk.h @@ -193,6 +193,9 @@ static inline unsigned int blk_queue_get_max_sectors(struct request *rq) if (unlikely(op == REQ_OP_WRITE_ZEROES)) return q->limits.max_write_zeroes_sectors; + if (rq->cmd_flags & REQ_ATOMIC) + return q->limits.atomic_write_max_sectors; + return q->limits.max_sectors; } diff --git a/include/linux/blk_types.h b/include/linux/blk_types.h index 781c4500491b..632edd71f8c6 100644 --- a/include/linux/blk_types.h +++ b/include/linux/blk_types.h @@ -162,6 +162,11 @@ typedef u16 blk_short_t; */ #define BLK_STS_DURATION_LIMIT ((__force blk_status_t)17) +/* + * Invalid size or alignment. 
+ */
+#define BLK_STS_INVAL	((__force blk_status_t)19)
+
 /**
  * blk_path_error - returns true if error may be path related
  * @error: status the request was completed with
@@ -370,7 +375,7 @@ enum req_flag_bits {
 	__REQ_SWAP,		/* swap I/O */
 	__REQ_DRV,		/* for driver use */
 	__REQ_FS_PRIVATE,	/* for file system (submitter) use */
-
+	__REQ_ATOMIC,		/* for atomic write operations */
 	/*
 	 * Command specific flags, keep last:
 	 */
@@ -402,6 +407,7 @@ enum req_flag_bits {
 #define REQ_SWAP	(__force blk_opf_t)(1ULL << __REQ_SWAP)
 #define REQ_DRV		(__force blk_opf_t)(1ULL << __REQ_DRV)
 #define REQ_FS_PRIVATE	(__force blk_opf_t)(1ULL << __REQ_FS_PRIVATE)
+#define REQ_ATOMIC	(__force blk_opf_t)(1ULL << __REQ_ATOMIC)
 
 #define REQ_NOUNMAP	(__force blk_opf_t)(1ULL << __REQ_NOUNMAP)
 
diff --git a/include/linux/blkdev.h b/include/linux/blkdev.h
index ddff90766f9f..930debeba3f0 100644
--- a/include/linux/blkdev.h
+++ b/include/linux/blkdev.h
@@ -310,6 +310,16 @@ struct queue_limits {
 	unsigned int		discard_alignment;
 	unsigned int		zone_write_granularity;
 
+	/* atomic write limits */
+	unsigned int		atomic_write_hw_max;
+	unsigned int		atomic_write_max_sectors;
+	unsigned int		atomic_write_hw_boundary;
+	unsigned int		atomic_write_boundary_sectors;
+	unsigned int		atomic_write_hw_unit_min;
+	unsigned int		atomic_write_unit_min;
+	unsigned int		atomic_write_hw_unit_max;
+	unsigned int		atomic_write_unit_max;
+
 	unsigned short		max_segments;
 	unsigned short		max_integrity_segments;
 	unsigned short		max_discard_segments;
@@ -1355,6 +1365,30 @@ static inline int queue_dma_alignment(const struct request_queue *q)
 	return q ? q->limits.dma_alignment : 511;
 }
 
+static inline unsigned int
+queue_atomic_write_unit_max_bytes(const struct request_queue *q)
+{
+	return q->limits.atomic_write_unit_max;
+}
+
+static inline unsigned int
+queue_atomic_write_unit_min_bytes(const struct request_queue *q)
+{
+	return q->limits.atomic_write_unit_min;
+}
+
+static inline unsigned int
+queue_atomic_write_boundary_bytes(const struct request_queue *q)
+{
+	return q->limits.atomic_write_boundary_sectors << SECTOR_SHIFT;
+}
+
+static inline unsigned int
+queue_atomic_write_max_bytes(const struct request_queue *q)
+{
+	return q->limits.atomic_write_max_sectors << SECTOR_SHIFT;
+}
+
 static inline unsigned int bdev_dma_alignment(struct block_device *bdev)
 {
 	return queue_dma_alignment(bdev_get_queue(bdev));
@@ -1596,6 +1630,27 @@ struct io_comp_batch {
 	void (*complete)(struct io_comp_batch *);
 };
 
+static inline bool bdev_can_atomic_write(struct block_device *bdev)
+{
+	struct request_queue *bd_queue = bdev->bd_queue;
+	struct queue_limits *limits = &bd_queue->limits;
+
+	if (!limits->atomic_write_unit_min)
+		return false;
+
+	if (bdev_is_partition(bdev)) {
+		sector_t bd_start_sect = bdev->bd_start_sect;
+		unsigned int alignment =
+			max(limits->atomic_write_unit_min,
+			    limits->atomic_write_hw_boundary);
+
+		if (!IS_ALIGNED(bd_start_sect, alignment >> SECTOR_SHIFT))
+			return false;
+	}
+
+	return true;
+}
+
 #define DEFINE_IO_COMP_BATCH(name)	struct io_comp_batch name = { }
 
 #endif /* _LINUX_BLKDEV_H */

From patchwork Mon Jun 10 10:43:29 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
X-Patchwork-Submitter: John Garry
X-Patchwork-Id: 804097
From: John Garry
To: axboe@kernel.dk, kbusch@kernel.org, hch@lst.de, sagi@grimberg.me,
 jejb@linux.ibm.com, martin.petersen@oracle.com, viro@zeniv.linux.org.uk,
 brauner@kernel.org, dchinner@redhat.com, jack@suse.cz
Cc: djwong@kernel.org, linux-block@vger.kernel.org,
 linux-kernel@vger.kernel.org, linux-nvme@lists.infradead.org,
 linux-fsdevel@vger.kernel.org, tytso@mit.edu, jbongio@google.com,
 linux-scsi@vger.kernel.org, ojaswin@linux.ibm.com, linux-aio@kvack.org,
 linux-btrfs@vger.kernel.org, io-uring@vger.kernel.org, nilay@linux.ibm.com,
 ritesh.list@gmail.com, willy@infradead.org, agk@redhat.com,
 snitzer@kernel.org, mpatocka@redhat.com, dm-devel@lists.linux.dev,
 hare@suse.de, Alan Adamson, John Garry
Subject: [PATCH v8 10/10] nvme: Atomic write support
Date: Mon, 10 Jun 2024 10:43:29 +0000
Message-Id: <20240610104329.3555488-11-john.g.garry@oracle.com>
X-Mailer: git-send-email 2.31.1
In-Reply-To: <20240610104329.3555488-1-john.g.garry@oracle.com>
References: <20240610104329.3555488-1-john.g.garry@oracle.com>
Precedence: bulk
X-Mailing-List: linux-scsi@vger.kernel.org
MIME-Version: 1.0

From: Alan Adamson

Add support to set block layer request_queue atomic write limits. The
limits will be derived from either the namespace or controller atomic
parameters.

NVMe atomic-related parameters are grouped into "normal" and "power-fail"
(or PF) classes of parameter. For atomic write support, only PF parameters
are of interest. The "normal" parameters are concerned with racing reads
and writes (which also applies to PF). See NVM Command Set Specification
Revision 1.0d section 2.1.4 for reference.

Whether to use per namespace or controller atomic parameters is decided by
NSFEAT bit 1 - see Figure 97: Identify – Identify Namespace Data
Structure, NVM Command Set.

NVMe namespaces may define an atomic boundary, whereby no atomic guarantees
are provided for a write which straddles this per-lba space boundary.
The block layer merging policy is such that no merges may occur in which
the resultant request would straddle such a boundary.

Unlike SCSI, NVMe specifies no granularity or alignment rules, apart from
the atomic boundary rule. In addition, again unlike SCSI, there is no
dedicated atomic write command - a write which adheres to the atomic size
limit and boundary is implicitly atomic.

If NSFEAT bit 1 is set, the following parameters are of interest:
- NAWUPF (Namespace Atomic Write Unit Power Fail)
- NABSPF (Namespace Atomic Boundary Size Power Fail)
- NABO (Namespace Atomic Boundary Offset)

and we set request_queue limits as follows:
- atomic_write_unit_max = rounddown_pow_of_two(NAWUPF)
- atomic_write_max_bytes = NAWUPF
- atomic_write_boundary = NABSPF

In the unlikely scenario that NABO is non-zero, atomic writes will not be
supported at all, as dealing with this adds extra complexity. This policy
may change in future.

In all cases, atomic_write_unit_min is set to the logical block size.

If NSFEAT bit 1 is unset, the following parameter is of interest:
- AWUPF (Atomic Write Unit Power Fail)

and we set request_queue limits as follows:
- atomic_write_unit_max = rounddown_pow_of_two(AWUPF)
- atomic_write_max_bytes = AWUPF
- atomic_write_boundary = 0

A new function, nvme_valid_atomic_write(), is also called from the
submission path to verify that a request which has been submitted to the
driver will actually be executed atomically. As mentioned, there is no
dedicated NVMe atomic write command (which may error for a command which
exceeds the controller atomic write limits).

Note on NABSPF:
There seems to be some vagueness in the spec as to whether NABSPF applies
for NSFEAT bit 1 being unset. Figure 97 does not explicitly mention NABSPF
and how it is affected by bit 1. However, Figure 4 does say to check
Figure 97 for info about per-namespace parameters, which NABSPF is, so it
is implied.
However, nvme_update_disk_info() currently checks the namespace parameter
NABO regardless of this bit.

Signed-off-by: Alan Adamson
Reviewed-by: Keith Busch
jpg: total rewrite
Signed-off-by: John Garry
---
 drivers/nvme/host/core.c | 49 ++++++++++++++++++++++++++++++++++++++++
 1 file changed, 49 insertions(+)

diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c
index f5d150c62955..91001892f60b 100644
--- a/drivers/nvme/host/core.c
+++ b/drivers/nvme/host/core.c
@@ -927,6 +927,30 @@ static inline blk_status_t nvme_setup_write_zeroes(struct nvme_ns *ns,
 	return BLK_STS_OK;
 }
 
+static bool nvme_valid_atomic_write(struct request *req)
+{
+	struct request_queue *q = req->q;
+	u32 boundary_bytes = queue_atomic_write_boundary_bytes(q);
+
+	if (blk_rq_bytes(req) > queue_atomic_write_unit_max_bytes(q))
+		return false;
+
+	if (boundary_bytes) {
+		u64 mask = boundary_bytes - 1, imask = ~mask;
+		u64 start = blk_rq_pos(req) << SECTOR_SHIFT;
+		u64 end = start + blk_rq_bytes(req) - 1;
+
+		/* If greater then must be crossing a boundary */
+		if (blk_rq_bytes(req) > boundary_bytes)
+			return false;
+
+		if ((start & imask) != (end & imask))
+			return false;
+	}
+
+	return true;
+}
+
 static inline blk_status_t nvme_setup_rw(struct nvme_ns *ns,
 		struct request *req, struct nvme_command *cmnd,
 		enum nvme_opcode op)
@@ -941,6 +965,12 @@ static inline blk_status_t nvme_setup_rw(struct nvme_ns *ns,
 	if (req->cmd_flags & REQ_RAHEAD)
 		dsmgmt |= NVME_RW_DSM_FREQ_PREFETCH;
 
+	/*
+	 * Ensure that nothing has been sent which cannot be executed
+	 * atomically.
+	 */
+	if (req->cmd_flags & REQ_ATOMIC && !nvme_valid_atomic_write(req))
+		return BLK_STS_INVAL;
 
 	cmnd->rw.opcode = op;
 	cmnd->rw.flags = 0;
@@ -1921,6 +1951,23 @@ static void nvme_configure_metadata(struct nvme_ctrl *ctrl,
 	}
 }
 
+
+static void nvme_update_atomic_write_disk_info(struct nvme_ns *ns,
+			struct nvme_id_ns *id, struct queue_limits *lim,
+			u32 bs, u32 atomic_bs)
+{
+	unsigned int boundary = 0;
+
+	if (id->nsfeat & NVME_NS_FEAT_ATOMICS && id->nawupf) {
+		if (le16_to_cpu(id->nabspf))
+			boundary = (le16_to_cpu(id->nabspf) + 1) * bs;
+	}
+	lim->atomic_write_hw_max = atomic_bs;
+	lim->atomic_write_hw_boundary = boundary;
+	lim->atomic_write_hw_unit_min = bs;
+	lim->atomic_write_hw_unit_max = rounddown_pow_of_two(atomic_bs);
+}
+
 static u32 nvme_max_drv_segments(struct nvme_ctrl *ctrl)
 {
 	return ctrl->max_hw_sectors / (NVME_CTRL_PAGE_SIZE >> SECTOR_SHIFT) + 1;
@@ -1967,6 +2014,8 @@ static bool nvme_update_disk_info(struct nvme_ns *ns, struct nvme_id_ns *id,
 			atomic_bs = (1 + le16_to_cpu(id->nawupf)) * bs;
 		else
 			atomic_bs = (1 + ns->ctrl->subsys->awupf) * bs;
+
+		nvme_update_atomic_write_disk_info(ns, id, lim, bs, atomic_bs);
 	}
 
 	if (id->nsfeat & NVME_NS_FEAT_IO_OPT) {
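
Editorial note, not part of the patch: the limit derivation and the submission-time check described in the commit message can be tried out in userspace with a minimal sketch like the one below. It is an illustration under stated assumptions only. The names atomic_hw_limits, rounddown_pow2(), derive_limits() and valid_atomic_write() are hypothetical stand-ins, not kernel API, and the NVMe fields are passed in as plain 0's based integers rather than read from an Identify structure. The boundary is assumed to be a power of two, which is what blk_validate_atomic_write_limits() requires before it enables atomic writes.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define SECTOR_SHIFT 9	/* 512-byte sectors, as in the block layer */

/* Hypothetical stand-in for the four hardware limits the patch fills in. */
struct atomic_hw_limits {
	uint32_t hw_max;	/* bytes */
	uint32_t hw_boundary;	/* bytes, 0 means no boundary */
	uint32_t hw_unit_min;	/* bytes */
	uint32_t hw_unit_max;	/* bytes, always a power of two */
};

/* Largest power of two that is <= v (v must be non-zero). */
static uint32_t rounddown_pow2(uint32_t v)
{
	assert(v != 0);
	while (v & (v - 1))
		v &= v - 1;	/* clear the lowest set bit */
	return v;
}

/*
 * Derive limits from 0's based NVMe fields, following the commit message:
 * NAWUPF/NABSPF when NSFEAT bit 1 is set and NAWUPF is non-zero, the
 * controller-wide AWUPF otherwise. bs is the logical block size in bytes.
 */
static struct atomic_hw_limits derive_limits(bool nsfeat_atomics,
		uint16_t nawupf, uint16_t nabspf, uint16_t awupf, uint32_t bs)
{
	struct atomic_hw_limits lim = { 0 };
	uint32_t atomic_bs;

	if (nsfeat_atomics && nawupf) {
		atomic_bs = (1u + nawupf) * bs;
		if (nabspf)
			lim.hw_boundary = (1u + nabspf) * bs;
	} else {
		atomic_bs = (1u + awupf) * bs;
	}

	lim.hw_max = atomic_bs;
	lim.hw_unit_min = bs;
	lim.hw_unit_max = rounddown_pow2(atomic_bs);
	return lim;
}

/*
 * Same test as nvme_valid_atomic_write(): the write must fit in unit_max
 * and its first and last byte must fall in the same boundary window.
 */
static bool valid_atomic_write(uint64_t start_sector, uint32_t bytes,
		uint32_t unit_max_bytes, uint32_t boundary_bytes)
{
	if (bytes > unit_max_bytes)
		return false;

	if (boundary_bytes) {
		/* Assumption: boundary is a power of two, so masking works. */
		assert((boundary_bytes & (boundary_bytes - 1)) == 0);

		uint64_t imask = ~((uint64_t)boundary_bytes - 1);
		uint64_t start = start_sector << SECTOR_SHIFT;
		uint64_t end = start + bytes - 1;

		if (bytes > boundary_bytes)
			return false;
		if ((start & imask) != (end & imask))
			return false;
	}

	return true;
}
```

For example, with 512-byte logical blocks, NAWUPF = 23 and NABSPF = 31 (both 0's based), the sketch yields a 12288-byte maximum, an 8192-byte unit_max and a 16384-byte boundary. An 8192-byte write starting at sector 24 (byte 12288) ends at byte 20479, straddles the 16384-byte boundary, and is rejected, while the same write at sector 0 is accepted.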