From patchwork Mon Dec 14 08:23:40 2015
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: "\(Exiting\) Baolin Wang" <baolin.wang@linaro.org>
X-Patchwork-Id: 58324
Delivered-To: patch@linaro.org
Received: by 10.112.73.68 with SMTP id j4csp1319517lbv;
 Mon, 14 Dec 2015 00:24:56 -0800 (PST)
X-Received: by 10.66.228.225 with SMTP id sl1mr44024114pac.63.1450081494098; 
 Mon, 14 Dec 2015 00:24:54 -0800 (PST)
Return-Path: <linux-kernel-owner@vger.kernel.org>
Received: from vger.kernel.org (vger.kernel.org. [209.132.180.67])
 by mx.google.com with ESMTP id 82si7260280pfq.220.2015.12.14.00.24.53;
 Mon, 14 Dec 2015 00:24:54 -0800 (PST)
Received-SPF: pass (google.com: best guess record for domain of
 linux-kernel-owner@vger.kernel.org designates 209.132.180.67
 as permitted sender) client-ip=209.132.180.67; 
Authentication-Results: mx.google.com;
 spf=pass (google.com: best guess record for domain of
 linux-kernel-owner@vger.kernel.org designates 209.132.180.67
 as permitted sender)
 smtp.mailfrom=linux-kernel-owner@vger.kernel.org; 
 dkim=neutral (body hash did not verify)
 header.i=@linaro-org.20150623.gappssmtp.com
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
 id S1753010AbbLNIYw (ORCPT <rfc822;pingbo.wen@linaro.org>
 + 28 others); Mon, 14 Dec 2015 03:24:52 -0500
Received: from mail-pf0-f170.google.com ([209.85.192.170]:34696 "EHLO
 mail-pf0-f170.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
 with ESMTP id S1752922AbbLNIYs (ORCPT
 <rfc822;linux-kernel@vger.kernel.org>);
 Mon, 14 Dec 2015 03:24:48 -0500
Received: by pfbo64 with SMTP id o64so23313188pfb.1
 for <linux-kernel@vger.kernel.org>;
 Mon, 14 Dec 2015 00:24:48 -0800 (PST)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed;
 d=linaro-org.20150623.gappssmtp.com; s=20150623;
 h=from:to:cc:subject:date:message-id:in-reply-to:references
 :in-reply-to:references;
 bh=cNzTl4kfvbwj90B1Qdx7NMWRNjpcToF9QELxcXBJy1g=;
 b=rLzjQb/prOkesLZLFnsya8GE6G9rC4NhVohnCObZ/O5pgXOMvcjOoLWfCEmQw3JWgD
 rrhxYgUPvN7z41lhNSx4rbTTas7KZs1G4tFiZDUnpB5SyZNFxoLAmajJ2S7CHrk2h/p+
 Mml098+BPAmaHK8114NXk7tpNdGfDaT/1OG+9oKE7ieYGA24ZUZ1i/iXquKlDL/0/aT3
 9GEuguTJGKqL2nkdDEM0PBTanA50BNf+exl4Fdhu1FjfuHO5XSwJjsgbnQxzojXIy/88
 Va2ckD9wuvottD7gpKv7nKFLnnwFRDz8vI80JmpvNOy79ztbmi3zdXtHLRRazxbjW648
 6M6w==
X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed;
 d=1e100.net; s=20130820;
 h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to
 :references:in-reply-to:references;
 bh=cNzTl4kfvbwj90B1Qdx7NMWRNjpcToF9QELxcXBJy1g=;
 b=iOU/dFphdP1eNsnHS78AtkEo0KebQF07jMLJww1iGSoZW7pjOtde8vLnm62HKq6ZUS
 hZN7V+1sTkZOL0E79FB+CDWUG7Z6osfleh/thVsxEUwVguB24t43Ckrw9TPaGeWPxG38
 3ZyZZkNvpyjSEY6T8itWfOInzPR0wm1Onwk2RXwIsOUM07LfD9rWKFh468xKI7FKyhnd
 lAUgV/JmHXlb1/vV5KyX5SHDHggoHlk8weh6GYyV0CvafqTGtteZ5n0syMTpUYd4TQHW
 HmNd1cU2aKgGZx5V6/NtEBuivLGuPFjv26SSVk2a1e+KPBDGKR5F8pvBFTPd4RZ/bgMc
 CDBQ==
X-Gm-Message-State: ALoCoQnRwFy9OiHBAXu9EhNgyD+r7XOIt2FYDjKffnmfxd4ncaHienPyEWs1t9P6mQf1saiViaumg4OPSB1e5qpEBXEL3TV2gg==
X-Received: by 10.98.7.91 with SMTP id b88mr34301466pfd.48.1450081488446;
 Mon, 14 Dec 2015 00:24:48 -0800 (PST)
Received: from baolinwangubtpc.spreadtrum.com ([175.111.195.49])
 by smtp.gmail.com with ESMTPSA id
 c1sm41056675pas.1.2015.12.14.00.24.43
 (version=TLS1 cipher=ECDHE-RSA-AES128-SHA bits=128/128);
 Mon, 14 Dec 2015 00:24:47 -0800 (PST)
From: Baolin Wang <baolin.wang@linaro.org>
To: axboe@kernel.dk, agk@redhat.com, snitzer@redhat.com, dm-devel@redhat.com
Cc: neilb@suse.com, dan.j.williams@intel.com,
 martin.petersen@oracle.com, sagig@mellanox.com,
 kent.overstreet@gmail.com, keith.busch@intel.com, tj@kernel.org,
 broonie@kernel.org, arnd@arndb.de, linux-block@vger.kernel.org,
 linux-raid@vger.kernel.org, linux-kernel@vger.kernel.org,
 baolin.wang@linaro.org
Subject: [PATCH 2/2] md: dm-crypt: Optimize the dm-crypt for XTS mode
Date: Mon, 14 Dec 2015 16:23:40 +0800
Message-Id: <02be0f42bf2d3c3d27b43bc050a783582b7af733.1450080755.git.baolin.wang@linaro.org>
X-Mailer: git-send-email 1.7.9.5
In-Reply-To: <cover.1450080755.git.baolin.wang@linaro.org>
References: <cover.1450080755.git.baolin.wang@linaro.org>
Sender: linux-kernel-owner@vger.kernel.org
Precedence: bulk
List-ID: <linux-kernel.vger.kernel.org>
X-Mailing-List: linux-kernel@vger.kernel.org

The current dm-crypt code is inefficient in XTS mode because it maps a bio
with only one scatterlist at a time. Instead, we can use multiple scatterlists
to map the whole bio and send all of them to the crypto engine in a single
encrypt or decrypt request, which improves the hardware engine's efficiency.

With this optimization, on my test setup (BeagleBone Black board) using 64KB
I/Os on an eMMC storage device, I saw about a 60% improvement in throughput
for encrypted writes and about a 100% improvement for encrypted reads. This
approach does not suit other modes, which need a different IV for each sector.

Signed-off-by: Baolin Wang <baolin.wang@linaro.org>
---
 drivers/md/dm-crypt.c |  315 ++++++++++++++++++++++++++++++++++++++++++++++++-
 1 file changed, 309 insertions(+), 6 deletions(-)
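
Note for readers (illustration only, not part of the patch): the counting
pass in crypt_sg_entry() can be modelled in userspace roughly as below.
This is a minimal sketch under stated assumptions: struct chunk and
count_sg_entries() are hypothetical names, a chunk stands in for a bio_vec,
and only physical contiguity and the maximum segment size are checked. The
kernel version also honours the queue's segment boundary mask and the
cluster flag.

```c
#include <stddef.h>

/* Hypothetical stand-in for a bio_vec: one physically addressed chunk. */
struct chunk {
	unsigned long phys;	/* start physical address */
	unsigned int len;	/* length in bytes */
};

/*
 * Count the scatterlist entries needed for n chunks. A chunk is merged
 * into the current entry only if it starts exactly where the previous
 * chunk ended and the merged length stays within max_seg.
 */
static unsigned int count_sg_entries(const struct chunk *c, size_t n,
				     unsigned int max_seg)
{
	unsigned int cnt = 0, cur_len = 0;
	size_t i;

	for (i = 0; i < n; i++) {
		int contiguous = i > 0 &&
			c[i - 1].phys + c[i - 1].len == c[i].phys;

		if (!contiguous || cur_len + c[i].len > max_seg) {
			cnt++;			/* start a new entry */
			cur_len = c[i].len;
		} else {
			cur_len += c[i].len;	/* merge into current entry */
		}
	}

	return cnt;
}
```

For three physically contiguous 4 KiB chunks followed by one isolated
chunk, this model returns 2 with a 64 KiB segment limit and 3 with an
8 KiB limit. The patch adds one to the count before sizing the table.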

-- 
1.7.9.5

diff --git a/drivers/md/dm-crypt.c b/drivers/md/dm-crypt.c
index 917d47e..9f6f131 100644
--- a/drivers/md/dm-crypt.c
+++ b/drivers/md/dm-crypt.c
@@ -32,6 +32,7 @@
 #include <linux/device-mapper.h>
 
 #define DM_MSG_PREFIX "crypt"
+#define DM_MAX_SG_LIST	1024
 
 /*
  * context holding the current state of a multi-part conversion
@@ -68,6 +69,8 @@ struct dm_crypt_request {
 	struct convert_context *ctx;
 	struct scatterlist sg_in;
 	struct scatterlist sg_out;
+	struct sg_table sgt_in;
+	struct sg_table sgt_out;
 	sector_t iv_sector;
 };
 
@@ -140,6 +143,7 @@ struct crypt_config {
 	char *cipher;
 	char *cipher_string;
 
+	int bulk_crypto;
 	struct crypt_iv_operations *iv_gen_ops;
 	union {
 		struct iv_essiv_private essiv;
@@ -833,6 +837,11 @@ static u8 *iv_of_dmreq(struct crypt_config *cc,
 		crypto_ablkcipher_alignmask(any_tfm(cc)) + 1);
 }
 
+static int crypt_is_bulk_mode(struct crypt_config *cc)
+{
+	return cc->bulk_crypto;
+}
+
 static int crypt_convert_block(struct crypt_config *cc,
 			       struct convert_context *ctx,
 			       struct ablkcipher_request *req)
@@ -881,24 +890,40 @@ static int crypt_convert_block(struct crypt_config *cc,
 
 static void kcryptd_async_done(struct crypto_async_request *async_req,
 			       int error);
+static void kcryptd_async_all_done(struct crypto_async_request *async_req,
+				   int error);
 
 static void crypt_alloc_req(struct crypt_config *cc,
 			    struct convert_context *ctx)
 {
 	unsigned key_index = ctx->cc_sector & (cc->tfms_count - 1);
+	struct dm_crypt_request *dmreq;
 
 	if (!ctx->req)
 		ctx->req = mempool_alloc(cc->req_pool, GFP_NOIO);
 
+	dmreq = dmreq_of_req(cc, ctx->req);
+	dmreq->sgt_in.orig_nents = 0;
+	dmreq->sgt_out.orig_nents = 0;
+
 	ablkcipher_request_set_tfm(ctx->req, cc->tfms[key_index]);
 
 	/*
 	 * Use REQ_MAY_BACKLOG so a cipher driver internally backlogs
 	 * requests if driver request queue is full.
 	 */
-	ablkcipher_request_set_callback(ctx->req,
-	    CRYPTO_TFM_REQ_MAY_BACKLOG | CRYPTO_TFM_REQ_MAY_SLEEP,
-	    kcryptd_async_done, dmreq_of_req(cc, ctx->req));
+	if (crypt_is_bulk_mode(cc))
+		ablkcipher_request_set_callback(ctx->req,
+						CRYPTO_TFM_REQ_MAY_BACKLOG
+						| CRYPTO_TFM_REQ_MAY_SLEEP,
+						kcryptd_async_all_done,
+						dmreq_of_req(cc, ctx->req));
+	else
+		ablkcipher_request_set_callback(ctx->req,
+						CRYPTO_TFM_REQ_MAY_BACKLOG
+						| CRYPTO_TFM_REQ_MAY_SLEEP,
+						kcryptd_async_done,
+						dmreq_of_req(cc, ctx->req));
 }
 
 static void crypt_free_req(struct crypt_config *cc,
@@ -911,6 +936,221 @@ static void crypt_free_req(struct crypt_config *cc,
 }
 
 /*
+ * Work out in advance how many scatterlist entries are needed to map
+ * one bio.
+ */
+static unsigned int crypt_sg_entry(struct bio *bio_t)
+{
+	struct request_queue *q = bdev_get_queue(bio_t->bi_bdev);
+	int cluster = blk_queue_cluster(q);
+	struct bio_vec bvec, bvprv = { NULL };
+	struct bvec_iter biter;
+	unsigned long nbytes = 0, sg_length = 0;
+	unsigned int sg_cnt = 0;
+
+	if (bio_t->bi_rw & REQ_DISCARD) {
+		if (bio_t->bi_vcnt)
+			return 1;
+		return 0;
+	}
+
+	if (bio_t->bi_rw & REQ_WRITE_SAME)
+		return 1;
+
+	bio_for_each_segment(bvec, bio_t, biter) {
+		nbytes = bvec.bv_len;
+
+		if (!cluster) {
+			sg_cnt++;
+			continue;
+		}
+
+		if (sg_length + nbytes > queue_max_segment_size(q)) {
+			sg_length = nbytes;
+			sg_cnt++;
+			goto next;
+		}
+
+		if (!BIOVEC_PHYS_MERGEABLE(&bvprv, &bvec)) {
+			sg_length = nbytes;
+			sg_cnt++;
+			goto next;
+		}
+
+		if (!BIOVEC_SEG_BOUNDARY(q, &bvprv, &bvec)) {
+			sg_length = nbytes;
+			sg_cnt++;
+			goto next;
+		}
+
+		sg_length += nbytes;
+next:
+		memcpy(&bvprv, &bvec, sizeof(struct bio_vec));
+	}
+
+	return sg_cnt;
+}
+
+static int crypt_convert_all_blocks(struct crypt_config *cc,
+				    struct convert_context *ctx,
+				    struct ablkcipher_request *req)
+{
+	struct dm_crypt_io *io =
+		container_of(ctx, struct dm_crypt_io, ctx);
+	struct dm_crypt_request *dmreq = dmreq_of_req(cc, req);
+	u8 *iv = iv_of_dmreq(cc, dmreq);
+	struct bio *orig_bio = io->base_bio;
+	struct bio *bio_in = ctx->bio_in;
+	struct bio *bio_out = ctx->bio_out;
+	unsigned int total_bytes = orig_bio->bi_iter.bi_size;
+	struct scatterlist *sg_in = NULL;
+	struct scatterlist *sg_out = NULL;
+	struct scatterlist *sg = NULL;
+	unsigned int total_sg_len_in = 0;
+	unsigned int total_sg_len_out = 0;
+	unsigned int sg_in_max = 0, sg_out_max = 0;
+	int ret;
+
+	dmreq->iv_sector = ctx->cc_sector;
+	dmreq->ctx = ctx;
+
+	/*
+	 * Calculate how many scatterlist entries are needed
+	 * for this bio.
+	 */
+	sg_in_max = crypt_sg_entry(bio_in) + 1;
+	if (sg_in_max > DM_MAX_SG_LIST || sg_in_max <= 0) {
+		DMERR("%s sg entry too large or none %d\n",
+		      __func__, sg_in_max);
+		return -EINVAL;
+	} else if (sg_in_max == 2) {
+		sg_in = &dmreq->sg_in;
+	}
+
+	if (!sg_in) {
+		ret = sg_alloc_table(&dmreq->sgt_in, sg_in_max, GFP_NOIO);
+		if (ret) {
+			DMERR("%s sg in allocation failed\n", __func__);
+			return -ENOMEM;
+		}
+
+		sg_in = dmreq->sgt_in.sgl;
+	}
+
+	total_sg_len_in = __blk_bios_map_sg(bdev_get_queue(bio_in->bi_bdev),
+					    bio_in, sg_in, &sg);
+	if ((total_sg_len_in <= 0)
+	    || (total_sg_len_in > sg_in_max)) {
+		DMERR("%s in sg map error %d, sg_in_max[%d]\n",
+		      __func__, total_sg_len_in, sg_in_max);
+		return -EINVAL;
+	}
+
+	if (sg)
+		sg_mark_end(sg);
+
+	ctx->iter_in.bi_size -= total_bytes;
+
+	if (bio_data_dir(orig_bio) == READ)
+		goto set_crypt;
+
+	sg_out_max = crypt_sg_entry(bio_out) + 1;
+	if (sg_out_max > DM_MAX_SG_LIST || sg_out_max <= 0) {
+		DMERR("%s sg entry too large or none %d\n",
+		      __func__, sg_out_max);
+		return -EINVAL;
+	} else if (sg_out_max == 2) {
+		sg_out = &dmreq->sg_out;
+	}
+
+	if (!sg_out) {
+		ret = sg_alloc_table(&dmreq->sgt_out, sg_out_max, GFP_NOIO);
+		if (ret) {
+			DMERR("%s sg out allocation failed\n", __func__);
+			return -ENOMEM;
+		}
+
+		sg_out = dmreq->sgt_out.sgl;
+	}
+
+	sg = NULL;
+	total_sg_len_out = __blk_bios_map_sg(bdev_get_queue(bio_out->bi_bdev),
+					     bio_out, sg_out, &sg);
+	if ((total_sg_len_out <= 0) ||
+	    (total_sg_len_out > sg_out_max)) {
+		DMERR("%s out sg map error %d, sg_out_max[%d]\n",
+		      __func__, total_sg_len_out, sg_out_max);
+		return -EINVAL;
+	}
+
+	if (sg)
+		sg_mark_end(sg);
+
+	ctx->iter_out.bi_size -= total_bytes;
+set_crypt:
+	if (cc->iv_gen_ops) {
+		ret = cc->iv_gen_ops->generator(cc, iv, dmreq);
+		if (ret < 0) {
+			DMERR("%s generator iv error %d\n", __func__, ret);
+			return ret;
+		}
+	}
+
+	if (bio_data_dir(orig_bio) == WRITE) {
+		ablkcipher_request_set_crypt(req, sg_in,
+					     sg_out, total_bytes, iv);
+
+		ret = crypto_ablkcipher_encrypt(req);
+	} else {
+		ablkcipher_request_set_crypt(req, sg_in,
+					     sg_in, total_bytes, iv);
+
+		ret = crypto_ablkcipher_decrypt(req);
+	}
+
+	if (!ret && cc->iv_gen_ops && cc->iv_gen_ops->post)
+		ret = cc->iv_gen_ops->post(cc, iv, dmreq);
+
+	return ret;
+}
+
+/*
+ * Encrypt / decrypt data from one whole bio at one time.
+ */
+static int crypt_convert_io(struct crypt_config *cc,
+			    struct convert_context *ctx)
+{
+	int r;
+
+	atomic_set(&ctx->cc_pending, 1);
+	crypt_alloc_req(cc, ctx);
+	atomic_inc(&ctx->cc_pending);
+
+	r = crypt_convert_all_blocks(cc, ctx, ctx->req);
+	switch (r) {
+	case -EBUSY:
+		/*
+		 * Make this bio synchronous by waiting for the
+		 * in-progress request as well.
+		 */
+	case -EINPROGRESS:
+		wait_for_completion(&ctx->restart);
+		ctx->req = NULL;
+		break;
+	case 0:
+		atomic_dec(&ctx->cc_pending);
+		cond_resched();
+		break;
+	/* There was an error while processing the request. */
+	default:
+		atomic_dec(&ctx->cc_pending);
+		return r;
+	}
+
+	return 0;
+}
+
+/*
  * Encrypt / decrypt data from one bio to another one (can be the same one)
  */
 static int crypt_convert(struct crypt_config *cc,
@@ -1070,12 +1310,18 @@ static void crypt_dec_pending(struct dm_crypt_io *io)
 	struct crypt_config *cc = io->cc;
 	struct bio *base_bio = io->base_bio;
 	int error = io->error;
+	struct dm_crypt_request *dmreq;
 
 	if (!atomic_dec_and_test(&io->io_pending))
 		return;
 
-	if (io->ctx.req)
+	if (io->ctx.req) {
+		dmreq = dmreq_of_req(cc, io->ctx.req);
+		sg_free_table(&dmreq->sgt_out);
+		sg_free_table(&dmreq->sgt_in);
+
 		crypt_free_req(cc, io->ctx.req, base_bio);
+	}
 
 	base_bio->bi_error = error;
 	bio_endio(base_bio);
@@ -1312,7 +1558,11 @@ static void kcryptd_crypt_write_convert(struct dm_crypt_io *io)
 	sector += bio_sectors(clone);
 
 	crypt_inc_pending(io);
-	r = crypt_convert(cc, &io->ctx);
+	if (crypt_is_bulk_mode(cc))
+		r = crypt_convert_io(cc, &io->ctx);
+	else
+		r = crypt_convert(cc, &io->ctx);
+
 	if (r)
 		io->error = -EIO;
 	crypt_finished = atomic_dec_and_test(&io->ctx.cc_pending);
@@ -1342,7 +1592,11 @@ static void kcryptd_crypt_read_convert(struct dm_crypt_io *io)
 	crypt_convert_init(cc, &io->ctx, io->base_bio, io->base_bio,
 			   io->sector);
 
-	r = crypt_convert(cc, &io->ctx);
+	if (crypt_is_bulk_mode(cc))
+		r = crypt_convert_io(cc, &io->ctx);
+	else
+		r = crypt_convert(cc, &io->ctx);
+
 	if (r < 0)
 		io->error = -EIO;
 
@@ -1387,6 +1641,40 @@ static void kcryptd_async_done(struct crypto_async_request *async_req,
 		kcryptd_crypt_write_io_submit(io, 1);
 }
 
+static void kcryptd_async_all_done(struct crypto_async_request *async_req,
+				   int error)
+{
+	struct dm_crypt_request *dmreq = async_req->data;
+	struct convert_context *ctx = dmreq->ctx;
+	struct dm_crypt_io *io = container_of(ctx, struct dm_crypt_io, ctx);
+	struct crypt_config *cc = io->cc;
+
+	if (error == -EINPROGRESS)
+		return;
+
+	if (!error && cc->iv_gen_ops && cc->iv_gen_ops->post)
+		error = cc->iv_gen_ops->post(cc, iv_of_dmreq(cc, dmreq), dmreq);
+
+	if (error < 0)
+		io->error = error;
+
+	sg_free_table(&dmreq->sgt_out);
+	sg_free_table(&dmreq->sgt_in);
+
+	crypt_free_req(cc, req_of_dmreq(cc, dmreq), io->base_bio);
+
+	if (!atomic_dec_and_test(&ctx->cc_pending)) {
+		complete(&io->ctx.restart);
+		return;
+	}
+
+	complete(&io->ctx.restart);
+	if (bio_data_dir(io->base_bio) == READ)
+		kcryptd_crypt_read_done(io);
+	else
+		kcryptd_crypt_write_io_submit(io, 1);
+}
+
 static void kcryptd_crypt(struct work_struct *work)
 {
 	struct dm_crypt_io *io = container_of(work, struct dm_crypt_io, work);
@@ -1633,6 +1921,21 @@ static int crypt_ctr_cipher(struct dm_target *ti,
 		goto bad_mem;
 	}
 
+	/*
+	 * Check whether this mode can be encrypted or decrypted in bulk,
+	 * which is the case for modes that need no IV or only a single
+	 * initial IV. In bulk mode we can expand the scatterlist entries
+	 * to map the whole bio and send all of them to the hardware
+	 * engine at once, which improves the crypto engine's efficiency.
+	 * Other encryption modes cannot use this path: they have to
+	 * encrypt and decrypt sector by sector because every sector has
+	 * a different IV.
+	 */
+	if (!strcmp(chainmode, "ecb") || !strcmp(chainmode, "xts"))
+		cc->bulk_crypto = 1;
+	else
+		cc->bulk_crypto = 0;
+
 	/* Allocate cipher */
 	ret = crypt_alloc_tfms(cc, cipher_api);
 	if (ret < 0) {