From patchwork Wed Mar 25 08:27:04 2020
X-Patchwork-Submitter: Jason Wang
X-Patchwork-Id: 221903
From: Jason Wang
To: mst@redhat.com, linux-kernel@vger.kernel.org, kvm@vger.kernel.org,
    virtualization@lists.linux-foundation.org, netdev@vger.kernel.org
Cc: jgg@mellanox.com, maxime.coquelin@redhat.com, cunming.liang@intel.com,
    zhihong.wang@intel.com, rob.miller@broadcom.com, xiao.w.wang@intel.com,
    lingshan.zhu@intel.com, eperezma@redhat.com, lulu@redhat.com,
    parav@mellanox.com, kevin.tian@intel.com, stefanha@redhat.com,
    rdunlap@infradead.org, hch@infradead.org, aadam@redhat.com,
    jiri@mellanox.com, shahafs@mellanox.com, hanand@xilinx.com,
    mhabets@solarflare.com, gdawar@xilinx.com, saugatm@xilinx.com,
    vmireyno@marvell.com, Jason Wang
Subject: [PATCH V8 2/9] vhost: allow per device message handler
Date: Wed, 25 Mar 2020 16:27:04 +0800
Message-Id: <20200325082711.1107-3-jasowang@redhat.com>
In-Reply-To: <20200325082711.1107-1-jasowang@redhat.com>
References: <20200325082711.1107-1-jasowang@redhat.com>

This patch allows a device to register its own message handler during
vhost_dev_init(). The vDPA device will use it to implement its own DMA
mapping logic.

Signed-off-by: Jason Wang
---
 drivers/vhost/net.c   |  3 ++-
 drivers/vhost/scsi.c  |  2 +-
 drivers/vhost/vhost.c | 12 ++++++++++--
 drivers/vhost/vhost.h |  6 +++++-
 drivers/vhost/vsock.c |  2 +-
 5 files changed, 19 insertions(+), 6 deletions(-)

diff --git a/drivers/vhost/net.c b/drivers/vhost/net.c
index e158159671fa..c8ab8d83b530 100644
--- a/drivers/vhost/net.c
+++ b/drivers/vhost/net.c
@@ -1324,7 +1324,8 @@ static int vhost_net_open(struct inode *inode, struct file *f)
 	}
 	vhost_dev_init(dev, vqs, VHOST_NET_VQ_MAX,
 		       UIO_MAXIOV + VHOST_NET_BATCH,
-		       VHOST_NET_PKT_WEIGHT, VHOST_NET_WEIGHT);
+		       VHOST_NET_PKT_WEIGHT, VHOST_NET_WEIGHT,
+		       NULL);
 
 	vhost_poll_init(n->poll + VHOST_NET_VQ_TX, handle_tx_net, EPOLLOUT, dev);
 	vhost_poll_init(n->poll + VHOST_NET_VQ_RX, handle_rx_net, EPOLLIN, dev);
diff --git a/drivers/vhost/scsi.c b/drivers/vhost/scsi.c
index 0b949a14bce3..7653667a8cdc 100644
--- a/drivers/vhost/scsi.c
+++ b/drivers/vhost/scsi.c
@@ -1628,7 +1628,7 @@ static int vhost_scsi_open(struct inode *inode, struct file *f)
 		vs->vqs[i].vq.handle_kick = vhost_scsi_handle_kick;
 	}
 	vhost_dev_init(&vs->dev, vqs, VHOST_SCSI_MAX_VQ, UIO_MAXIOV,
-		       VHOST_SCSI_WEIGHT, 0);
+		       VHOST_SCSI_WEIGHT, 0, NULL);
 
 	vhost_scsi_init_inflight(vs, NULL);
diff --git a/drivers/vhost/vhost.c b/drivers/vhost/vhost.c
index f44340b41494..8e9e2341e40a 100644
--- a/drivers/vhost/vhost.c
+++ b/drivers/vhost/vhost.c
@@ -457,7 +457,9 @@ static size_t vhost_get_desc_size(struct vhost_virtqueue *vq,
 
 void vhost_dev_init(struct vhost_dev *dev,
 		    struct vhost_virtqueue **vqs, int nvqs,
-		    int iov_limit, int weight, int byte_weight)
+		    int iov_limit, int weight, int byte_weight,
+		    int (*msg_handler)(struct vhost_dev *dev,
+				       struct vhost_iotlb_msg *msg))
 {
 	struct vhost_virtqueue *vq;
 	int i;
@@ -473,6 +475,7 @@ void vhost_dev_init(struct vhost_dev *dev,
 	dev->iov_limit = iov_limit;
 	dev->weight = weight;
 	dev->byte_weight = byte_weight;
+	dev->msg_handler = msg_handler;
 	init_llist_head(&dev->work_list);
 	init_waitqueue_head(&dev->wait);
 	INIT_LIST_HEAD(&dev->read_list);
@@ -1178,7 +1181,12 @@ ssize_t vhost_chr_write_iter(struct vhost_dev *dev,
 		ret = -EINVAL;
 		goto done;
 	}
-	if (vhost_process_iotlb_msg(dev, &msg)) {
+
+	if (dev->msg_handler)
+		ret = dev->msg_handler(dev, &msg);
+	else
+		ret = vhost_process_iotlb_msg(dev, &msg);
+	if (ret) {
 		ret = -EFAULT;
 		goto done;
 	}
diff --git a/drivers/vhost/vhost.h b/drivers/vhost/vhost.h
index a123fd70847e..f9d1a03dd153 100644
--- a/drivers/vhost/vhost.h
+++ b/drivers/vhost/vhost.h
@@ -174,11 +174,15 @@ struct vhost_dev {
 	int weight;
 	int byte_weight;
 	u64 kcov_handle;
+	int (*msg_handler)(struct vhost_dev *dev,
+			   struct vhost_iotlb_msg *msg);
 };
 
 bool vhost_exceeds_weight(struct vhost_virtqueue *vq, int pkts, int total_len);
 void vhost_dev_init(struct vhost_dev *, struct vhost_virtqueue **vqs,
-		    int nvqs, int iov_limit, int weight, int byte_weight);
+		    int nvqs, int iov_limit, int weight, int byte_weight,
+		    int (*msg_handler)(struct vhost_dev *dev,
+				       struct vhost_iotlb_msg *msg));
 long vhost_dev_set_owner(struct vhost_dev *dev);
 bool vhost_dev_has_owner(struct vhost_dev *dev);
 long vhost_dev_check_owner(struct vhost_dev *);
diff --git a/drivers/vhost/vsock.c b/drivers/vhost/vsock.c
index c2d7d57e98cf..97669484a3f6 100644
--- a/drivers/vhost/vsock.c
+++ b/drivers/vhost/vsock.c
@@ -621,7 +621,7 @@ static int vhost_vsock_dev_open(struct inode *inode, struct file *file)
 
 	vhost_dev_init(&vsock->dev, vqs, ARRAY_SIZE(vsock->vqs),
 		       UIO_MAXIOV, VHOST_VSOCK_PKT_WEIGHT,
-		       VHOST_VSOCK_WEIGHT);
+		       VHOST_VSOCK_WEIGHT, NULL);
 
 	file->private_data = vsock;
 	spin_lock_init(&vsock->send_pkt_list_lock);
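As an illustration (not part of this series), a backend that wants to bypass
the default IOTLB handling could pass its own handler through the new
vhost_dev_init() parameter and dispatch on the message type itself. A minimal
sketch, where my_backend_dev, my_dma_map() and my_dma_unmap() are hypothetical
placeholders and the weight arguments are arbitrary:

/* Hypothetical backend-specific IOTLB message handler; illustrative only. */
static int my_backend_process_iotlb_msg(struct vhost_dev *dev,
					 struct vhost_iotlb_msg *msg)
{
	struct my_backend_dev *b = container_of(dev, struct my_backend_dev, vdev);

	switch (msg->type) {
	case VHOST_IOTLB_UPDATE:
		/* Map iova..iova+size-1 to the userspace buffer at uaddr */
		return my_dma_map(b, msg->iova, msg->size, msg->uaddr, msg->perm);
	case VHOST_IOTLB_INVALIDATE:
		my_dma_unmap(b, msg->iova, msg->size);
		return 0;
	default:
		return -EINVAL;
	}
}

/* Registered at open time instead of passing NULL: */
vhost_dev_init(&b->vdev, vqs, nvqs, UIO_MAXIOV, 0, 0,
	       my_backend_process_iotlb_msg);

With such a handler in place, vhost_chr_write_iter() forwards every
VHOST_IOTLB_MSG_V2 message to the backend instead of the generic
vhost_process_iotlb_msg().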
From patchwork Wed Mar 25 08:27:06 2020
X-Patchwork-Submitter: Jason Wang
X-Patchwork-Id: 221902
From: Jason Wang
To: mst@redhat.com, linux-kernel@vger.kernel.org, kvm@vger.kernel.org,
    virtualization@lists.linux-foundation.org, netdev@vger.kernel.org
Cc: jgg@mellanox.com, maxime.coquelin@redhat.com, cunming.liang@intel.com,
    zhihong.wang@intel.com, rob.miller@broadcom.com, xiao.w.wang@intel.com,
    lingshan.zhu@intel.com, eperezma@redhat.com, lulu@redhat.com,
    parav@mellanox.com, kevin.tian@intel.com, stefanha@redhat.com,
    rdunlap@infradead.org, hch@infradead.org, aadam@redhat.com,
    jiri@mellanox.com, shahafs@mellanox.com, hanand@xilinx.com,
    mhabets@solarflare.com, gdawar@xilinx.com, saugatm@xilinx.com,
    vmireyno@marvell.com, Jason Wang
Subject: [PATCH V8 4/9] vringh: IOTLB support
Date: Wed, 25 Mar 2020 16:27:06 +0800
Message-Id: <20200325082711.1107-5-jasowang@redhat.com>
In-Reply-To: <20200325082711.1107-1-jasowang@redhat.com>
References: <20200325082711.1107-1-jasowang@redhat.com>

This patch implements a third memory accessor for vringh besides the
current kernel and userspace accessors. The idea is to allow vringh to
do address translation through an IOTLB, which is implemented via an
interval tree of vhost_iotlb_map entries. Users should set up the
IOVA-to-PA mappings in this IOTLB. This allows us to:

- Use vringh to access virtqueues with a vIOMMU
- Use vringh to implement software virtqueues for vDPA devices

Signed-off-by: Jason Wang
---
 drivers/vhost/Kconfig  |   1 +
 drivers/vhost/vringh.c | 421 ++++++++++++++++++++++++++++++++++++++---
 include/linux/vringh.h |  36 ++++
 3 files changed, 435 insertions(+), 23 deletions(-)

diff --git a/drivers/vhost/Kconfig b/drivers/vhost/Kconfig
index 2f6c2105b953..b03872886901 100644
--- a/drivers/vhost/Kconfig
+++ b/drivers/vhost/Kconfig
@@ -6,6 +6,7 @@ config VHOST_IOTLB
 
 config VHOST_RING
 	tristate
+	select VHOST_IOTLB
 	help
 	  This option is selected by any driver which needs to access
 	  the host side of a virtio ring.
diff --git a/drivers/vhost/vringh.c b/drivers/vhost/vringh.c
index a0a2d74967ef..ee0491f579ac 100644
--- a/drivers/vhost/vringh.c
+++ b/drivers/vhost/vringh.c
@@ -13,6 +13,9 @@
 #include
 #include
 #include
+#include
+#include
+#include
 #include
 
 static __printf(1,2) __cold void vringh_bad(const char *fmt, ...)
@@ -71,9 +74,11 @@ static inline int __vringh_get_head(const struct vringh *vrh,
 }
 
 /* Copy some bytes to/from the iovec. Returns num copied. */
-static inline ssize_t vringh_iov_xfer(struct vringh_kiov *iov,
+static inline ssize_t vringh_iov_xfer(struct vringh *vrh,
+				      struct vringh_kiov *iov,
 				      void *ptr, size_t len,
-				      int (*xfer)(void *addr, void *ptr,
+				      int (*xfer)(const struct vringh *vrh,
+						  void *addr, void *ptr,
 						  size_t len))
 {
 	int err, done = 0;
@@ -82,7 +87,7 @@ static inline ssize_t vringh_iov_xfer(struct vringh_kiov *iov,
 		size_t partlen;
 
 		partlen = min(iov->iov[iov->i].iov_len, len);
-		err = xfer(iov->iov[iov->i].iov_base, ptr, partlen);
+		err = xfer(vrh, iov->iov[iov->i].iov_base, ptr, partlen);
 		if (err)
 			return err;
 		done += partlen;
@@ -96,6 +101,7 @@ static inline ssize_t vringh_iov_xfer(struct vringh_kiov *iov,
 			/* Fix up old iov element then increment.
*/ iov->iov[iov->i].iov_len = iov->consumed; iov->iov[iov->i].iov_base -= iov->consumed; + iov->consumed = 0; iov->i++; @@ -227,7 +233,8 @@ static int slow_copy(struct vringh *vrh, void *dst, const void *src, u64 addr, struct vringh_range *r), struct vringh_range *range, - int (*copy)(void *dst, const void *src, size_t len)) + int (*copy)(const struct vringh *vrh, + void *dst, const void *src, size_t len)) { size_t part, len = sizeof(struct vring_desc); @@ -241,7 +248,7 @@ static int slow_copy(struct vringh *vrh, void *dst, const void *src, if (!rcheck(vrh, addr, &part, range, getrange)) return -EINVAL; - err = copy(dst, src, part); + err = copy(vrh, dst, src, part); if (err) return err; @@ -262,7 +269,8 @@ __vringh_iov(struct vringh *vrh, u16 i, struct vringh_range *)), bool (*getrange)(struct vringh *, u64, struct vringh_range *), gfp_t gfp, - int (*copy)(void *dst, const void *src, size_t len)) + int (*copy)(const struct vringh *vrh, + void *dst, const void *src, size_t len)) { int err, count = 0, up_next, desc_max; struct vring_desc desc, *descs; @@ -291,7 +299,7 @@ __vringh_iov(struct vringh *vrh, u16 i, err = slow_copy(vrh, &desc, &descs[i], rcheck, getrange, &slowrange, copy); else - err = copy(&desc, &descs[i], sizeof(desc)); + err = copy(vrh, &desc, &descs[i], sizeof(desc)); if (unlikely(err)) goto fail; @@ -404,7 +412,8 @@ static inline int __vringh_complete(struct vringh *vrh, unsigned int num_used, int (*putu16)(const struct vringh *vrh, __virtio16 *p, u16 val), - int (*putused)(struct vring_used_elem *dst, + int (*putused)(const struct vringh *vrh, + struct vring_used_elem *dst, const struct vring_used_elem *src, unsigned num)) { @@ -420,12 +429,12 @@ static inline int __vringh_complete(struct vringh *vrh, /* Compiler knows num_used == 1 sometimes, hence extra check */ if (num_used > 1 && unlikely(off + num_used >= vrh->vring.num)) { u16 part = vrh->vring.num - off; - err = putused(&used_ring->ring[off], used, part); + err = putused(vrh, &used_ring->ring[off], used, part); if (!err) - err = putused(&used_ring->ring[0], used + part, + err = putused(vrh, &used_ring->ring[0], used + part, num_used - part); } else - err = putused(&used_ring->ring[off], used, num_used); + err = putused(vrh, &used_ring->ring[off], used, num_used); if (err) { vringh_bad("Failed to write %u used entries %u at %p", @@ -564,13 +573,15 @@ static inline int putu16_user(const struct vringh *vrh, __virtio16 *p, u16 val) return put_user(v, (__force __virtio16 __user *)p); } -static inline int copydesc_user(void *dst, const void *src, size_t len) +static inline int copydesc_user(const struct vringh *vrh, + void *dst, const void *src, size_t len) { return copy_from_user(dst, (__force void __user *)src, len) ? -EFAULT : 0; } -static inline int putused_user(struct vring_used_elem *dst, +static inline int putused_user(const struct vringh *vrh, + struct vring_used_elem *dst, const struct vring_used_elem *src, unsigned int num) { @@ -578,13 +589,15 @@ static inline int putused_user(struct vring_used_elem *dst, sizeof(*dst) * num) ? -EFAULT : 0; } -static inline int xfer_from_user(void *src, void *dst, size_t len) +static inline int xfer_from_user(const struct vringh *vrh, void *src, + void *dst, size_t len) { return copy_from_user(dst, (__force void __user *)src, len) ? -EFAULT : 0; } -static inline int xfer_to_user(void *dst, void *src, size_t len) +static inline int xfer_to_user(const struct vringh *vrh, + void *dst, void *src, size_t len) { return copy_to_user((__force void __user *)dst, src, len) ? 
-EFAULT : 0; @@ -706,7 +719,7 @@ EXPORT_SYMBOL(vringh_getdesc_user); */ ssize_t vringh_iov_pull_user(struct vringh_iov *riov, void *dst, size_t len) { - return vringh_iov_xfer((struct vringh_kiov *)riov, + return vringh_iov_xfer(NULL, (struct vringh_kiov *)riov, dst, len, xfer_from_user); } EXPORT_SYMBOL(vringh_iov_pull_user); @@ -722,7 +735,7 @@ EXPORT_SYMBOL(vringh_iov_pull_user); ssize_t vringh_iov_push_user(struct vringh_iov *wiov, const void *src, size_t len) { - return vringh_iov_xfer((struct vringh_kiov *)wiov, + return vringh_iov_xfer(NULL, (struct vringh_kiov *)wiov, (void *)src, len, xfer_to_user); } EXPORT_SYMBOL(vringh_iov_push_user); @@ -832,13 +845,15 @@ static inline int putu16_kern(const struct vringh *vrh, __virtio16 *p, u16 val) return 0; } -static inline int copydesc_kern(void *dst, const void *src, size_t len) +static inline int copydesc_kern(const struct vringh *vrh, + void *dst, const void *src, size_t len) { memcpy(dst, src, len); return 0; } -static inline int putused_kern(struct vring_used_elem *dst, +static inline int putused_kern(const struct vringh *vrh, + struct vring_used_elem *dst, const struct vring_used_elem *src, unsigned int num) { @@ -846,13 +861,15 @@ static inline int putused_kern(struct vring_used_elem *dst, return 0; } -static inline int xfer_kern(void *src, void *dst, size_t len) +static inline int xfer_kern(const struct vringh *vrh, void *src, + void *dst, size_t len) { memcpy(dst, src, len); return 0; } -static inline int kern_xfer(void *dst, void *src, size_t len) +static inline int kern_xfer(const struct vringh *vrh, void *dst, + void *src, size_t len) { memcpy(dst, src, len); return 0; @@ -949,7 +966,7 @@ EXPORT_SYMBOL(vringh_getdesc_kern); */ ssize_t vringh_iov_pull_kern(struct vringh_kiov *riov, void *dst, size_t len) { - return vringh_iov_xfer(riov, dst, len, xfer_kern); + return vringh_iov_xfer(NULL, riov, dst, len, xfer_kern); } EXPORT_SYMBOL(vringh_iov_pull_kern); @@ -964,7 +981,7 @@ EXPORT_SYMBOL(vringh_iov_pull_kern); ssize_t vringh_iov_push_kern(struct vringh_kiov *wiov, const void *src, size_t len) { - return vringh_iov_xfer(wiov, (void *)src, len, kern_xfer); + return vringh_iov_xfer(NULL, wiov, (void *)src, len, kern_xfer); } EXPORT_SYMBOL(vringh_iov_push_kern); @@ -1042,4 +1059,362 @@ int vringh_need_notify_kern(struct vringh *vrh) } EXPORT_SYMBOL(vringh_need_notify_kern); +static int iotlb_translate(const struct vringh *vrh, + u64 addr, u64 len, struct bio_vec iov[], + int iov_size, u32 perm) +{ + struct vhost_iotlb_map *map; + struct vhost_iotlb *iotlb = vrh->iotlb; + int ret = 0; + u64 s = 0; + + while (len > s) { + u64 size, pa, pfn; + + if (unlikely(ret >= iov_size)) { + ret = -ENOBUFS; + break; + } + + map = vhost_iotlb_itree_first(iotlb, addr, + addr + len - 1); + if (!map || map->start > addr) { + ret = -EINVAL; + break; + } else if (!(map->perm & perm)) { + ret = -EPERM; + break; + } + + size = map->size - addr + map->start; + pa = map->addr + addr - map->start; + pfn = pa >> PAGE_SHIFT; + iov[ret].bv_page = pfn_to_page(pfn); + iov[ret].bv_len = min(len - s, size); + iov[ret].bv_offset = pa & (PAGE_SIZE - 1); + s += size; + addr += size; + ++ret; + } + + return ret; +} + +static inline int copy_from_iotlb(const struct vringh *vrh, void *dst, + void *src, size_t len) +{ + struct iov_iter iter; + struct bio_vec iov[16]; + int ret; + + ret = iotlb_translate(vrh, (u64)(uintptr_t)src, + len, iov, 16, VHOST_MAP_RO); + if (ret < 0) + return ret; + + iov_iter_bvec(&iter, READ, iov, ret, len); + + ret = copy_from_iter(dst, len, 
&iter); + + return ret; +} + +static inline int copy_to_iotlb(const struct vringh *vrh, void *dst, + void *src, size_t len) +{ + struct iov_iter iter; + struct bio_vec iov[16]; + int ret; + + ret = iotlb_translate(vrh, (u64)(uintptr_t)dst, + len, iov, 16, VHOST_MAP_WO); + if (ret < 0) + return ret; + + iov_iter_bvec(&iter, WRITE, iov, ret, len); + + return copy_to_iter(src, len, &iter); +} + +static inline int getu16_iotlb(const struct vringh *vrh, + u16 *val, const __virtio16 *p) +{ + struct bio_vec iov; + void *kaddr, *from; + int ret; + + /* Atomic read is needed for getu16 */ + ret = iotlb_translate(vrh, (u64)(uintptr_t)p, sizeof(*p), + &iov, 1, VHOST_MAP_RO); + if (ret < 0) + return ret; + + kaddr = kmap_atomic(iov.bv_page); + from = kaddr + iov.bv_offset; + *val = vringh16_to_cpu(vrh, READ_ONCE(*(__virtio16 *)from)); + kunmap_atomic(kaddr); + + return 0; +} + +static inline int putu16_iotlb(const struct vringh *vrh, + __virtio16 *p, u16 val) +{ + struct bio_vec iov; + void *kaddr, *to; + int ret; + + /* Atomic write is needed for putu16 */ + ret = iotlb_translate(vrh, (u64)(uintptr_t)p, sizeof(*p), + &iov, 1, VHOST_MAP_WO); + if (ret < 0) + return ret; + + kaddr = kmap_atomic(iov.bv_page); + to = kaddr + iov.bv_offset; + WRITE_ONCE(*(__virtio16 *)to, cpu_to_vringh16(vrh, val)); + kunmap_atomic(kaddr); + + return 0; +} + +static inline int copydesc_iotlb(const struct vringh *vrh, + void *dst, const void *src, size_t len) +{ + int ret; + + ret = copy_from_iotlb(vrh, dst, (void *)src, len); + if (ret != len) + return -EFAULT; + + return 0; +} + +static inline int xfer_from_iotlb(const struct vringh *vrh, void *src, + void *dst, size_t len) +{ + int ret; + + ret = copy_from_iotlb(vrh, dst, src, len); + if (ret != len) + return -EFAULT; + + return 0; +} + +static inline int xfer_to_iotlb(const struct vringh *vrh, + void *dst, void *src, size_t len) +{ + int ret; + + ret = copy_to_iotlb(vrh, dst, src, len); + if (ret != len) + return -EFAULT; + + return 0; +} + +static inline int putused_iotlb(const struct vringh *vrh, + struct vring_used_elem *dst, + const struct vring_used_elem *src, + unsigned int num) +{ + int size = num * sizeof(*dst); + int ret; + + ret = copy_to_iotlb(vrh, dst, (void *)src, num * sizeof(*dst)); + if (ret != size) + return -EFAULT; + + return 0; +} + +/** + * vringh_init_iotlb - initialize a vringh for a ring with IOTLB. + * @vrh: the vringh to initialize. + * @features: the feature bits for this ring. + * @num: the number of elements. + * @weak_barriers: true if we only need memory barriers, not I/O. + * @desc: the userpace descriptor pointer. + * @avail: the userpace avail pointer. + * @used: the userpace used pointer. + * + * Returns an error if num is invalid. + */ +int vringh_init_iotlb(struct vringh *vrh, u64 features, + unsigned int num, bool weak_barriers, + struct vring_desc *desc, + struct vring_avail *avail, + struct vring_used *used) +{ + return vringh_init_kern(vrh, features, num, weak_barriers, + desc, avail, used); +} +EXPORT_SYMBOL(vringh_init_iotlb); + +/** + * vringh_set_iotlb - initialize a vringh for a ring with IOTLB. + * @vrh: the vring + * @iotlb: iotlb associated with this vring + */ +void vringh_set_iotlb(struct vringh *vrh, struct vhost_iotlb *iotlb) +{ + vrh->iotlb = iotlb; +} +EXPORT_SYMBOL(vringh_set_iotlb); + +/** + * vringh_getdesc_iotlb - get next available descriptor from ring with + * IOTLB. + * @vrh: the kernelspace vring. 
+ * @riov: where to put the readable descriptors (or NULL) + * @wiov: where to put the writable descriptors (or NULL) + * @head: head index we received, for passing to vringh_complete_iotlb(). + * @gfp: flags for allocating larger riov/wiov. + * + * Returns 0 if there was no descriptor, 1 if there was, or -errno. + * + * Note that on error return, you can tell the difference between an + * invalid ring and a single invalid descriptor: in the former case, + * *head will be vrh->vring.num. You may be able to ignore an invalid + * descriptor, but there's not much you can do with an invalid ring. + * + * Note that you may need to clean up riov and wiov, even on error! + */ +int vringh_getdesc_iotlb(struct vringh *vrh, + struct vringh_kiov *riov, + struct vringh_kiov *wiov, + u16 *head, + gfp_t gfp) +{ + int err; + + err = __vringh_get_head(vrh, getu16_iotlb, &vrh->last_avail_idx); + if (err < 0) + return err; + + /* Empty... */ + if (err == vrh->vring.num) + return 0; + + *head = err; + err = __vringh_iov(vrh, *head, riov, wiov, no_range_check, NULL, + gfp, copydesc_iotlb); + if (err) + return err; + + return 1; +} +EXPORT_SYMBOL(vringh_getdesc_iotlb); + +/** + * vringh_iov_pull_iotlb - copy bytes from vring_iov. + * @vrh: the vring. + * @riov: the riov as passed to vringh_getdesc_iotlb() (updated as we consume) + * @dst: the place to copy. + * @len: the maximum length to copy. + * + * Returns the bytes copied <= len or a negative errno. + */ +ssize_t vringh_iov_pull_iotlb(struct vringh *vrh, + struct vringh_kiov *riov, + void *dst, size_t len) +{ + return vringh_iov_xfer(vrh, riov, dst, len, xfer_from_iotlb); +} +EXPORT_SYMBOL(vringh_iov_pull_iotlb); + +/** + * vringh_iov_push_iotlb - copy bytes into vring_iov. + * @vrh: the vring. + * @wiov: the wiov as passed to vringh_getdesc_iotlb() (updated as we consume) + * @dst: the place to copy. + * @len: the maximum length to copy. + * + * Returns the bytes copied <= len or a negative errno. + */ +ssize_t vringh_iov_push_iotlb(struct vringh *vrh, + struct vringh_kiov *wiov, + const void *src, size_t len) +{ + return vringh_iov_xfer(vrh, wiov, (void *)src, len, xfer_to_iotlb); +} +EXPORT_SYMBOL(vringh_iov_push_iotlb); + +/** + * vringh_abandon_iotlb - we've decided not to handle the descriptor(s). + * @vrh: the vring. + * @num: the number of descriptors to put back (ie. num + * vringh_get_iotlb() to undo). + * + * The next vringh_get_iotlb() will return the old descriptor(s) again. + */ +void vringh_abandon_iotlb(struct vringh *vrh, unsigned int num) +{ + /* We only update vring_avail_event(vr) when we want to be notified, + * so we haven't changed that yet. + */ + vrh->last_avail_idx -= num; +} +EXPORT_SYMBOL(vringh_abandon_iotlb); + +/** + * vringh_complete_iotlb - we've finished with descriptor, publish it. + * @vrh: the vring. + * @head: the head as filled in by vringh_getdesc_iotlb. + * @len: the length of data we have written. + * + * You should check vringh_need_notify_iotlb() after one or more calls + * to this function. + */ +int vringh_complete_iotlb(struct vringh *vrh, u16 head, u32 len) +{ + struct vring_used_elem used; + + used.id = cpu_to_vringh32(vrh, head); + used.len = cpu_to_vringh32(vrh, len); + + return __vringh_complete(vrh, &used, 1, putu16_iotlb, putused_iotlb); +} +EXPORT_SYMBOL(vringh_complete_iotlb); + +/** + * vringh_notify_enable_iotlb - we want to know if something changes. + * @vrh: the vring. + * + * This always enables notifications, but returns false if there are + * now more buffers available in the vring. 
+ */ +bool vringh_notify_enable_iotlb(struct vringh *vrh) +{ + return __vringh_notify_enable(vrh, getu16_iotlb, putu16_iotlb); +} +EXPORT_SYMBOL(vringh_notify_enable_iotlb); + +/** + * vringh_notify_disable_iotlb - don't tell us if something changes. + * @vrh: the vring. + * + * This is our normal running state: we disable and then only enable when + * we're going to sleep. + */ +void vringh_notify_disable_iotlb(struct vringh *vrh) +{ + __vringh_notify_disable(vrh, putu16_iotlb); +} +EXPORT_SYMBOL(vringh_notify_disable_iotlb); + +/** + * vringh_need_notify_iotlb - must we tell the other side about used buffers? + * @vrh: the vring we've called vringh_complete_iotlb() on. + * + * Returns -errno or 0 if we don't need to tell the other side, 1 if we do. + */ +int vringh_need_notify_iotlb(struct vringh *vrh) +{ + return __vringh_need_notify(vrh, getu16_iotlb); +} +EXPORT_SYMBOL(vringh_need_notify_iotlb); + + MODULE_LICENSE("GPL"); diff --git a/include/linux/vringh.h b/include/linux/vringh.h index d237087eb257..bd0503ca6f8f 100644 --- a/include/linux/vringh.h +++ b/include/linux/vringh.h @@ -14,6 +14,8 @@ #include #include #include +#include +#include #include /* virtio_ring with information needed for host access. */ @@ -39,6 +41,9 @@ struct vringh { /* The vring (note: it may contain user pointers!) */ struct vring vring; + /* IOTLB for this vring */ + struct vhost_iotlb *iotlb; + /* The function to call to notify the guest about added buffers */ void (*notify)(struct vringh *); }; @@ -248,4 +253,35 @@ static inline __virtio64 cpu_to_vringh64(const struct vringh *vrh, u64 val) { return __cpu_to_virtio64(vringh_is_little_endian(vrh), val); } + +void vringh_set_iotlb(struct vringh *vrh, struct vhost_iotlb *iotlb); + +int vringh_init_iotlb(struct vringh *vrh, u64 features, + unsigned int num, bool weak_barriers, + struct vring_desc *desc, + struct vring_avail *avail, + struct vring_used *used); + +int vringh_getdesc_iotlb(struct vringh *vrh, + struct vringh_kiov *riov, + struct vringh_kiov *wiov, + u16 *head, + gfp_t gfp); + +ssize_t vringh_iov_pull_iotlb(struct vringh *vrh, + struct vringh_kiov *riov, + void *dst, size_t len); +ssize_t vringh_iov_push_iotlb(struct vringh *vrh, + struct vringh_kiov *wiov, + const void *src, size_t len); + +void vringh_abandon_iotlb(struct vringh *vrh, unsigned int num); + +int vringh_complete_iotlb(struct vringh *vrh, u16 head, u32 len); + +bool vringh_notify_enable_iotlb(struct vringh *vrh); +void vringh_notify_disable_iotlb(struct vringh *vrh); + +int vringh_need_notify_iotlb(struct vringh *vrh); + #endif /* _LINUX_VRINGH_H */ From patchwork Wed Mar 25 08:27:08 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Jason Wang X-Patchwork-Id: 221901 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-6.9 required=3.0 tests=DKIMWL_WL_HIGH, DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, HEADER_FROM_DIFFERENT_DOMAINS, INCLUDES_PATCH, MAILING_LIST_MULTI, SIGNED_OFF_BY, SPF_HELO_NONE, SPF_PASS autolearn=unavailable autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id DDA0DC54FD0 for ; Wed, 25 Mar 2020 08:33:44 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.kernel.org (Postfix) with ESMTP id A52FD20409 for ; Wed, 25 Mar 2020 08:33:44 +0000 (UTC) Authentication-Results: 
From: Jason Wang
To: mst@redhat.com, linux-kernel@vger.kernel.org, kvm@vger.kernel.org,
    virtualization@lists.linux-foundation.org, netdev@vger.kernel.org
Cc: jgg@mellanox.com, maxime.coquelin@redhat.com, cunming.liang@intel.com,
    zhihong.wang@intel.com, rob.miller@broadcom.com, xiao.w.wang@intel.com,
    lingshan.zhu@intel.com, eperezma@redhat.com, lulu@redhat.com,
    parav@mellanox.com, kevin.tian@intel.com, stefanha@redhat.com,
    rdunlap@infradead.org, hch@infradead.org, aadam@redhat.com,
    jiri@mellanox.com, shahafs@mellanox.com, hanand@xilinx.com,
    mhabets@solarflare.com, gdawar@xilinx.com, saugatm@xilinx.com,
    vmireyno@marvell.com, Jason Wang
Subject: [PATCH V8 6/9] virtio: introduce a vDPA based transport
Date: Wed, 25 Mar 2020 16:27:08 +0800
Message-Id: <20200325082711.1107-7-jasowang@redhat.com>
In-Reply-To: <20200325082711.1107-1-jasowang@redhat.com>
References: <20200325082711.1107-1-jasowang@redhat.com>

This patch introduces a vDPA transport for virtio. It lets the kernel
virtio drivers drive a vDPA device that is capable of populating the
virtqueues directly. A new virtio-vdpa driver is registered on the vDPA
bus; when a new vDPA device is probed, the driver registers a virtio
device whose config ops are backed by the vDPA config ops. This means
it is a software transport between the vDPA bus driver and the vDPA
device, implemented on top of the config ops provided by the vDPA
parent.
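To make that concrete, a parent driver exposes a struct vdpa_config_ops and
this transport simply forwards each virtio config operation to it. An
illustrative parent-side fragment (my_parent and its net_config field are
hypothetical, not from this series):

/* Hypothetical parent-side sketch; only get_config is shown, the other
 * ops (set_vq_address, kick_vq, get_features, ...) follow the same idea.
 */
static void my_get_config(struct vdpa_device *vdpa, unsigned int offset,
			  void *buf, unsigned int len)
{
	struct my_parent *p = container_of(vdpa, struct my_parent, vdpa);

	/* virtio_vdpa_get() below lands here via ops->get_config() */
	memcpy(buf, (u8 *)&p->net_config + offset, len);
}

static const struct vdpa_config_ops my_config_ops = {
	.get_config = my_get_config,
	/* ... */
};

So a config read issued by e.g. virtio-net ends up as
vdev->config->get() -> virtio_vdpa_get() -> ops->get_config() on the parent.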
Signed-off-by: Jason Wang --- drivers/virtio/Kconfig | 13 ++ drivers/virtio/Makefile | 1 + drivers/virtio/virtio_vdpa.c | 396 +++++++++++++++++++++++++++++++++++ 3 files changed, 410 insertions(+) create mode 100644 drivers/virtio/virtio_vdpa.c diff --git a/drivers/virtio/Kconfig b/drivers/virtio/Kconfig index 9c4fdb64d9ac..99e424570644 100644 --- a/drivers/virtio/Kconfig +++ b/drivers/virtio/Kconfig @@ -43,6 +43,19 @@ config VIRTIO_PCI_LEGACY If unsure, say Y. +config VIRTIO_VDPA + tristate "vDPA driver for virtio devices" + select VDPA + select VIRTIO + help + This driver provides support for virtio based paravirtual + device driver over vDPA bus. For this to be useful, you need + an appropriate vDPA device implementation that operates on a + physical device to allow the datapath of virtio to be + offloaded to hardware. + + If unsure, say M. + config VIRTIO_PMEM tristate "Support for virtio pmem driver" depends on VIRTIO diff --git a/drivers/virtio/Makefile b/drivers/virtio/Makefile index fdf5eacd0d0a..3407ac03fe60 100644 --- a/drivers/virtio/Makefile +++ b/drivers/virtio/Makefile @@ -6,4 +6,5 @@ virtio_pci-y := virtio_pci_modern.o virtio_pci_common.o virtio_pci-$(CONFIG_VIRTIO_PCI_LEGACY) += virtio_pci_legacy.o obj-$(CONFIG_VIRTIO_BALLOON) += virtio_balloon.o obj-$(CONFIG_VIRTIO_INPUT) += virtio_input.o +obj-$(CONFIG_VIRTIO_VDPA) += virtio_vdpa.o obj-$(CONFIG_VDPA) += vdpa/ diff --git a/drivers/virtio/virtio_vdpa.c b/drivers/virtio/virtio_vdpa.c new file mode 100644 index 000000000000..c30eb55030be --- /dev/null +++ b/drivers/virtio/virtio_vdpa.c @@ -0,0 +1,396 @@ +// SPDX-License-Identifier: GPL-2.0-only +/* + * VIRTIO based driver for vDPA device + * + * Copyright (c) 2020, Red Hat. All rights reserved. + * Author: Jason Wang + * + */ + +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include + +#define MOD_VERSION "0.1" +#define MOD_AUTHOR "Jason Wang " +#define MOD_DESC "vDPA bus driver for virtio devices" +#define MOD_LICENSE "GPL v2" + +struct virtio_vdpa_device { + struct virtio_device vdev; + struct vdpa_device *vdpa; + u64 features; + + /* The lock to protect virtqueue list */ + spinlock_t lock; + /* List of virtio_vdpa_vq_info */ + struct list_head virtqueues; +}; + +struct virtio_vdpa_vq_info { + /* the actual virtqueue */ + struct virtqueue *vq; + + /* the list node for the virtqueues list */ + struct list_head node; +}; + +static inline struct virtio_vdpa_device * +to_virtio_vdpa_device(struct virtio_device *dev) +{ + return container_of(dev, struct virtio_vdpa_device, vdev); +} + +static struct vdpa_device *vd_get_vdpa(struct virtio_device *vdev) +{ + return to_virtio_vdpa_device(vdev)->vdpa; +} + +static void virtio_vdpa_get(struct virtio_device *vdev, unsigned offset, + void *buf, unsigned len) +{ + struct vdpa_device *vdpa = vd_get_vdpa(vdev); + const struct vdpa_config_ops *ops = vdpa->config; + + ops->get_config(vdpa, offset, buf, len); +} + +static void virtio_vdpa_set(struct virtio_device *vdev, unsigned offset, + const void *buf, unsigned len) +{ + struct vdpa_device *vdpa = vd_get_vdpa(vdev); + const struct vdpa_config_ops *ops = vdpa->config; + + ops->set_config(vdpa, offset, buf, len); +} + +static u32 virtio_vdpa_generation(struct virtio_device *vdev) +{ + struct vdpa_device *vdpa = vd_get_vdpa(vdev); + const struct vdpa_config_ops *ops = vdpa->config; + + if (ops->get_generation) + return ops->get_generation(vdpa); + + return 0; +} + +static u8 virtio_vdpa_get_status(struct virtio_device *vdev) +{ + struct vdpa_device 
*vdpa = vd_get_vdpa(vdev); + const struct vdpa_config_ops *ops = vdpa->config; + + return ops->get_status(vdpa); +} + +static void virtio_vdpa_set_status(struct virtio_device *vdev, u8 status) +{ + struct vdpa_device *vdpa = vd_get_vdpa(vdev); + const struct vdpa_config_ops *ops = vdpa->config; + + return ops->set_status(vdpa, status); +} + +static void virtio_vdpa_reset(struct virtio_device *vdev) +{ + struct vdpa_device *vdpa = vd_get_vdpa(vdev); + const struct vdpa_config_ops *ops = vdpa->config; + + return ops->set_status(vdpa, 0); +} + +static bool virtio_vdpa_notify(struct virtqueue *vq) +{ + struct vdpa_device *vdpa = vd_get_vdpa(vq->vdev); + const struct vdpa_config_ops *ops = vdpa->config; + + ops->kick_vq(vdpa, vq->index); + + return true; +} + +static irqreturn_t virtio_vdpa_config_cb(void *private) +{ + struct virtio_vdpa_device *vd_dev = private; + + virtio_config_changed(&vd_dev->vdev); + + return IRQ_HANDLED; +} + +static irqreturn_t virtio_vdpa_virtqueue_cb(void *private) +{ + struct virtio_vdpa_vq_info *info = private; + + return vring_interrupt(0, info->vq); +} + +static struct virtqueue * +virtio_vdpa_setup_vq(struct virtio_device *vdev, unsigned int index, + void (*callback)(struct virtqueue *vq), + const char *name, bool ctx) +{ + struct virtio_vdpa_device *vd_dev = to_virtio_vdpa_device(vdev); + struct vdpa_device *vdpa = vd_get_vdpa(vdev); + const struct vdpa_config_ops *ops = vdpa->config; + struct virtio_vdpa_vq_info *info; + struct vdpa_callback cb; + struct virtqueue *vq; + u64 desc_addr, driver_addr, device_addr; + unsigned long flags; + u32 align, num; + int err; + + if (!name) + return NULL; + + /* Queue shouldn't already be set up. */ + if (ops->get_vq_ready(vdpa, index)) + return ERR_PTR(-ENOENT); + + /* Allocate and fill out our active queue description */ + info = kmalloc(sizeof(*info), GFP_KERNEL); + if (!info) + return ERR_PTR(-ENOMEM); + + num = ops->get_vq_num_max(vdpa); + if (num == 0) { + err = -ENOENT; + goto error_new_virtqueue; + } + + /* Create the vring */ + align = ops->get_vq_align(vdpa); + vq = vring_create_virtqueue(index, num, align, vdev, + true, true, ctx, + virtio_vdpa_notify, callback, name); + if (!vq) { + err = -ENOMEM; + goto error_new_virtqueue; + } + + /* Setup virtqueue callback */ + cb.callback = virtio_vdpa_virtqueue_cb; + cb.private = info; + ops->set_vq_cb(vdpa, index, &cb); + ops->set_vq_num(vdpa, index, virtqueue_get_vring_size(vq)); + + desc_addr = virtqueue_get_desc_addr(vq); + driver_addr = virtqueue_get_avail_addr(vq); + device_addr = virtqueue_get_used_addr(vq); + + if (ops->set_vq_address(vdpa, index, + desc_addr, driver_addr, + device_addr)) { + err = -EINVAL; + goto err_vq; + } + + ops->set_vq_ready(vdpa, index, 1); + + vq->priv = info; + info->vq = vq; + + spin_lock_irqsave(&vd_dev->lock, flags); + list_add(&info->node, &vd_dev->virtqueues); + spin_unlock_irqrestore(&vd_dev->lock, flags); + + return vq; + +err_vq: + vring_del_virtqueue(vq); +error_new_virtqueue: + ops->set_vq_ready(vdpa, index, 0); + /* VDPA driver should make sure vq is stopeed here */ + WARN_ON(ops->get_vq_ready(vdpa, index)); + kfree(info); + return ERR_PTR(err); +} + +static void virtio_vdpa_del_vq(struct virtqueue *vq) +{ + struct virtio_vdpa_device *vd_dev = to_virtio_vdpa_device(vq->vdev); + struct vdpa_device *vdpa = vd_dev->vdpa; + const struct vdpa_config_ops *ops = vdpa->config; + struct virtio_vdpa_vq_info *info = vq->priv; + unsigned int index = vq->index; + unsigned long flags; + + spin_lock_irqsave(&vd_dev->lock, flags); + 
list_del(&info->node); + spin_unlock_irqrestore(&vd_dev->lock, flags); + + /* Select and deactivate the queue */ + ops->set_vq_ready(vdpa, index, 0); + WARN_ON(ops->get_vq_ready(vdpa, index)); + + vring_del_virtqueue(vq); + + kfree(info); +} + +static void virtio_vdpa_del_vqs(struct virtio_device *vdev) +{ + struct virtqueue *vq, *n; + + list_for_each_entry_safe(vq, n, &vdev->vqs, list) + virtio_vdpa_del_vq(vq); +} + +static int virtio_vdpa_find_vqs(struct virtio_device *vdev, unsigned nvqs, + struct virtqueue *vqs[], + vq_callback_t *callbacks[], + const char * const names[], + const bool *ctx, + struct irq_affinity *desc) +{ + struct virtio_vdpa_device *vd_dev = to_virtio_vdpa_device(vdev); + struct vdpa_device *vdpa = vd_get_vdpa(vdev); + const struct vdpa_config_ops *ops = vdpa->config; + struct vdpa_callback cb; + int i, err, queue_idx = 0; + + for (i = 0; i < nvqs; ++i) { + if (!names[i]) { + vqs[i] = NULL; + continue; + } + + vqs[i] = virtio_vdpa_setup_vq(vdev, queue_idx++, + callbacks[i], names[i], ctx ? + ctx[i] : false); + if (IS_ERR(vqs[i])) { + err = PTR_ERR(vqs[i]); + goto err_setup_vq; + } + } + + cb.callback = virtio_vdpa_config_cb; + cb.private = vd_dev; + ops->set_config_cb(vdpa, &cb); + + return 0; + +err_setup_vq: + virtio_vdpa_del_vqs(vdev); + return err; +} + +static u64 virtio_vdpa_get_features(struct virtio_device *vdev) +{ + struct vdpa_device *vdpa = vd_get_vdpa(vdev); + const struct vdpa_config_ops *ops = vdpa->config; + + return ops->get_features(vdpa); +} + +static int virtio_vdpa_finalize_features(struct virtio_device *vdev) +{ + struct vdpa_device *vdpa = vd_get_vdpa(vdev); + const struct vdpa_config_ops *ops = vdpa->config; + + /* Give virtio_ring a chance to accept features. */ + vring_transport_features(vdev); + + return ops->set_features(vdpa, vdev->features); +} + +static const char *virtio_vdpa_bus_name(struct virtio_device *vdev) +{ + struct virtio_vdpa_device *vd_dev = to_virtio_vdpa_device(vdev); + struct vdpa_device *vdpa = vd_dev->vdpa; + + return dev_name(&vdpa->dev); +} + +static const struct virtio_config_ops virtio_vdpa_config_ops = { + .get = virtio_vdpa_get, + .set = virtio_vdpa_set, + .generation = virtio_vdpa_generation, + .get_status = virtio_vdpa_get_status, + .set_status = virtio_vdpa_set_status, + .reset = virtio_vdpa_reset, + .find_vqs = virtio_vdpa_find_vqs, + .del_vqs = virtio_vdpa_del_vqs, + .get_features = virtio_vdpa_get_features, + .finalize_features = virtio_vdpa_finalize_features, + .bus_name = virtio_vdpa_bus_name, +}; + +static void virtio_vdpa_release_dev(struct device *_d) +{ + struct virtio_device *vdev = + container_of(_d, struct virtio_device, dev); + struct virtio_vdpa_device *vd_dev = + container_of(vdev, struct virtio_vdpa_device, vdev); + + kfree(vd_dev); +} + +static int virtio_vdpa_probe(struct vdpa_device *vdpa) +{ + const struct vdpa_config_ops *ops = vdpa->config; + struct virtio_vdpa_device *vd_dev, *reg_dev = NULL; + int ret = -EINVAL; + + vd_dev = kzalloc(sizeof(*vd_dev), GFP_KERNEL); + if (!vd_dev) + return -ENOMEM; + + vd_dev->vdev.dev.parent = vdpa_get_dma_dev(vdpa); + vd_dev->vdev.dev.release = virtio_vdpa_release_dev; + vd_dev->vdev.config = &virtio_vdpa_config_ops; + vd_dev->vdpa = vdpa; + INIT_LIST_HEAD(&vd_dev->virtqueues); + spin_lock_init(&vd_dev->lock); + + vd_dev->vdev.id.device = ops->get_device_id(vdpa); + if (vd_dev->vdev.id.device == 0) + goto err; + + vd_dev->vdev.id.vendor = ops->get_vendor_id(vdpa); + ret = register_virtio_device(&vd_dev->vdev); + reg_dev = vd_dev; + if (ret) + goto err; + + 
vdpa_set_drvdata(vdpa, vd_dev);
+	return 0;
+
+err:
+	if (reg_dev)
+		put_device(&vd_dev->vdev.dev);
+	else
+		kfree(vd_dev);
+	return ret;
+}
+
+static void virtio_vdpa_remove(struct vdpa_device *vdpa)
+{
+	struct virtio_vdpa_device *vd_dev = vdpa_get_drvdata(vdpa);
+
+	unregister_virtio_device(&vd_dev->vdev);
+}
+
+static struct vdpa_driver virtio_vdpa_driver = {
+	.driver = {
+		.name = "virtio_vdpa",
+	},
+	.probe = virtio_vdpa_probe,
+	.remove = virtio_vdpa_remove,
+};
+
+module_vdpa_driver(virtio_vdpa_driver);
+
+MODULE_VERSION(MOD_VERSION);
+MODULE_LICENSE(MOD_LICENSE);
+MODULE_AUTHOR(MOD_AUTHOR);
+MODULE_DESCRIPTION(MOD_DESC);
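Before the simulator in the next patch, it may help to see the IOTLB
accessors from patch 4/9 in one place. A condensed, illustrative sketch of
the pattern the simulator's TX path follows (not a complete driver; it
assumes the ring was set up with vringh_init_iotlb()/vringh_set_iotlb() and
populated via vhost_iotlb_add_range(), and that buf is a PAGE_SIZE scratch
buffer):

/* Illustrative only: drain one descriptor chain through the IOTLB. */
static void demo_drain_one_desc(struct vringh *vrh, void *buf)
{
	struct vringh_kiov iov = {};
	ssize_t read;
	u16 head;

	if (vringh_getdesc_iotlb(vrh, &iov, NULL, &head, GFP_ATOMIC) <= 0)
		return;			/* empty ring or error */

	read = vringh_iov_pull_iotlb(vrh, &iov, buf, PAGE_SIZE);
	/* ... do something with the 'read' bytes ... */

	vringh_complete_iotlb(vrh, head, 0);	/* publish to the used ring */
}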
From patchwork Wed Mar 25 08:27:10 2020
X-Patchwork-Submitter: Jason Wang
X-Patchwork-Id: 221900
From: Jason Wang
To: mst@redhat.com, linux-kernel@vger.kernel.org, kvm@vger.kernel.org,
    virtualization@lists.linux-foundation.org, netdev@vger.kernel.org
Cc: jgg@mellanox.com, maxime.coquelin@redhat.com, cunming.liang@intel.com,
    zhihong.wang@intel.com, rob.miller@broadcom.com, xiao.w.wang@intel.com,
    lingshan.zhu@intel.com, eperezma@redhat.com, lulu@redhat.com,
    parav@mellanox.com, kevin.tian@intel.com, stefanha@redhat.com,
    rdunlap@infradead.org, hch@infradead.org, aadam@redhat.com,
    jiri@mellanox.com, shahafs@mellanox.com, hanand@xilinx.com,
    mhabets@solarflare.com, gdawar@xilinx.com, saugatm@xilinx.com,
    vmireyno@marvell.com, Jason Wang
Subject: [PATCH V8 8/9] vdpasim: vDPA device simulator
Date: Wed, 25 Mar 2020 16:27:10 +0800
Message-Id: <20200325082711.1107-9-jasowang@redhat.com>
In-Reply-To: <20200325082711.1107-1-jasowang@redhat.com>
References: <20200325082711.1107-1-jasowang@redhat.com>

This patch implements a software vDPA networking device. The datapath
is implemented through vringh and a workqueue. The device has an
on-chip IOMMU which translates IOVA to PA. For kernel virtio drivers,
the vDPA simulator driver provides dma_ops. For vhost drivers, the
set_map() method of vdpa_config_ops is implemented to accept mappings
from vhost.

Currently, the vDPA device simulator loops TX traffic back to RX, so
the main use cases for the device are vDPA feature testing, prototyping
and development.

Note that there's no management API implemented yet; a vDPA device is
registered as soon as the module is probed. We need to handle this in
future development.

Signed-off-by: Jason Wang
---
 drivers/virtio/vdpa/Kconfig             |  19 +
 drivers/virtio/vdpa/Makefile            |   1 +
 drivers/virtio/vdpa/vdpa_sim/Makefile   |   2 +
 drivers/virtio/vdpa/vdpa_sim/vdpa_sim.c | 629 ++++++++++++++++++++++++
 4 files changed, 651 insertions(+)
 create mode 100644 drivers/virtio/vdpa/vdpa_sim/Makefile
 create mode 100644 drivers/virtio/vdpa/vdpa_sim/vdpa_sim.c

diff --git a/drivers/virtio/vdpa/Kconfig b/drivers/virtio/vdpa/Kconfig
index 351617723d12..ae204465964e 100644
--- a/drivers/virtio/vdpa/Kconfig
+++ b/drivers/virtio/vdpa/Kconfig
@@ -5,3 +5,22 @@ config VDPA
 	  Enable this module to support vDPA device that uses a
 	  datapath which complies with virtio specifications with
 	  vendor specific control path.
+
+menuconfig VDPA_MENU
+	bool "VDPA drivers"
+	default n
+
+if VDPA_MENU
+
+config VDPA_SIM
+	tristate "vDPA device simulator"
+	depends on RUNTIME_TESTING_MENU
+	select VDPA
+	select VHOST_RING
+	default n
+	help
+	  vDPA networking device simulator which loops TX traffic back
+	  to RX. This device is used for testing, prototyping and
+	  development of vDPA.
+
+endif # VDPA_MENU
diff --git a/drivers/virtio/vdpa/Makefile b/drivers/virtio/vdpa/Makefile
index ee6a35e8a4fb..3814af8e097b 100644
--- a/drivers/virtio/vdpa/Makefile
+++ b/drivers/virtio/vdpa/Makefile
@@ -1,2 +1,3 @@
 # SPDX-License-Identifier: GPL-2.0
 obj-$(CONFIG_VDPA) += vdpa.o
+obj-$(CONFIG_VDPA_SIM) += vdpa_sim/
diff --git a/drivers/virtio/vdpa/vdpa_sim/Makefile b/drivers/virtio/vdpa/vdpa_sim/Makefile
new file mode 100644
index 000000000000..b40278f65e04
--- /dev/null
+++ b/drivers/virtio/vdpa/vdpa_sim/Makefile
@@ -0,0 +1,2 @@
+# SPDX-License-Identifier: GPL-2.0
+obj-$(CONFIG_VDPA_SIM) += vdpa_sim.o
diff --git a/drivers/virtio/vdpa/vdpa_sim/vdpa_sim.c b/drivers/virtio/vdpa/vdpa_sim/vdpa_sim.c
new file mode 100644
index 000000000000..a13550a4c707
--- /dev/null
+++ b/drivers/virtio/vdpa/vdpa_sim/vdpa_sim.c
@@ -0,0 +1,629 @@
+// SPDX-License-Identifier: GPL-2.0-only
+/*
+ * VDPA networking device simulator.
+ *
+ * Copyright (c) 2020, Red Hat Inc. All rights reserved.
+ * Author: Jason Wang + * + */ + +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include + +#define DRV_VERSION "0.1" +#define DRV_AUTHOR "Jason Wang " +#define DRV_DESC "vDPA Device Simulator" +#define DRV_LICENSE "GPL v2" + +struct vdpasim_virtqueue { + struct vringh vring; + struct vringh_kiov iov; + unsigned short head; + bool ready; + u64 desc_addr; + u64 device_addr; + u64 driver_addr; + u32 num; + void *private; + irqreturn_t (*cb)(void *data); +}; + +#define VDPASIM_QUEUE_ALIGN PAGE_SIZE +#define VDPASIM_QUEUE_MAX 256 +#define VDPASIM_DEVICE_ID 0x1 +#define VDPASIM_VENDOR_ID 0 +#define VDPASIM_VQ_NUM 0x2 +#define VDPASIM_NAME "vdpasim-netdev" + +static u64 vdpasim_features = (1ULL << VIRTIO_F_ANY_LAYOUT) | + (1ULL << VIRTIO_F_VERSION_1) | + (1ULL << VIRTIO_F_IOMMU_PLATFORM); + +/* State of each vdpasim device */ +struct vdpasim { + struct vdpa_device vdpa; + struct vdpasim_virtqueue vqs[2]; + struct work_struct work; + /* spinlock to synchronize virtqueue state */ + spinlock_t lock; + struct virtio_net_config config; + struct vhost_iotlb *iommu; + void *buffer; + u32 status; + u32 generation; + u64 features; +}; + +static struct vdpasim *vdpasim_dev; + +static struct vdpasim *vdpa_to_sim(struct vdpa_device *vdpa) +{ + return container_of(vdpa, struct vdpasim, vdpa); +} + +static struct vdpasim *dev_to_sim(struct device *dev) +{ + struct vdpa_device *vdpa = dev_to_vdpa(dev); + + return vdpa_to_sim(vdpa); +} + +static void vdpasim_queue_ready(struct vdpasim *vdpasim, unsigned int idx) +{ + struct vdpasim_virtqueue *vq = &vdpasim->vqs[idx]; + int ret; + + ret = vringh_init_iotlb(&vq->vring, vdpasim_features, + VDPASIM_QUEUE_MAX, false, + (struct vring_desc *)(uintptr_t)vq->desc_addr, + (struct vring_avail *) + (uintptr_t)vq->driver_addr, + (struct vring_used *) + (uintptr_t)vq->device_addr); +} + +static void vdpasim_vq_reset(struct vdpasim_virtqueue *vq) +{ + vq->ready = 0; + vq->desc_addr = 0; + vq->driver_addr = 0; + vq->device_addr = 0; + vq->cb = NULL; + vq->private = NULL; + vringh_init_iotlb(&vq->vring, vdpasim_features, VDPASIM_QUEUE_MAX, + false, NULL, NULL, NULL); +} + +static void vdpasim_reset(struct vdpasim *vdpasim) +{ + int i; + + for (i = 0; i < VDPASIM_VQ_NUM; i++) + vdpasim_vq_reset(&vdpasim->vqs[i]); + + vhost_iotlb_reset(vdpasim->iommu); + + vdpasim->features = 0; + vdpasim->status = 0; + ++vdpasim->generation; +} + +static void vdpasim_work(struct work_struct *work) +{ + struct vdpasim *vdpasim = container_of(work, struct + vdpasim, work); + struct vdpasim_virtqueue *txq = &vdpasim->vqs[1]; + struct vdpasim_virtqueue *rxq = &vdpasim->vqs[0]; + size_t read, write, total_write; + int err; + int pkts = 0; + + spin_lock(&vdpasim->lock); + + if (!(vdpasim->status & VIRTIO_CONFIG_S_DRIVER_OK)) + goto out; + + if (!txq->ready || !rxq->ready) + goto out; + + while (true) { + total_write = 0; + err = vringh_getdesc_iotlb(&txq->vring, &txq->iov, NULL, + &txq->head, GFP_ATOMIC); + if (err <= 0) + break; + + err = vringh_getdesc_iotlb(&rxq->vring, NULL, &rxq->iov, + &rxq->head, GFP_ATOMIC); + if (err <= 0) { + vringh_complete_iotlb(&txq->vring, txq->head, 0); + break; + } + + while (true) { + read = vringh_iov_pull_iotlb(&txq->vring, &txq->iov, + vdpasim->buffer, + PAGE_SIZE); + if (read <= 0) + break; + + write = vringh_iov_push_iotlb(&rxq->vring, &rxq->iov, + vdpasim->buffer, read); + if (write <= 0) + break; + + total_write += 
write; + } + + /* Make sure data is wrote before advancing index */ + smp_wmb(); + + vringh_complete_iotlb(&txq->vring, txq->head, 0); + vringh_complete_iotlb(&rxq->vring, rxq->head, total_write); + + /* Make sure used is visible before rasing the interrupt. */ + smp_wmb(); + + local_bh_disable(); + if (txq->cb) + txq->cb(txq->private); + if (rxq->cb) + rxq->cb(rxq->private); + local_bh_enable(); + + if (++pkts > 4) { + schedule_work(&vdpasim->work); + goto out; + } + } + +out: + spin_unlock(&vdpasim->lock); +} + +static int dir_to_perm(enum dma_data_direction dir) +{ + int perm = -EFAULT; + + switch (dir) { + case DMA_FROM_DEVICE: + perm = VHOST_MAP_WO; + break; + case DMA_TO_DEVICE: + perm = VHOST_MAP_RO; + break; + case DMA_BIDIRECTIONAL: + perm = VHOST_MAP_RW; + break; + default: + break; + } + + return perm; +} + +static dma_addr_t vdpasim_map_page(struct device *dev, struct page *page, + unsigned long offset, size_t size, + enum dma_data_direction dir, + unsigned long attrs) +{ + struct vdpasim *vdpasim = dev_to_sim(dev); + struct vhost_iotlb *iommu = vdpasim->iommu; + u64 pa = (page_to_pfn(page) << PAGE_SHIFT) + offset; + int ret, perm = dir_to_perm(dir); + + if (perm < 0) + return DMA_MAPPING_ERROR; + + /* For simplicity, use identical mapping to avoid e.g iova + * allocator. + */ + ret = vhost_iotlb_add_range(iommu, pa, pa + size - 1, + pa, dir_to_perm(dir)); + if (ret) + return DMA_MAPPING_ERROR; + + return (dma_addr_t)(pa); +} + +static void vdpasim_unmap_page(struct device *dev, dma_addr_t dma_addr, + size_t size, enum dma_data_direction dir, + unsigned long attrs) +{ + struct vdpasim *vdpasim = dev_to_sim(dev); + struct vhost_iotlb *iommu = vdpasim->iommu; + + vhost_iotlb_del_range(iommu, (u64)dma_addr, + (u64)dma_addr + size - 1); +} + +static void *vdpasim_alloc_coherent(struct device *dev, size_t size, + dma_addr_t *dma_addr, gfp_t flag, + unsigned long attrs) +{ + struct vdpasim *vdpasim = dev_to_sim(dev); + struct vhost_iotlb *iommu = vdpasim->iommu; + void *addr = kmalloc(size, flag); + int ret; + + if (!addr) + *dma_addr = DMA_MAPPING_ERROR; + else { + u64 pa = virt_to_phys(addr); + + ret = vhost_iotlb_add_range(iommu, (u64)pa, + (u64)pa + size - 1, + pa, VHOST_MAP_RW); + if (ret) { + *dma_addr = DMA_MAPPING_ERROR; + kfree(addr); + addr = NULL; + } else + *dma_addr = (dma_addr_t)pa; + } + + return addr; +} + +static void vdpasim_free_coherent(struct device *dev, size_t size, + void *vaddr, dma_addr_t dma_addr, + unsigned long attrs) +{ + struct vdpasim *vdpasim = dev_to_sim(dev); + struct vhost_iotlb *iommu = vdpasim->iommu; + + vhost_iotlb_del_range(iommu, (u64)dma_addr, + (u64)dma_addr + size - 1); + kfree(phys_to_virt((uintptr_t)dma_addr)); +} + +static const struct dma_map_ops vdpasim_dma_ops = { + .map_page = vdpasim_map_page, + .unmap_page = vdpasim_unmap_page, + .alloc = vdpasim_alloc_coherent, + .free = vdpasim_free_coherent, +}; + +static const struct vdpa_config_ops vdpasim_net_config_ops; + +static struct vdpasim *vdpasim_create(void) +{ + struct virtio_net_config *config; + struct vdpasim *vdpasim; + struct device *dev; + int ret = -ENOMEM; + + vdpasim = vdpa_alloc_device(vdpasim, vdpa, NULL, + &vdpasim_net_config_ops); + if (!vdpasim) + goto err_alloc; + + INIT_WORK(&vdpasim->work, vdpasim_work); + spin_lock_init(&vdpasim->lock); + + dev = &vdpasim->vdpa.dev; + dev->coherent_dma_mask = DMA_BIT_MASK(64); + set_dma_ops(dev, &vdpasim_dma_ops); + + vdpasim->iommu = vhost_iotlb_alloc(2048, 0); + if (!vdpasim->iommu) + goto err_iommu; + + vdpasim->buffer = 
kmalloc(PAGE_SIZE, GFP_KERNEL); + if (!vdpasim->buffer) + goto err_iommu; + + config = &vdpasim->config; + config->mtu = 1500; + config->status = VIRTIO_NET_S_LINK_UP; + eth_random_addr(config->mac); + + vringh_set_iotlb(&vdpasim->vqs[0].vring, vdpasim->iommu); + vringh_set_iotlb(&vdpasim->vqs[1].vring, vdpasim->iommu); + + vdpasim->vdpa.dma_dev = dev; + ret = vdpa_register_device(&vdpasim->vdpa); + if (ret) + goto err_iommu; + + return vdpasim; + +err_iommu: + put_device(dev); +err_alloc: + return ERR_PTR(ret); +} + +static int vdpasim_set_vq_address(struct vdpa_device *vdpa, u16 idx, + u64 desc_area, u64 driver_area, + u64 device_area) +{ + struct vdpasim *vdpasim = vdpa_to_sim(vdpa); + struct vdpasim_virtqueue *vq = &vdpasim->vqs[idx]; + + vq->desc_addr = desc_area; + vq->driver_addr = driver_area; + vq->device_addr = device_area; + + return 0; +} + +static void vdpasim_set_vq_num(struct vdpa_device *vdpa, u16 idx, u32 num) +{ + struct vdpasim *vdpasim = vdpa_to_sim(vdpa); + struct vdpasim_virtqueue *vq = &vdpasim->vqs[idx]; + + vq->num = num; +} + +static void vdpasim_kick_vq(struct vdpa_device *vdpa, u16 idx) +{ + struct vdpasim *vdpasim = vdpa_to_sim(vdpa); + struct vdpasim_virtqueue *vq = &vdpasim->vqs[idx]; + + if (vq->ready) + schedule_work(&vdpasim->work); +} + +static void vdpasim_set_vq_cb(struct vdpa_device *vdpa, u16 idx, + struct vdpa_callback *cb) +{ + struct vdpasim *vdpasim = vdpa_to_sim(vdpa); + struct vdpasim_virtqueue *vq = &vdpasim->vqs[idx]; + + vq->cb = cb->callback; + vq->private = cb->private; +} + +static void vdpasim_set_vq_ready(struct vdpa_device *vdpa, u16 idx, bool ready) +{ + struct vdpasim *vdpasim = vdpa_to_sim(vdpa); + struct vdpasim_virtqueue *vq = &vdpasim->vqs[idx]; + + spin_lock(&vdpasim->lock); + vq->ready = ready; + if (vq->ready) + vdpasim_queue_ready(vdpasim, idx); + spin_unlock(&vdpasim->lock); +} + +static bool vdpasim_get_vq_ready(struct vdpa_device *vdpa, u16 idx) +{ + struct vdpasim *vdpasim = vdpa_to_sim(vdpa); + struct vdpasim_virtqueue *vq = &vdpasim->vqs[idx]; + + return vq->ready; +} + +static int vdpasim_set_vq_state(struct vdpa_device *vdpa, u16 idx, u64 state) +{ + struct vdpasim *vdpasim = vdpa_to_sim(vdpa); + struct vdpasim_virtqueue *vq = &vdpasim->vqs[idx]; + struct vringh *vrh = &vq->vring; + + spin_lock(&vdpasim->lock); + vrh->last_avail_idx = state; + spin_unlock(&vdpasim->lock); + + return 0; +} + +static u64 vdpasim_get_vq_state(struct vdpa_device *vdpa, u16 idx) +{ + struct vdpasim *vdpasim = vdpa_to_sim(vdpa); + struct vdpasim_virtqueue *vq = &vdpasim->vqs[idx]; + struct vringh *vrh = &vq->vring; + + return vrh->last_avail_idx; +} + +static u16 vdpasim_get_vq_align(struct vdpa_device *vdpa) +{ + return VDPASIM_QUEUE_ALIGN; +} + +static u64 vdpasim_get_features(struct vdpa_device *vdpa) +{ + return vdpasim_features; +} + +static int vdpasim_set_features(struct vdpa_device *vdpa, u64 features) +{ + struct vdpasim *vdpasim = vdpa_to_sim(vdpa); + + /* DMA mapping must be done by driver */ + if (!(features & (1ULL << VIRTIO_F_IOMMU_PLATFORM))) + return -EINVAL; + + vdpasim->features = features & vdpasim_features; + + return 0; +} + +static void vdpasim_set_config_cb(struct vdpa_device *vdpa, + struct vdpa_callback *cb) +{ + /* We don't support config interrupt */ +} + +static u16 vdpasim_get_vq_num_max(struct vdpa_device *vdpa) +{ + return VDPASIM_QUEUE_MAX; +} + +static u32 vdpasim_get_device_id(struct vdpa_device *vdpa) +{ + return VDPASIM_DEVICE_ID; +} + +static u32 vdpasim_get_vendor_id(struct vdpa_device *vdpa) +{ + return 
VDPASIM_VENDOR_ID; +} + +static u8 vdpasim_get_status(struct vdpa_device *vdpa) +{ + struct vdpasim *vdpasim = vdpa_to_sim(vdpa); + u8 status; + + spin_lock(&vdpasim->lock); + status = vdpasim->status; + spin_unlock(&vdpasim->lock); + + return status; +} + +static void vdpasim_set_status(struct vdpa_device *vdpa, u8 status) +{ + struct vdpasim *vdpasim = vdpa_to_sim(vdpa); + + spin_lock(&vdpasim->lock); + vdpasim->status = status; + if (status == 0) + vdpasim_reset(vdpasim); + spin_unlock(&vdpasim->lock); +} + +static void vdpasim_get_config(struct vdpa_device *vdpa, unsigned int offset, + void *buf, unsigned int len) +{ + struct vdpasim *vdpasim = vdpa_to_sim(vdpa); + + if (offset + len <= sizeof(struct virtio_net_config)) + memcpy(buf, (u8 *)&vdpasim->config + offset, len); +} + +static void vdpasim_set_config(struct vdpa_device *vdpa, unsigned int offset, + const void *buf, unsigned int len) +{ + /* No writable config supported by vdpasim */ +} + +static u32 vdpasim_get_generation(struct vdpa_device *vdpa) +{ + struct vdpasim *vdpasim = vdpa_to_sim(vdpa); + + return vdpasim->generation; +} + +static int vdpasim_set_map(struct vdpa_device *vdpa, + struct vhost_iotlb *iotlb) +{ + struct vdpasim *vdpasim = vdpa_to_sim(vdpa); + struct vhost_iotlb_map *map; + u64 start = 0ULL, last = 0ULL - 1; + int ret; + + vhost_iotlb_reset(vdpasim->iommu); + + for (map = vhost_iotlb_itree_first(iotlb, start, last); map; + map = vhost_iotlb_itree_next(map, start, last)) { + ret = vhost_iotlb_add_range(vdpasim->iommu, map->start, + map->last, map->addr, map->perm); + if (ret) + goto err; + } + return 0; + +err: + vhost_iotlb_reset(vdpasim->iommu); + return ret; +} + +static int vdpasim_dma_map(struct vdpa_device *vdpa, u64 iova, u64 size, + u64 pa, u32 perm) +{ + struct vdpasim *vdpasim = vdpa_to_sim(vdpa); + + return vhost_iotlb_add_range(vdpasim->iommu, iova, + iova + size - 1, pa, perm); +} + +static int vdpasim_dma_unmap(struct vdpa_device *vdpa, u64 iova, u64 size) +{ + struct vdpasim *vdpasim = vdpa_to_sim(vdpa); + + vhost_iotlb_del_range(vdpasim->iommu, iova, iova + size - 1); + + return 0; +} + +static void vdpasim_free(struct vdpa_device *vdpa) +{ + struct vdpasim *vdpasim = vdpa_to_sim(vdpa); + + cancel_work_sync(&vdpasim->work); + kfree(vdpasim->buffer); + if (vdpasim->iommu) + vhost_iotlb_free(vdpasim->iommu); +} + +static const struct vdpa_config_ops vdpasim_net_config_ops = { + .set_vq_address = vdpasim_set_vq_address, + .set_vq_num = vdpasim_set_vq_num, + .kick_vq = vdpasim_kick_vq, + .set_vq_cb = vdpasim_set_vq_cb, + .set_vq_ready = vdpasim_set_vq_ready, + .get_vq_ready = vdpasim_get_vq_ready, + .set_vq_state = vdpasim_set_vq_state, + .get_vq_state = vdpasim_get_vq_state, + .get_vq_align = vdpasim_get_vq_align, + .get_features = vdpasim_get_features, + .set_features = vdpasim_set_features, + .set_config_cb = vdpasim_set_config_cb, + .get_vq_num_max = vdpasim_get_vq_num_max, + .get_device_id = vdpasim_get_device_id, + .get_vendor_id = vdpasim_get_vendor_id, + .get_status = vdpasim_get_status, + .set_status = vdpasim_set_status, + .get_config = vdpasim_get_config, + .set_config = vdpasim_set_config, + .get_generation = vdpasim_get_generation, + .set_map = vdpasim_set_map, + .dma_map = vdpasim_dma_map, + .dma_unmap = vdpasim_dma_unmap, + .free = vdpasim_free, +}; + +static int __init vdpasim_dev_init(void) +{ + vdpasim_dev = vdpasim_create(); + + if (!IS_ERR(vdpasim_dev)) + return 0; + + return PTR_ERR(vdpasim_dev); +} + +static void __exit vdpasim_dev_exit(void) +{ + struct vdpa_device 
*vdpa = &vdpasim_dev->vdpa; + + vdpa_unregister_device(vdpa); +} + +module_init(vdpasim_dev_init) +module_exit(vdpasim_dev_exit) + +MODULE_VERSION(DRV_VERSION); +MODULE_LICENSE(DRV_LICENSE); +MODULE_AUTHOR(DRV_AUTHOR); +MODULE_DESCRIPTION(DRV_DESC); From patchwork Wed Mar 25 08:27:11 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Jason Wang X-Patchwork-Id: 221899 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-6.9 required=3.0 tests=DKIMWL_WL_HIGH, DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, HEADER_FROM_DIFFERENT_DOMAINS, INCLUDES_PATCH, MAILING_LIST_MULTI, SIGNED_OFF_BY, SPF_HELO_NONE, SPF_PASS, URIBL_BLOCKED autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id B2E8BC18E5B for ; Wed, 25 Mar 2020 08:35:42 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.kernel.org (Postfix) with ESMTP id 6B41420714 for ; Wed, 25 Mar 2020 08:35:42 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (1024-bit key) header.d=redhat.com header.i=@redhat.com header.b="EpoBJhaP" Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1727384AbgCYIfl (ORCPT ); Wed, 25 Mar 2020 04:35:41 -0400 Received: from us-smtp-delivery-74.mimecast.com ([216.205.24.74]:51643 "EHLO us-smtp-delivery-74.mimecast.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1726262AbgCYIfl (ORCPT ); Wed, 25 Mar 2020 04:35:41 -0400 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1585125338; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=9BCm20/LhM7ziXmPh/LD14teqrl2A8mOAUU8oXvomwc=; b=EpoBJhaPGq9Z4d+/uCxjXZ4MDLChQcaCE3pKvisJo5kf87XYevCHXzes9GGx7Ef/AGwwCc vHRCxASgwIJokCDqfcGmpoHT4kMsZu+ZSseAXKkRJJOUNdLfB9W9wDXvXGVve13vFRCNsp E8Jll/761TDuerNU24axIglBPuNVeKg= Received: from mimecast-mx01.redhat.com (mimecast-mx01.redhat.com [209.132.183.4]) (Using TLS) by relay.mimecast.com with ESMTP id us-mta-391-WFlotYpcPISjdeC973gvyQ-1; Wed, 25 Mar 2020 04:35:35 -0400 X-MC-Unique: WFlotYpcPISjdeC973gvyQ-1 Received: from smtp.corp.redhat.com (int-mx08.intmail.prod.int.phx2.redhat.com [10.5.11.23]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mimecast-mx01.redhat.com (Postfix) with ESMTPS id 4A5598010C8; Wed, 25 Mar 2020 08:35:32 +0000 (UTC) Received: from jason-ThinkPad-X1-Carbon-6th.redhat.com (ovpn-14-13.pek2.redhat.com [10.72.14.13]) by smtp.corp.redhat.com (Postfix) with ESMTP id 3EBC2A7F1; Wed, 25 Mar 2020 08:34:51 +0000 (UTC) From: Jason Wang To: mst@redhat.com, linux-kernel@vger.kernel.org, kvm@vger.kernel.org, virtualization@lists.linux-foundation.org, netdev@vger.kernel.org Cc: jgg@mellanox.com, maxime.coquelin@redhat.com, cunming.liang@intel.com, zhihong.wang@intel.com, rob.miller@broadcom.com, xiao.w.wang@intel.com, lingshan.zhu@intel.com, eperezma@redhat.com, lulu@redhat.com, parav@mellanox.com, kevin.tian@intel.com, stefanha@redhat.com, rdunlap@infradead.org, hch@infradead.org, aadam@redhat.com, jiri@mellanox.com, shahafs@mellanox.com, hanand@xilinx.com, mhabets@solarflare.com, gdawar@xilinx.com, saugatm@xilinx.com, 
vmireyno@marvell.com, Bie Tiwei , Jason Wang Subject: [PATCH V8 9/9] virtio: Intel IFC VF driver for VDPA Date: Wed, 25 Mar 2020 16:27:11 +0800 Message-Id: <20200325082711.1107-10-jasowang@redhat.com> In-Reply-To: <20200325082711.1107-1-jasowang@redhat.com> References: <20200325082711.1107-1-jasowang@redhat.com> MIME-Version: 1.0 X-Scanned-By: MIMEDefang 2.84 on 10.5.11.23 Sender: netdev-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: netdev@vger.kernel.org From: Zhu Lingshan This commit introduced two layers to drive IFC VF: (1) ifcvf_base layer, which handles IFC VF NIC hardware operations and configurations. (2) ifcvf_main layer, which complies to VDPA bus framework, implemented device operations for VDPA bus, handles device probe, bus attaching, vring operations, etc. Signed-off-by: Zhu Lingshan Signed-off-by: Bie Tiwei Signed-off-by: Wang Xiao Signed-off-by: Jason Wang --- drivers/virtio/vdpa/Kconfig | 11 + drivers/virtio/vdpa/Makefile | 1 + drivers/virtio/vdpa/ifcvf/Makefile | 3 + drivers/virtio/vdpa/ifcvf/ifcvf_base.c | 386 ++++++++++++++++++++ drivers/virtio/vdpa/ifcvf/ifcvf_base.h | 132 +++++++ drivers/virtio/vdpa/ifcvf/ifcvf_main.c | 474 +++++++++++++++++++++++++ 6 files changed, 1007 insertions(+) create mode 100644 drivers/virtio/vdpa/ifcvf/Makefile create mode 100644 drivers/virtio/vdpa/ifcvf/ifcvf_base.c create mode 100644 drivers/virtio/vdpa/ifcvf/ifcvf_base.h create mode 100644 drivers/virtio/vdpa/ifcvf/ifcvf_main.c diff --git a/drivers/virtio/vdpa/Kconfig b/drivers/virtio/vdpa/Kconfig index ae204465964e..7db1460104b7 100644 --- a/drivers/virtio/vdpa/Kconfig +++ b/drivers/virtio/vdpa/Kconfig @@ -23,4 +23,15 @@ config VDPA_SIM to RX. This device is used for testing, prototyping and development of vDPA. +config IFCVF + tristate "Intel IFC VF VDPA driver" + depends on PCI_MSI + select VDPA + default n + help + This kernel module can drive Intel IFC VF NIC to offload + virtio dataplane traffic to hardware. + To compile this driver as a module, choose M here: the module will + be called ifcvf. + endif # VDPA_MENU diff --git a/drivers/virtio/vdpa/Makefile b/drivers/virtio/vdpa/Makefile index 3814af8e097b..8bbb686ca7a2 100644 --- a/drivers/virtio/vdpa/Makefile +++ b/drivers/virtio/vdpa/Makefile @@ -1,3 +1,4 @@ # SPDX-License-Identifier: GPL-2.0 obj-$(CONFIG_VDPA) += vdpa.o obj-$(CONFIG_VDPA_SIM) += vdpa_sim/ +obj-$(CONFIG_IFCVF) += ifcvf/ diff --git a/drivers/virtio/vdpa/ifcvf/Makefile b/drivers/virtio/vdpa/ifcvf/Makefile new file mode 100644 index 000000000000..d709915995ab --- /dev/null +++ b/drivers/virtio/vdpa/ifcvf/Makefile @@ -0,0 +1,3 @@ +# SPDX-License-Identifier: GPL-2.0 +obj-$(CONFIG_IFCVF) += ifcvf.o +ifcvf-$(CONFIG_IFCVF) += ifcvf_main.o ifcvf_base.o diff --git a/drivers/virtio/vdpa/ifcvf/ifcvf_base.c b/drivers/virtio/vdpa/ifcvf/ifcvf_base.c new file mode 100644 index 000000000000..630e0aad9d05 --- /dev/null +++ b/drivers/virtio/vdpa/ifcvf/ifcvf_base.c @@ -0,0 +1,386 @@ +// SPDX-License-Identifier: GPL-2.0-only +/* + * Intel IFC VF NIC driver for virtio dataplane offloading + * + * Copyright (C) 2020 Intel Corporation. 
+ * + * Author: Zhu Lingshan + * + */ + +#include "ifcvf_base.h" + +static inline u8 ifc_ioread8(u8 __iomem *addr) +{ + return ioread8(addr); +} +static inline u16 ifc_ioread16 (__le16 __iomem *addr) +{ + return ioread16(addr); +} + +static inline u32 ifc_ioread32(__le32 __iomem *addr) +{ + return ioread32(addr); +} + +static inline void ifc_iowrite8(u8 value, u8 __iomem *addr) +{ + iowrite8(value, addr); +} + +static inline void ifc_iowrite16(u16 value, __le16 __iomem *addr) +{ + iowrite16(value, addr); +} + +static inline void ifc_iowrite32(u32 value, __le32 __iomem *addr) +{ + iowrite32(value, addr); +} + +static void ifc_iowrite64_twopart(u64 val, + __le32 __iomem *lo, __le32 __iomem *hi) +{ + ifc_iowrite32((u32)val, lo); + ifc_iowrite32(val >> 32, hi); +} + +struct ifcvf_adapter *vf_to_adapter(struct ifcvf_hw *hw) +{ + return container_of(hw, struct ifcvf_adapter, vf); +} + +static void __iomem *get_cap_addr(struct ifcvf_hw *hw, + struct virtio_pci_cap *cap) +{ + struct ifcvf_adapter *ifcvf; + u32 length, offset; + u8 bar; + + length = le32_to_cpu(cap->length); + offset = le32_to_cpu(cap->offset); + bar = cap->bar; + + ifcvf = vf_to_adapter(hw); + if (bar >= IFCVF_PCI_MAX_RESOURCE) { + IFCVF_DBG(ifcvf->dev, + "Invalid bar number %u to get capabilities\n", bar); + return NULL; + } + + if (offset + length > hw->mem_resource[bar].len) { + IFCVF_DBG(ifcvf->dev, + "offset(%u) + len(%u) overflows bar%u to get capabilities\n", + offset, length, bar); + return NULL; + } + + return hw->mem_resource[bar].addr + offset; +} + +static int ifcvf_read_config_range(struct pci_dev *dev, + uint32_t *val, int size, int where) +{ + int ret, i; + + for (i = 0; i < size; i += 4) { + ret = pci_read_config_dword(dev, where + i, val + i / 4); + if (ret < 0) + return ret; + } + + return 0; +} + +int ifcvf_init_hw(struct ifcvf_hw *hw, struct pci_dev *dev) +{ + struct virtio_pci_cap cap; + u16 notify_off; + int ret; + u8 pos; + u32 i; + + ret = pci_read_config_byte(dev, PCI_CAPABILITY_LIST, &pos); + if (ret < 0) { + IFCVF_ERR(&dev->dev, "Failed to read PCI capability list\n"); + return -EIO; + } + + while (pos) { + ret = ifcvf_read_config_range(dev, (u32 *)&cap, + sizeof(cap), pos); + if (ret < 0) { + IFCVF_ERR(&dev->dev, + "Failed to get PCI capability at %x\n", pos); + break; + } + + if (cap.cap_vndr != PCI_CAP_ID_VNDR) + goto next; + + switch (cap.cfg_type) { + case VIRTIO_PCI_CAP_COMMON_CFG: + hw->common_cfg = get_cap_addr(hw, &cap); + IFCVF_DBG(&dev->dev, "hw->common_cfg = %p\n", + hw->common_cfg); + break; + case VIRTIO_PCI_CAP_NOTIFY_CFG: + pci_read_config_dword(dev, pos + sizeof(cap), + &hw->notify_off_multiplier); + hw->notify_bar = cap.bar; + hw->notify_base = get_cap_addr(hw, &cap); + IFCVF_DBG(&dev->dev, "hw->notify_base = %p\n", + hw->notify_base); + break; + case VIRTIO_PCI_CAP_ISR_CFG: + hw->isr = get_cap_addr(hw, &cap); + IFCVF_DBG(&dev->dev, "hw->isr = %p\n", hw->isr); + break; + case VIRTIO_PCI_CAP_DEVICE_CFG: + hw->net_cfg = get_cap_addr(hw, &cap); + IFCVF_DBG(&dev->dev, "hw->net_cfg = %p\n", hw->net_cfg); + break; + } + +next: + pos = cap.cap_next; + } + + if (hw->common_cfg == NULL || hw->notify_base == NULL || + hw->isr == NULL || hw->net_cfg == NULL) { + IFCVF_ERR(&dev->dev, "Incomplete PCI capabilities\n"); + return -EIO; + } + + for (i = 0; i < IFCVF_MAX_QUEUE_PAIRS * 2; i++) { + ifc_iowrite16(i, &hw->common_cfg->queue_select); + notify_off = ifc_ioread16(&hw->common_cfg->queue_notify_off); + hw->vring[i].notify_addr = hw->notify_base + + notify_off * hw->notify_off_multiplier; + } + + 
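+	/*
+	 * The doorbell addresses cached above follow the virtio-pci modern
+	 * layout: notify_base plus queue_notify_off scaled by
+	 * notify_off_multiplier. The live-migration config area below lives
+	 * in BAR4; ifcvf_get_vq_state() and ifcvf_set_vq_state() use it to
+	 * read and write each ring's last_avail_idx.
+	 */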
hw->lm_cfg = hw->mem_resource[IFCVF_LM_BAR].addr; + + IFCVF_DBG(&dev->dev, + "PCI capability mapping: common cfg: %p, notify base: %p\n, isr cfg: %p, device cfg: %p, multiplier: %u\n", + hw->common_cfg, hw->notify_base, hw->isr, + hw->net_cfg, hw->notify_off_multiplier); + + return 0; +} + +u8 ifcvf_get_status(struct ifcvf_hw *hw) +{ + return ifc_ioread8(&hw->common_cfg->device_status); +} + +void ifcvf_set_status(struct ifcvf_hw *hw, u8 status) +{ + ifc_iowrite8(status, &hw->common_cfg->device_status); +} + +void ifcvf_reset(struct ifcvf_hw *hw) +{ + ifcvf_set_status(hw, 0); + /* flush set_status, make sure VF is stopped, reset */ + ifcvf_get_status(hw); +} + +static void ifcvf_add_status(struct ifcvf_hw *hw, u8 status) +{ + if (status != 0) + status |= ifcvf_get_status(hw); + + ifcvf_set_status(hw, status); + ifcvf_get_status(hw); +} + +u64 ifcvf_get_features(struct ifcvf_hw *hw) +{ + struct virtio_pci_common_cfg __iomem *cfg = hw->common_cfg; + u32 features_lo, features_hi; + + ifc_iowrite32(0, &cfg->device_feature_select); + features_lo = ifc_ioread32(&cfg->device_feature); + + ifc_iowrite32(1, &cfg->device_feature_select); + features_hi = ifc_ioread32(&cfg->device_feature); + + return ((u64)features_hi << 32) | features_lo; +} + +void ifcvf_read_net_config(struct ifcvf_hw *hw, u64 offset, + void *dst, int length) +{ + u8 old_gen, new_gen, *p; + int i; + + WARN_ON(offset + length > sizeof(struct virtio_net_config)); + do { + old_gen = ifc_ioread8(&hw->common_cfg->config_generation); + p = dst; + for (i = 0; i < length; i++) + *p++ = ifc_ioread8(hw->net_cfg + offset + i); + + new_gen = ifc_ioread8(&hw->common_cfg->config_generation); + } while (old_gen != new_gen); +} + +void ifcvf_write_net_config(struct ifcvf_hw *hw, u64 offset, + const void *src, int length) +{ + const u8 *p; + int i; + + p = src; + WARN_ON(offset + length > sizeof(struct virtio_net_config)); + for (i = 0; i < length; i++) + ifc_iowrite8(*p++, hw->net_cfg + offset + i); +} + +static void ifcvf_set_features(struct ifcvf_hw *hw, u64 features) +{ + struct virtio_pci_common_cfg __iomem *cfg = hw->common_cfg; + + ifc_iowrite32(0, &cfg->guest_feature_select); + ifc_iowrite32((u32)features, &cfg->guest_feature); + + ifc_iowrite32(1, &cfg->guest_feature_select); + ifc_iowrite32(features >> 32, &cfg->guest_feature); +} + +static int ifcvf_config_features(struct ifcvf_hw *hw) +{ + struct ifcvf_adapter *ifcvf; + + ifcvf = vf_to_adapter(hw); + ifcvf_set_features(hw, hw->req_features); + ifcvf_add_status(hw, VIRTIO_CONFIG_S_FEATURES_OK); + + if (!(ifcvf_get_status(hw) & VIRTIO_CONFIG_S_FEATURES_OK)) { + IFCVF_ERR(ifcvf->dev, "Failed to set FEATURES_OK status\n"); + return -EIO; + } + + return 0; +} + +u64 ifcvf_get_vq_state(struct ifcvf_hw *hw, u16 qid) +{ + struct ifcvf_lm_cfg __iomem *ifcvf_lm; + void __iomem *avail_idx_addr; + u16 last_avail_idx; + u32 q_pair_id; + + ifcvf_lm = (struct ifcvf_lm_cfg __iomem *)hw->lm_cfg; + q_pair_id = qid / (IFCVF_MAX_QUEUE_PAIRS * 2); + avail_idx_addr = &ifcvf_lm->vring_lm_cfg[q_pair_id].idx_addr[qid % 2]; + last_avail_idx = ifc_ioread16(avail_idx_addr); + + return last_avail_idx; +} + +int ifcvf_set_vq_state(struct ifcvf_hw *hw, u16 qid, u64 num) +{ + struct ifcvf_lm_cfg __iomem *ifcvf_lm; + void __iomem *avail_idx_addr; + u32 q_pair_id; + + ifcvf_lm = (struct ifcvf_lm_cfg __iomem *)hw->lm_cfg; + q_pair_id = qid / (IFCVF_MAX_QUEUE_PAIRS * 2); + avail_idx_addr = &ifcvf_lm->vring_lm_cfg[q_pair_id].idx_addr[qid % 2]; + hw->vring[qid].last_avail_idx = num; + ifc_iowrite16(num, avail_idx_addr); + 
+ return 0; +} + +static int ifcvf_hw_enable(struct ifcvf_hw *hw) +{ + struct ifcvf_lm_cfg __iomem *ifcvf_lm; + struct virtio_pci_common_cfg __iomem *cfg; + struct ifcvf_adapter *ifcvf; + u32 i; + + ifcvf_lm = (struct ifcvf_lm_cfg __iomem *)hw->lm_cfg; + ifcvf = vf_to_adapter(hw); + cfg = hw->common_cfg; + ifc_iowrite16(IFCVF_MSI_CONFIG_OFF, &cfg->msix_config); + + if (ifc_ioread16(&cfg->msix_config) == VIRTIO_MSI_NO_VECTOR) { + IFCVF_ERR(ifcvf->dev, "No msix vector for device config\n"); + return -EINVAL; + } + + for (i = 0; i < hw->nr_vring; i++) { + if (!hw->vring[i].ready) + break; + + ifc_iowrite16(i, &cfg->queue_select); + ifc_iowrite64_twopart(hw->vring[i].desc, &cfg->queue_desc_lo, + &cfg->queue_desc_hi); + ifc_iowrite64_twopart(hw->vring[i].avail, &cfg->queue_avail_lo, + &cfg->queue_avail_hi); + ifc_iowrite64_twopart(hw->vring[i].used, &cfg->queue_used_lo, + &cfg->queue_used_hi); + ifc_iowrite16(hw->vring[i].size, &cfg->queue_size); + ifc_iowrite16(i + IFCVF_MSI_QUEUE_OFF, &cfg->queue_msix_vector); + + if (ifc_ioread16(&cfg->queue_msix_vector) == + VIRTIO_MSI_NO_VECTOR) { + IFCVF_ERR(ifcvf->dev, + "No msix vector for queue %u\n", i); + return -EINVAL; + } + + ifcvf_set_vq_state(hw, i, hw->vring[i].last_avail_idx); + ifc_iowrite16(1, &cfg->queue_enable); + } + + return 0; +} + +static void ifcvf_hw_disable(struct ifcvf_hw *hw) +{ + struct virtio_pci_common_cfg __iomem *cfg; + u32 i; + + cfg = hw->common_cfg; + ifc_iowrite16(VIRTIO_MSI_NO_VECTOR, &cfg->msix_config); + + for (i = 0; i < hw->nr_vring; i++) { + ifc_iowrite16(i, &cfg->queue_select); + ifc_iowrite16(VIRTIO_MSI_NO_VECTOR, &cfg->queue_msix_vector); + } + + ifc_ioread16(&cfg->queue_msix_vector); +} + +int ifcvf_start_hw(struct ifcvf_hw *hw) +{ + ifcvf_reset(hw); + ifcvf_add_status(hw, VIRTIO_CONFIG_S_ACKNOWLEDGE); + ifcvf_add_status(hw, VIRTIO_CONFIG_S_DRIVER); + + if (ifcvf_config_features(hw) < 0) + return -EINVAL; + + if (ifcvf_hw_enable(hw) < 0) + return -EINVAL; + + ifcvf_add_status(hw, VIRTIO_CONFIG_S_DRIVER_OK); + + return 0; +} + +void ifcvf_stop_hw(struct ifcvf_hw *hw) +{ + ifcvf_hw_disable(hw); + ifcvf_reset(hw); +} + +void ifcvf_notify_queue(struct ifcvf_hw *hw, u16 qid) +{ + ifc_iowrite16(qid, hw->vring[qid].notify_addr); +} diff --git a/drivers/virtio/vdpa/ifcvf/ifcvf_base.h b/drivers/virtio/vdpa/ifcvf/ifcvf_base.h new file mode 100644 index 000000000000..83bbdca5c089 --- /dev/null +++ b/drivers/virtio/vdpa/ifcvf/ifcvf_base.h @@ -0,0 +1,132 @@ +/* SPDX-License-Identifier: GPL-2.0-only */ +/* + * Intel IFC VF NIC driver for virtio dataplane offloading + * + * Copyright (C) 2020 Intel Corporation. + * + * Author: Zhu Lingshan + * + */ + +#ifndef _IFCVF_H_ +#define _IFCVF_H_ + +#include +#include +#include +#include +#include +#include + +#define IFCVF_VENDOR_ID 0x1AF4 +#define IFCVF_DEVICE_ID 0x1041 +#define IFCVF_SUBSYS_VENDOR_ID 0x8086 +#define IFCVF_SUBSYS_DEVICE_ID 0x001A + +#define IFCVF_SUPPORTED_FEATURES \ + ((1ULL << VIRTIO_NET_F_MAC) | \ + (1ULL << VIRTIO_F_ANY_LAYOUT) | \ + (1ULL << VIRTIO_F_VERSION_1) | \ + (1ULL << VIRTIO_F_ORDER_PLATFORM) | \ + (1ULL << VIRTIO_F_IOMMU_PLATFORM) | \ + (1ULL << VIRTIO_NET_F_MRG_RXBUF)) + +/* Only one queue pair for now. */ +#define IFCVF_MAX_QUEUE_PAIRS 1 + +#define IFCVF_QUEUE_ALIGNMENT PAGE_SIZE +#define IFCVF_QUEUE_MAX 32768 +#define IFCVF_MSI_CONFIG_OFF 0 +#define IFCVF_MSI_QUEUE_OFF 1 +#define IFCVF_PCI_MAX_RESOURCE 6 + +#define IFCVF_LM_CFG_SIZE 0x40 +#define IFCVF_LM_RING_STATE_OFFSET 0x20 +#define IFCVF_LM_BAR 4 + +#define IFCVF_ERR(dev, fmt, ...) 
dev_err(dev, fmt, ##__VA_ARGS__) +#define IFCVF_DBG(dev, fmt, ...) dev_dbg(dev, fmt, ##__VA_ARGS__) +#define IFCVF_INFO(dev, fmt, ...) dev_info(dev, fmt, ##__VA_ARGS__) + +#define ifcvf_private_to_vf(adapter) \ + (&((struct ifcvf_adapter *)adapter)->vf) + +#define IFCVF_MAX_INTR (IFCVF_MAX_QUEUE_PAIRS * 2 + 1) + +struct ifcvf_pci_mem_resource { + u64 phys_addr; + u64 len; + /* Virtual address, NULL when not mapped. */ + void __iomem *addr; +}; + +struct vring_info { + u64 desc; + u64 avail; + u64 used; + u16 size; + u16 last_avail_idx; + bool ready; + void __iomem *notify_addr; + u32 irq; + struct vdpa_callback cb; + char msix_name[256]; +}; + +struct ifcvf_hw { + u8 __iomem *isr; + /* Live migration */ + u8 __iomem *lm_cfg; + u16 nr_vring; + /* Notification bar number */ + u8 notify_bar; + /* Notificaiton bar address */ + void __iomem *notify_base; + u32 notify_off_multiplier; + u64 req_features; + struct virtio_pci_common_cfg __iomem *common_cfg; + void __iomem *net_cfg; + struct vring_info vring[IFCVF_MAX_QUEUE_PAIRS * 2]; + struct ifcvf_pci_mem_resource mem_resource[IFCVF_PCI_MAX_RESOURCE]; +}; + +struct ifcvf_adapter { + struct vdpa_device vdpa; + struct device *dev; + struct ifcvf_hw vf; +}; + +struct ifcvf_vring_lm_cfg { + u32 idx_addr[2]; + u8 reserved[IFCVF_LM_CFG_SIZE - 8]; +}; + +struct ifcvf_lm_cfg { + u8 reserved[IFCVF_LM_RING_STATE_OFFSET]; + struct ifcvf_vring_lm_cfg vring_lm_cfg[IFCVF_MAX_QUEUE_PAIRS]; +}; + +struct vdpa_ifcvf_dev { + struct class *vd_class; + struct idr vd_idr; + struct device dev; + struct kobject *devices_kobj; +}; + +int ifcvf_init_hw(struct ifcvf_hw *hw, struct pci_dev *dev); +int ifcvf_start_hw(struct ifcvf_hw *hw); +void ifcvf_stop_hw(struct ifcvf_hw *hw); +void ifcvf_notify_queue(struct ifcvf_hw *hw, u16 qid); +void ifcvf_read_net_config(struct ifcvf_hw *hw, u64 offset, + void *dst, int length); +void ifcvf_write_net_config(struct ifcvf_hw *hw, u64 offset, + const void *src, int length); +u8 ifcvf_get_status(struct ifcvf_hw *hw); +void ifcvf_set_status(struct ifcvf_hw *hw, u8 status); +void io_write64_twopart(u64 val, u32 *lo, u32 *hi); +void ifcvf_reset(struct ifcvf_hw *hw); +u64 ifcvf_get_features(struct ifcvf_hw *hw); +u64 ifcvf_get_vq_state(struct ifcvf_hw *hw, u16 qid); +int ifcvf_set_vq_state(struct ifcvf_hw *hw, u16 qid, u64 num); +struct ifcvf_adapter *vf_to_adapter(struct ifcvf_hw *hw); +#endif /* _IFCVF_H_ */ diff --git a/drivers/virtio/vdpa/ifcvf/ifcvf_main.c b/drivers/virtio/vdpa/ifcvf/ifcvf_main.c new file mode 100644 index 000000000000..157ac7ef49d6 --- /dev/null +++ b/drivers/virtio/vdpa/ifcvf/ifcvf_main.c @@ -0,0 +1,474 @@ +// SPDX-License-Identifier: GPL-2.0-only +/* + * Intel IFC VF NIC driver for virtio dataplane offloading + * + * Copyright (C) 2020 Intel Corporation. 
+ * + * Author: Zhu Lingshan + * + */ + +#include +#include +#include +#include +#include "ifcvf_base.h" + +#define VERSION_STRING "0.1" +#define DRIVER_AUTHOR "Intel Corporation" +#define IFCVF_DRIVER_NAME "ifcvf" + +static irqreturn_t ifcvf_intr_handler(int irq, void *arg) +{ + struct vring_info *vring = arg; + + if (vring->cb.callback) + return vring->cb.callback(vring->cb.private); + + return IRQ_HANDLED; +} + +static int ifcvf_start_datapath(void *private) +{ + struct ifcvf_hw *vf = ifcvf_private_to_vf(private); + struct ifcvf_adapter *ifcvf; + u8 status; + int ret; + + ifcvf = vf_to_adapter(vf); + vf->nr_vring = IFCVF_MAX_QUEUE_PAIRS * 2; + ret = ifcvf_start_hw(vf); + if (ret < 0) { + status = ifcvf_get_status(vf); + status |= VIRTIO_CONFIG_S_FAILED; + ifcvf_set_status(vf, status); + } + + return ret; +} + +static int ifcvf_stop_datapath(void *private) +{ + struct ifcvf_hw *vf = ifcvf_private_to_vf(private); + int i; + + for (i = 0; i < IFCVF_MAX_QUEUE_PAIRS * 2; i++) + vf->vring[i].cb.callback = NULL; + + ifcvf_stop_hw(vf); + + return 0; +} + +static void ifcvf_reset_vring(struct ifcvf_adapter *adapter) +{ + struct ifcvf_hw *vf = ifcvf_private_to_vf(adapter); + int i; + + for (i = 0; i < IFCVF_MAX_QUEUE_PAIRS * 2; i++) { + vf->vring[i].last_avail_idx = 0; + vf->vring[i].desc = 0; + vf->vring[i].avail = 0; + vf->vring[i].used = 0; + vf->vring[i].ready = 0; + vf->vring[i].cb.callback = NULL; + vf->vring[i].cb.private = NULL; + } + + ifcvf_reset(vf); +} + +static struct ifcvf_adapter *vdpa_to_adapter(struct vdpa_device *vdpa_dev) +{ + return container_of(vdpa_dev, struct ifcvf_adapter, vdpa); +} + +static struct ifcvf_hw *vdpa_to_vf(struct vdpa_device *vdpa_dev) +{ + struct ifcvf_adapter *adapter = vdpa_to_adapter(vdpa_dev); + + return &adapter->vf; +} + +static u64 ifcvf_vdpa_get_features(struct vdpa_device *vdpa_dev) +{ + struct ifcvf_hw *vf = vdpa_to_vf(vdpa_dev); + u64 features; + + features = ifcvf_get_features(vf) & IFCVF_SUPPORTED_FEATURES; + + return features; +} + +static int ifcvf_vdpa_set_features(struct vdpa_device *vdpa_dev, u64 features) +{ + struct ifcvf_hw *vf = vdpa_to_vf(vdpa_dev); + + vf->req_features = features; + + return 0; +} + +static u8 ifcvf_vdpa_get_status(struct vdpa_device *vdpa_dev) +{ + struct ifcvf_hw *vf = vdpa_to_vf(vdpa_dev); + + return ifcvf_get_status(vf); +} + +static void ifcvf_vdpa_set_status(struct vdpa_device *vdpa_dev, u8 status) +{ + struct ifcvf_adapter *adapter; + struct ifcvf_hw *vf; + + vf = vdpa_to_vf(vdpa_dev); + adapter = dev_get_drvdata(vdpa_dev->dev.parent); + + if (status == 0) { + ifcvf_stop_datapath(adapter); + ifcvf_reset_vring(adapter); + return; + } + + if (status & VIRTIO_CONFIG_S_DRIVER_OK) { + if (ifcvf_start_datapath(adapter) < 0) + IFCVF_ERR(adapter->dev, + "Failed to set ifcvf vdpa status %u\n", + status); + } + + ifcvf_set_status(vf, status); +} + +static u16 ifcvf_vdpa_get_vq_num_max(struct vdpa_device *vdpa_dev) +{ + return IFCVF_QUEUE_MAX; +} + +static u64 ifcvf_vdpa_get_vq_state(struct vdpa_device *vdpa_dev, u16 qid) +{ + struct ifcvf_hw *vf = vdpa_to_vf(vdpa_dev); + + return ifcvf_get_vq_state(vf, qid); +} + +static int ifcvf_vdpa_set_vq_state(struct vdpa_device *vdpa_dev, u16 qid, + u64 num) +{ + struct ifcvf_hw *vf = vdpa_to_vf(vdpa_dev); + + return ifcvf_set_vq_state(vf, qid, num); +} + +static void ifcvf_vdpa_set_vq_cb(struct vdpa_device *vdpa_dev, u16 qid, + struct vdpa_callback *cb) +{ + struct ifcvf_hw *vf = vdpa_to_vf(vdpa_dev); + + vf->vring[qid].cb = *cb; +} + +static void ifcvf_vdpa_set_vq_ready(struct 
vdpa_device *vdpa_dev, + u16 qid, bool ready) +{ + struct ifcvf_hw *vf = vdpa_to_vf(vdpa_dev); + + vf->vring[qid].ready = ready; +} + +static bool ifcvf_vdpa_get_vq_ready(struct vdpa_device *vdpa_dev, u16 qid) +{ + struct ifcvf_hw *vf = vdpa_to_vf(vdpa_dev); + + return vf->vring[qid].ready; +} + +static void ifcvf_vdpa_set_vq_num(struct vdpa_device *vdpa_dev, u16 qid, + u32 num) +{ + struct ifcvf_hw *vf = vdpa_to_vf(vdpa_dev); + + vf->vring[qid].size = num; +} + +static int ifcvf_vdpa_set_vq_address(struct vdpa_device *vdpa_dev, u16 qid, + u64 desc_area, u64 driver_area, + u64 device_area) +{ + struct ifcvf_hw *vf = vdpa_to_vf(vdpa_dev); + + vf->vring[qid].desc = desc_area; + vf->vring[qid].avail = driver_area; + vf->vring[qid].used = device_area; + + return 0; +} + +static void ifcvf_vdpa_kick_vq(struct vdpa_device *vdpa_dev, u16 qid) +{ + struct ifcvf_hw *vf = vdpa_to_vf(vdpa_dev); + + ifcvf_notify_queue(vf, qid); +} + +static u32 ifcvf_vdpa_get_generation(struct vdpa_device *vdpa_dev) +{ + struct ifcvf_hw *vf = vdpa_to_vf(vdpa_dev); + + return ioread8(&vf->common_cfg->config_generation); +} + +static u32 ifcvf_vdpa_get_device_id(struct vdpa_device *vdpa_dev) +{ + return VIRTIO_ID_NET; +} + +static u32 ifcvf_vdpa_get_vendor_id(struct vdpa_device *vdpa_dev) +{ + return IFCVF_SUBSYS_VENDOR_ID; +} + +static u16 ifcvf_vdpa_get_vq_align(struct vdpa_device *vdpa_dev) +{ + return IFCVF_QUEUE_ALIGNMENT; +} + +static void ifcvf_vdpa_get_config(struct vdpa_device *vdpa_dev, + unsigned int offset, + void *buf, unsigned int len) +{ + struct ifcvf_hw *vf = vdpa_to_vf(vdpa_dev); + + WARN_ON(offset + len > sizeof(struct virtio_net_config)); + ifcvf_read_net_config(vf, offset, buf, len); +} + +static void ifcvf_vdpa_set_config(struct vdpa_device *vdpa_dev, + unsigned int offset, const void *buf, + unsigned int len) +{ + struct ifcvf_hw *vf = vdpa_to_vf(vdpa_dev); + + WARN_ON(offset + len > sizeof(struct virtio_net_config)); + ifcvf_write_net_config(vf, offset, buf, len); +} + +static void ifcvf_vdpa_set_config_cb(struct vdpa_device *vdpa_dev, + struct vdpa_callback *cb) +{ + /* We don't support config interrupt */ +} + +static void ifcvf_free_irq(struct ifcvf_adapter *adapter) +{ + struct ifcvf_hw *vf = ifcvf_private_to_vf(adapter); + struct pci_dev *pdev = to_pci_dev(adapter->dev); + int i, vector, irq; + + for (i = 0; i < IFCVF_MAX_QUEUE_PAIRS * 2; i++) { + if (vf->vring[i].irq) { + vector = i + IFCVF_MSI_QUEUE_OFF; + irq = pci_irq_vector(pdev, vector); + free_irq(irq, &vf->vring[i]); + } + } +} + + +static void ifcvf_free(struct vdpa_device *vdpa_dev) +{ + struct ifcvf_adapter *adapter = vdpa_to_adapter(vdpa_dev); + struct pci_dev *pdev = to_pci_dev(adapter->dev); + struct ifcvf_hw *vf = &adapter->vf; + int i; + + for (i = 0; i < IFCVF_PCI_MAX_RESOURCE; i++) { + if (vf->mem_resource[i].addr) { + pci_iounmap(pdev, vf->mem_resource[i].addr); + vf->mem_resource[i].addr = NULL; + } + } + + ifcvf_free_irq(adapter); + pci_free_irq_vectors(pdev); + pci_release_regions(pdev); + pci_disable_device(pdev); +} + +/* + * IFCVF currently does't have on-chip IOMMU, so not + * implemented set_map()/dma_map()/dma_unmap() + */ +static const struct vdpa_config_ops ifc_vdpa_ops = { + .get_features = ifcvf_vdpa_get_features, + .set_features = ifcvf_vdpa_set_features, + .get_status = ifcvf_vdpa_get_status, + .set_status = ifcvf_vdpa_set_status, + .get_vq_num_max = ifcvf_vdpa_get_vq_num_max, + .get_vq_state = ifcvf_vdpa_get_vq_state, + .set_vq_state = ifcvf_vdpa_set_vq_state, + .set_vq_cb = ifcvf_vdpa_set_vq_cb, + 
.set_vq_ready = ifcvf_vdpa_set_vq_ready, + .get_vq_ready = ifcvf_vdpa_get_vq_ready, + .set_vq_num = ifcvf_vdpa_set_vq_num, + .set_vq_address = ifcvf_vdpa_set_vq_address, + .kick_vq = ifcvf_vdpa_kick_vq, + .get_generation = ifcvf_vdpa_get_generation, + .get_device_id = ifcvf_vdpa_get_device_id, + .get_vendor_id = ifcvf_vdpa_get_vendor_id, + .get_vq_align = ifcvf_vdpa_get_vq_align, + .get_config = ifcvf_vdpa_get_config, + .set_config = ifcvf_vdpa_set_config, + .set_config_cb = ifcvf_vdpa_set_config_cb, + .free = ifcvf_free, +}; + +static int ifcvf_request_irq(struct ifcvf_adapter *adapter) +{ + struct pci_dev *pdev = to_pci_dev(adapter->dev); + struct ifcvf_hw *vf = &adapter->vf; + int vector, i, ret, irq; + + + for (i = 0; i < IFCVF_MAX_QUEUE_PAIRS * 2; i++) { + snprintf(vf->vring[i].msix_name, 256, "ifcvf[%s]-%d\n", + pci_name(pdev), i); + vector = i + IFCVF_MSI_QUEUE_OFF; + irq = pci_irq_vector(pdev, vector); + ret = request_irq(irq, ifcvf_intr_handler, 0, + vf->vring[i].msix_name, &vf->vring[i]); + if (ret) { + IFCVF_ERR(adapter->dev, + "Failed to request irq for vq %d\n", i); + return ret; + } + vf->vring[i].irq = irq; + } + + return 0; +} + +static int ifcvf_probe(struct pci_dev *pdev, const struct pci_device_id *id) +{ + struct device *dev = &pdev->dev; + struct ifcvf_adapter *adapter; + struct ifcvf_hw *vf; + int ret, i; + + ret = pci_enable_device(pdev); + if (ret) { + IFCVF_ERR(&pdev->dev, "Failed to enable device\n"); + goto err_enable; + } + + ret = pci_request_regions(pdev, IFCVF_DRIVER_NAME); + if (ret) { + IFCVF_ERR(&pdev->dev, "Failed to request MMIO region\n"); + goto err_regions; + } + + ret = pci_alloc_irq_vectors(pdev, IFCVF_MAX_INTR, + IFCVF_MAX_INTR, PCI_IRQ_MSIX); + if (ret < 0) { + IFCVF_ERR(&pdev->dev, "Failed to alloc irq vectors\n"); + goto err_vectors; + } + + adapter = vdpa_alloc_device(ifcvf_adapter, vdpa, dev, &ifc_vdpa_ops); + if (adapter == NULL) { + IFCVF_ERR(&pdev->dev, "Failed to allocate vDPA structure"); + ret = -ENOMEM; + goto err_alloc; + } + + adapter->dev = dev; + pci_set_master(pdev); + pci_set_drvdata(pdev, adapter); + + ret = ifcvf_request_irq(adapter); + if (ret) { + IFCVF_ERR(&pdev->dev, "Failed to request MSI-X irq\n"); + goto err_msix; + } + + vf = &adapter->vf; + for (i = 0; i < IFCVF_PCI_MAX_RESOURCE; i++) { + vf->mem_resource[i].phys_addr = pci_resource_start(pdev, i); + vf->mem_resource[i].len = pci_resource_len(pdev, i); + if (!vf->mem_resource[i].len) + continue; + + vf->mem_resource[i].addr = pci_iomap_range(pdev, i, 0, + vf->mem_resource[i].len); + if (!vf->mem_resource[i].addr) { + IFCVF_ERR(&pdev->dev, + "Failed to map IO resource %d\n", i); + ret = -EINVAL; + goto err_msix; + } + } + + if (ifcvf_init_hw(vf, pdev) < 0) { + ret = -EINVAL; + goto err_msix; + } + + ret = dma_set_mask_and_coherent(&pdev->dev, DMA_BIT_MASK(64)); + if (ret) + ret = dma_set_mask_and_coherent(&pdev->dev, DMA_BIT_MASK(32)); + + if (ret) { + IFCVF_ERR(adapter->dev, "No usable DMA confiugration\n"); + ret = -EINVAL; + goto err_msix; + } + + adapter->vdpa.dma_dev = dev; + ret = vdpa_register_device(&adapter->vdpa); + if (ret) { + IFCVF_ERR(adapter->dev, "Failed to register ifcvf to vdpa bus"); + goto err_msix; + } + + return 0; + +err_msix: + put_device(&adapter->vdpa.dev); + return ret; +err_alloc: + pci_free_irq_vectors(pdev); +err_vectors: + pci_release_regions(pdev); +err_regions: + pci_disable_device(pdev); +err_enable: + return ret; +} + +static void ifcvf_remove(struct pci_dev *pdev) +{ + struct ifcvf_adapter *adapter = pci_get_drvdata(pdev); + + 
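+	/*
+	 * Once the vDPA core releases the device, the .free op (ifcvf_free)
+	 * unmaps the BARs, frees the per-vq IRQs and vectors, releases the
+	 * regions and disables the PCI device.
+	 */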
vdpa_unregister_device(&adapter->vdpa); +} + +static const struct pci_device_id ifcvf_pci_ids[] = { + { PCI_DEVICE_SUB(IFCVF_VENDOR_ID, + IFCVF_DEVICE_ID, + IFCVF_SUBSYS_VENDOR_ID, + IFCVF_SUBSYS_DEVICE_ID) }, + { 0 }, +}; +MODULE_DEVICE_TABLE(pci, ifcvf_pci_ids); + +static struct pci_driver ifcvf_driver = { + .name = IFCVF_DRIVER_NAME, + .id_table = ifcvf_pci_ids, + .probe = ifcvf_probe, + .remove = ifcvf_remove, +}; + +module_pci_driver(ifcvf_driver); + +MODULE_LICENSE("GPL v2"); +MODULE_VERSION(VERSION_STRING);