From patchwork Fri Oct 31 14:05:43 2014
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Auger Eric
X-Patchwork-Id: 39915
From: Eric Auger
To: eric.auger@st.com, christoffer.dall@linaro.org, qemu-devel@nongnu.org, agraf@suse.de, pbonzini@redhat.com, kim.phillips@freescale.com, a.rigo@virtualopensystems.com, manish.jaggi@caviumnetworks.com,
 joel.schopp@amd.com
Cc: eric.auger@linaro.org, kvmarm@lists.cs.columbia.edu, patches@linaro.org, alex.williamson@redhat.com, peter.maydell@linaro.org, will.deacon@arm.com, Bharat.Bhushan@freescale.com, stuart.yoder@freescale.com, a.motakis@virtualopensystems.com, Kim Phillips
Subject: [PATCH v7 09/16] hw/vfio/platform: add vfio-platform support
Date: Fri, 31 Oct 2014 14:05:43 +0000
Message-Id: <1414764350-5140-10-git-send-email-eric.auger@linaro.org>
X-Mailer: git-send-email 1.8.3.2
In-Reply-To: <1414764350-5140-1-git-send-email-eric.auger@linaro.org>
References: <1414764350-5140-1-git-send-email-eric.auger@linaro.org>

Minimal VFIO platform implementation supporting
- register space user mapping,
- IRQ assignment based on eventfds handled on qemu side.

irqfd kernel acceleration comes in a subsequent patch.

Signed-off-by: Kim Phillips
Signed-off-by: Eric Auger
---

v6 -> v7:
- compat is not exposed anymore as a user option. Rationale is the vfio device became abstract and a specialization is needed anyway. The derived device must set the compat string.
- in v6 vfio_start_irq_injection was exposed in vfio-platform.h. A new function dubbed vfio_register_irq_starter replaces it. It registers a machine init done notifier that programs & starts all dynamic VFIO device IRQs. This function is supposed to be called by the machine file. A set of static helper routines are added too. It must be called before the creation of the platform bus device.

v5 -> v6:
- vfio_device property renamed into host property
- correct error handling of VFIO_DEVICE_GET_IRQ_INFO ioctl and remove PCI related comment
- remove declaration of vfio_setup_irqfd and irqfd_allowed property. Both belong to next patch (irqfd)
- remove declaration of vfio_intp_interrupt in vfio-platform.h
- functions that can be static are now static
- remove declarations of vfio_region_ops, vfio_memory_listener, group_list, vfio_address_spaces. All are moved to vfio-common.h
- remove vfio_put_device declaration and definition
- print_regions removed. code moved into vfio_populate_regions
- replace DPRINTF by trace events
- new helper routine to set the trigger eventfd
- dissociate intp init from the injection enablement: vfio_enable_intp renamed into vfio_init_intp and new function named vfio_start_eventfd_injection
- injection start moved to vfio_start_irq_injection (not anymore in vfio_populate_interrupt)
- new start_irq_fn field in VFIOPlatformDevice corresponding to the function that will be used for starting injection
- user handled eventfd:
  x add mutex to protect IRQ state & list manipulation,
  x correct misleading comment in vfio_intp_interrupt.
  x Fix bugs thanks to fake interrupt modality
- VFIOPlatformDeviceClass becomes abstract
- add error_setg in vfio_platform_realize

v4 -> v5:
- vfio-platform.h included first
- cleanup error handling in *populate*, vfio_get_device, vfio_enable_intp
- vfio_put_device not called anymore
- add some includes to follow vfio policy

v3 -> v4: [Eric Auger]
- merge of "vfio: Add initial IRQ support in platform device" to get a fully functional patch although performance is limited.
- removal of unrealize function since I currently understand it is only used with device hot-plug feature.

v2 -> v3: [Eric Auger]
- further factorization between PCI and platform (VFIORegion, VFIODevice). same level of functionality.

<= v2: [Kim Philipps] - Initial Creation of the device supporting register space mapping --- hw/vfio/Makefile.objs | 1 + hw/vfio/platform.c | 672 ++++++++++++++++++++++++++++++++++++++++ include/hw/vfio/vfio-common.h | 1 + include/hw/vfio/vfio-platform.h | 87 ++++++ trace-events | 12 + 5 files changed, 773 insertions(+) create mode 100644 hw/vfio/platform.c create mode 100644 include/hw/vfio/vfio-platform.h diff --git a/hw/vfio/Makefile.objs b/hw/vfio/Makefile.objs index e31f30e..c5c76fe 100644 --- a/hw/vfio/Makefile.objs +++ b/hw/vfio/Makefile.objs @@ -1,4 +1,5 @@ ifeq ($(CONFIG_LINUX), y) obj-$(CONFIG_SOFTMMU) += common.o obj-$(CONFIG_PCI) += pci.o +obj-$(CONFIG_SOFTMMU) += platform.o endif diff --git a/hw/vfio/platform.c b/hw/vfio/platform.c new file mode 100644 index 0000000..9f66610 --- /dev/null +++ b/hw/vfio/platform.c @@ -0,0 +1,672 @@ +/* + * vfio based device assignment support - platform devices + * + * Copyright Linaro Limited, 2014 + * + * Authors: + * Kim Phillips + * + * This work is licensed under the terms of the GNU GPL, version 2. See + * the COPYING file in the top-level directory. + * + * Based on vfio based PCI device assignment support: + * Copyright Red Hat, Inc. 2012 + */ + +#include +#include + +#include "hw/vfio/vfio-platform.h" +#include "qemu/error-report.h" +#include "qemu/range.h" +#include "sysemu/sysemu.h" +#include "exec/memory.h" +#include "qemu/queue.h" +#include "hw/sysbus.h" +#include "trace.h" +#include "hw/platform-bus.h" + +static void vfio_intp_interrupt(VFIOINTp *intp); +typedef void (*eventfd_user_side_handler_t)(VFIOINTp *intp); +static int vfio_set_trigger_eventfd(VFIOINTp *intp, + eventfd_user_side_handler_t handler); + +/* + * Functions only used when eventfd are handled on user-side + * ie. without irqfd + */ + +/** + * vfio_platform_eoi - IRQ completion routine + * @vbasedev: the VFIO device + * + * de-asserts the active virtual IRQ and unmask the physical IRQ + * (masked by the VFIO driver). 
Handle pending IRQs if any. + * eoi function is called on the first access to any MMIO region + * after an IRQ was triggered. It is assumed this access corresponds + * to the IRQ status register reset. With such a mechanism, a single + * IRQ can be handled at a time since there is no way to know which + * IRQ was completed by the guest (we would need additional details + * about the IRQ status register mask) + */ +static void vfio_platform_eoi(VFIODevice *vbasedev) +{ + VFIOINTp *intp; + VFIOPlatformDevice *vdev = + container_of(vbasedev, VFIOPlatformDevice, vbasedev); + + qemu_mutex_lock(&vdev->intp_mutex); + QLIST_FOREACH(intp, &vdev->intp_list, next) { + if (intp->state == VFIO_IRQ_ACTIVE) { + trace_vfio_platform_eoi(intp->pin, + event_notifier_get_fd(&intp->interrupt)); + intp->state = VFIO_IRQ_INACTIVE; + + /* deassert the virtual IRQ and unmask physical one */ + qemu_set_irq(intp->qemuirq, 0); + vfio_unmask_irqindex(vbasedev, intp->pin); + + /* a single IRQ can be active at a time */ + break; + } + } + /* in case there are pending IRQs, handle them one at a time */ + if (!QSIMPLEQ_EMPTY(&vdev->pending_intp_queue)) { + intp = QSIMPLEQ_FIRST(&vdev->pending_intp_queue); + trace_vfio_platform_eoi_handle_pending(intp->pin); + qemu_mutex_unlock(&vdev->intp_mutex); + vfio_intp_interrupt(intp); + qemu_mutex_lock(&vdev->intp_mutex); + QSIMPLEQ_REMOVE_HEAD(&vdev->pending_intp_queue, pqnext); + qemu_mutex_unlock(&vdev->intp_mutex); + } else { + qemu_mutex_unlock(&vdev->intp_mutex); + } +} + +/** + * vfio_mmap_set_enabled - enable/disable the fast path mode + * @vdev: the VFIO platform device + * @enabled: the target mmap state + * + * true ~ fast path = MMIO region is mmaped (no KVM TRAP) + * false ~ slow path = MMIO region is trapped and region callbacks + * are called slow path enables to trap the IRQ status register + * guest reset +*/ + +static void vfio_mmap_set_enabled(VFIOPlatformDevice *vdev, bool enabled) +{ + VFIORegion *region; + int i; + + 
trace_vfio_platform_mmap_set_enabled(enabled); + + for (i = 0; i < vdev->vbasedev.num_regions; i++) { + region = vdev->regions[i]; + + /* register space is unmapped to trap EOI */ + memory_region_set_enabled(&region->mmap_mem, enabled); + } +} + +/** + * vfio_intp_mmap_enable - timer function, restores the fast path + * if there is no more active IRQ + * @opaque: actually points to the VFIO platform device + * + * Called on mmap timer timeout, this function checks whether the + * IRQ is still active and in the negative restores the fast path. + * by construction a single eventfd is handled at a time. + * if the IRQ is still active, the timer is restarted. + */ +static void vfio_intp_mmap_enable(void *opaque) +{ + VFIOINTp *tmp; + VFIOPlatformDevice *vdev = (VFIOPlatformDevice *)opaque; + + qemu_mutex_lock(&vdev->intp_mutex); + QLIST_FOREACH(tmp, &vdev->intp_list, next) { + if (tmp->state == VFIO_IRQ_ACTIVE) { + trace_vfio_platform_intp_mmap_enable(tmp->pin); + /* re-program the timer to check active status later */ + timer_mod(vdev->mmap_timer, + qemu_clock_get_ms(QEMU_CLOCK_VIRTUAL) + + vdev->mmap_timeout); + qemu_mutex_unlock(&vdev->intp_mutex); + return; + } + } + vfio_mmap_set_enabled(vdev, true); + qemu_mutex_unlock(&vdev->intp_mutex); +} + +/** + * vfio_intp_interrupt - The user-side eventfd handler + * @opaque: opaque pointer which in practice is the VFIOINTp* + * + * the function can be entered + * - in event handler context: this IRQ is inactive + * in that case, the vIRQ is injected into the guest if there + * is no other active or pending IRQ. + * - in IOhandler context: this IRQ is pending. 
+ * there is no ACTIVE IRQ + */ +static void vfio_intp_interrupt(VFIOINTp *intp) +{ + int ret; + VFIOINTp *tmp; + VFIOPlatformDevice *vdev = intp->vdev; + bool delay_handling = false; + + qemu_mutex_lock(&vdev->intp_mutex); + if (intp->state == VFIO_IRQ_INACTIVE) { + QLIST_FOREACH(tmp, &vdev->intp_list, next) { + if (tmp->state == VFIO_IRQ_ACTIVE || + tmp->state == VFIO_IRQ_PENDING) { + delay_handling = true; + break; + } + } + } + if (delay_handling) { + /* + * the new IRQ gets a pending status and is pushed in + * the pending queue + */ + intp->state = VFIO_IRQ_PENDING; + trace_vfio_intp_interrupt_set_pending(intp->pin); + QSIMPLEQ_INSERT_TAIL(&vdev->pending_intp_queue, + intp, pqnext); + ret = event_notifier_test_and_clear(&intp->interrupt); + qemu_mutex_unlock(&vdev->intp_mutex); + return; + } + + /* no active IRQ, the new IRQ can be forwarded to the guest */ + trace_vfio_platform_intp_interrupt(intp->pin, + event_notifier_get_fd(&intp->interrupt)); + + if (intp->state == VFIO_IRQ_INACTIVE) { + ret = event_notifier_test_and_clear(&intp->interrupt); + if (!ret) { + error_report("Error when clearing fd=%d (ret = %d)\n", + event_notifier_get_fd(&intp->interrupt), ret); + } + } /* else this is a pending IRQ that moves to ACTIVE state */ + + intp->state = VFIO_IRQ_ACTIVE; + + /* sets slow path */ + vfio_mmap_set_enabled(vdev, false); + + /* trigger the virtual IRQ */ + qemu_set_irq(intp->qemuirq, 1); + + /* schedule the mmap timer which will restore mmap path after EOI*/ + if (vdev->mmap_timeout) { + timer_mod(vdev->mmap_timer, + qemu_clock_get_ms(QEMU_CLOCK_VIRTUAL) + + vdev->mmap_timeout); + } + qemu_mutex_unlock(&vdev->intp_mutex); +} + +/** + * vfio_start_eventfd_injection - starts the virtual IRQ injection using + * user-side handled eventfds + * @intp: the IRQ struct pointer + */ + +static int vfio_start_eventfd_injection(VFIOINTp *intp) +{ + int ret; + VFIODevice *vbasedev = &intp->vdev->vbasedev; + + vfio_mask_irqindex(vbasedev, intp->pin); + + ret = 
vfio_set_trigger_eventfd(intp, vfio_intp_interrupt); + if (ret) { + error_report("vfio: Error: Failed to pass IRQ fd to the driver: %m"); + vfio_unmask_irqindex(vbasedev, intp->pin); + return ret; + } + vfio_unmask_irqindex(vbasedev, intp->pin); + return 0; +} + +/* + * Functions used whatever the injection method + */ + +/** + * vfio_set_trigger_eventfd - set VFIO eventfd handling + * ie. program the VFIO driver to associates a given IRQ index + * with a fd handler + * + * @intp: IRQ struct pointer + * @handler: handler to be called on eventfd trigger + */ +static int vfio_set_trigger_eventfd(VFIOINTp *intp, + eventfd_user_side_handler_t handler) +{ + VFIODevice *vbasedev = &intp->vdev->vbasedev; + struct vfio_irq_set *irq_set; + int argsz, ret; + int32_t *pfd; + + argsz = sizeof(*irq_set) + sizeof(*pfd); + irq_set = g_malloc0(argsz); + irq_set->argsz = argsz; + irq_set->flags = VFIO_IRQ_SET_DATA_EVENTFD | VFIO_IRQ_SET_ACTION_TRIGGER; + irq_set->index = intp->pin; + irq_set->start = 0; + irq_set->count = 1; + pfd = (int32_t *)&irq_set->data; + *pfd = event_notifier_get_fd(&intp->interrupt); + qemu_set_fd_handler(*pfd, (IOHandler *)handler, NULL, intp); + ret = ioctl(vbasedev->fd, VFIO_DEVICE_SET_IRQS, irq_set); + g_free(irq_set); + if (ret < 0) { + error_report("vfio: Failed to set trigger eventfd: %m"); + qemu_set_fd_handler(*pfd, NULL, NULL, NULL); + } + return ret; +} + +/* not implemented yet */ +static bool vfio_platform_compute_needs_reset(VFIODevice *vdev) +{ +return false; +} + +/* not implemented yet */ +static int vfio_platform_hot_reset_multi(VFIODevice *vdev) +{ +return 0; +} + +/** + * vfio_init_intp - allocate, initialize the IRQ struct pointer + * and add it into the list of IRQ + * @vbasedev: the VFIO device + * @index: VFIO device IRQ index + */ +static VFIOINTp *vfio_init_intp(VFIODevice *vbasedev, unsigned int index) +{ + int ret; + VFIOPlatformDevice *vdev = + container_of(vbasedev, VFIOPlatformDevice, vbasedev); + SysBusDevice *sbdev = 
SYS_BUS_DEVICE(vdev); + VFIOINTp *intp; + + /* allocate and populate a new VFIOINTp structure put in a queue list */ + intp = g_malloc0(sizeof(*intp)); + intp->vdev = vdev; + intp->pin = index; + intp->state = VFIO_IRQ_INACTIVE; + sysbus_init_irq(sbdev, &intp->qemuirq); + + /* Get an eventfd for trigger */ + ret = event_notifier_init(&intp->interrupt, 0); + if (ret) { + g_free(intp); + error_report("vfio: Error: trigger event_notifier_init failed "); + return NULL; + } + + /* store the new intp in qlist */ + QLIST_INSERT_HEAD(&vdev->intp_list, intp, next); + return intp; +} + +/** + * vfio_populate_device - initialize MMIO region and IRQ + * @vbasedev: the VFIO device + * + * query the VFIO device for exposed MMIO regions and IRQ and + * populate the associated fields in the device struct + */ +static int vfio_populate_device(VFIODevice *vbasedev) +{ + struct vfio_irq_info irq = { .argsz = sizeof(irq) }; + struct vfio_region_info reg_info = { .argsz = sizeof(reg_info) }; + VFIOINTp *intp; + int i, ret = 0; + VFIOPlatformDevice *vdev = + container_of(vbasedev, VFIOPlatformDevice, vbasedev); + + vdev->regions = g_malloc0(sizeof(VFIORegion *) * vbasedev->num_regions); + + for (i = 0; i < vbasedev->num_regions; i++) { + vdev->regions[i] = g_malloc0(sizeof(VFIORegion)); + reg_info.index = i; + ret = ioctl(vbasedev->fd, VFIO_DEVICE_GET_REGION_INFO, &reg_info); + if (ret) { + error_report("vfio: Error getting region %d info: %m", i); + goto error; + } + vdev->regions[i]->flags = reg_info.flags; + vdev->regions[i]->size = reg_info.size; + vdev->regions[i]->fd_offset = reg_info.offset; + vdev->regions[i]->nr = i; + vdev->regions[i]->vbasedev = vbasedev; + + trace_vfio_platform_populate_regions(vdev->regions[i]->nr, + (unsigned long)vdev->regions[i]->flags, + (unsigned long)vdev->regions[i]->size, + vdev->regions[i]->vbasedev->fd, + (unsigned long)vdev->regions[i]->fd_offset); + } + + vdev->mmap_timer = timer_new_ms(QEMU_CLOCK_VIRTUAL, + vfio_intp_mmap_enable, vdev); + + 
QSIMPLEQ_INIT(&vdev->pending_intp_queue); + + for (i = 0; i < vbasedev->num_irqs; i++) { + irq.index = i; + + ret = ioctl(vbasedev->fd, VFIO_DEVICE_GET_IRQ_INFO, &irq); + if (ret) { + error_printf("vfio: error getting device %s irq info", + vbasedev->name); + return ret; + } else { + trace_vfio_platform_populate_interrupts(irq.index, + irq.count, + irq.flags); + intp = vfio_init_intp(vbasedev, irq.index); + if (!intp) { + error_report("vfio: Error installing IRQ %d up", i); + return ret; + } + } + } + return 0; +error: + return ret; +} + +/* + * vfio_start_irq_injection - associates a virtual irq to a + * VFIO IRQ index and start the injection of this IRQ + * @s: SysBus Device + * @index: VFIO IRQ index + * @virq: the virtual IRQ number, aka gsi + * + * this function is called when the device tree is built + */ +static void vfio_start_irq_injection(SysBusDevice *s, int index, int virq) +{ + VFIOPlatformDevice *vdev = container_of(s, VFIOPlatformDevice, sbdev); + VFIOINTp *intp; + + QLIST_FOREACH(intp, &vdev->intp_list, next) { + if (intp->pin == index) { + intp->virtualID = virq; + vdev->start_irq_fn(intp); + } + } +} + +/* specialized functions for VFIO Platform devices */ +static VFIODeviceOps vfio_platform_ops = { + .vfio_compute_needs_reset = vfio_platform_compute_needs_reset, + .vfio_hot_reset_multi = vfio_platform_hot_reset_multi, + .vfio_eoi = vfio_platform_eoi, + .vfio_populate_device = vfio_populate_device, +}; + +/** + * vfio_base_device_init - implements some of the VFIO mechanics + * @vbasedev: the VFIO device + * + * retrieves the group the device belongs to and get the device fd + * returns the VFIO device fd + * precondition: the device name must be initialized + */ +static int vfio_base_device_init(VFIODevice *vbasedev) +{ + VFIOGroup *group; + VFIODevice *vbasedev_iter; + char path[PATH_MAX], iommu_group_path[PATH_MAX], *group_name; + ssize_t len; + struct stat st; + int groupid; + int ret; + + /* name must be set prior to the call */ + if 
(!vbasedev->name) { + return -EINVAL; + } + + /* Check that the host device exists */ + snprintf(path, sizeof(path), "/sys/bus/platform/devices/%s/", + vbasedev->name); + + if (stat(path, &st) < 0) { + error_report("vfio: error: no such host device: %s", path); + return -errno; + } + + strncat(path, "iommu_group", sizeof(path) - strlen(path) - 1); + len = readlink(path, iommu_group_path, sizeof(path)); + if (len <= 0 || len >= sizeof(path)) { + error_report("vfio: error no iommu_group for device"); + return len < 0 ? -errno : -ENAMETOOLONG; + } + + iommu_group_path[len] = 0; + group_name = basename(iommu_group_path); + + if (sscanf(group_name, "%d", &groupid) != 1) { + error_report("vfio: error reading %s: %m", path); + return -errno; + } + + trace_vfio_platform_base_device_init(vbasedev->name, groupid); + + group = vfio_get_group(groupid, &address_space_memory); + if (!group) { + error_report("vfio: failed to get group %d", groupid); + return -ENOENT; + } + + snprintf(path, sizeof(path), "%s", vbasedev->name); + + QLIST_FOREACH(vbasedev_iter, &group->device_list, next) { + if (strcmp(vbasedev_iter->name, vbasedev->name) == 0) { + error_report("vfio: error: device %s is already attached", path); + vfio_put_group(group); + return -EBUSY; + } + } + ret = vfio_get_device(group, path, vbasedev); + if (ret) { + error_report("vfio: failed to get device %s", path); + vfio_put_group(group); + } + return ret; +} + +/** + * vfio_map_region - initialize the 2 mr (mmapped on ops) for a + * given index + * @vdev: the VFIO platform device + * @nr: the index of the region + * + * init the top memory region and the mmapped memory region beneath + * VFIOPlatformDevice is used since VFIODevice is not a QOM Object + * and could not be passed to memory region functions +*/ +static void vfio_map_region(VFIOPlatformDevice *vdev, int nr) +{ + VFIORegion *region = vdev->regions[nr]; + unsigned size = region->size; + char name[64]; + + if (!size) { + return; + } + + snprintf(name, 
sizeof(name), "VFIO %s region %d", + vdev->vbasedev.name, nr); + + /* A "slow" read/write mapping underlies all regions */ + memory_region_init_io(&region->mem, OBJECT(vdev), &vfio_region_ops, + region, name, size); + + strncat(name, " mmap", sizeof(name) - strlen(name) - 1); + + if (vfio_mmap_region(OBJECT(vdev), region, &region->mem, + &region->mmap_mem, &region->mmap, size, 0, name)) { + error_report("%s unsupported. Performance may be slow", name); + } +} + +/** + * vfio_platform_realize - the device realize function + * @dev: device state pointer + * @errp: error + * + * initialize the device, its memory regions and IRQ structures + * IRQ are started separately + */ +static void vfio_platform_realize(DeviceState *dev, Error **errp) +{ + VFIOPlatformDevice *vdev = VFIO_PLATFORM_DEVICE(dev); + SysBusDevice *sbdev = SYS_BUS_DEVICE(dev); + VFIODevice *vbasedev = &vdev->vbasedev; + int i, ret; + + vbasedev->type = VFIO_DEVICE_TYPE_PLATFORM; + vbasedev->ops = &vfio_platform_ops; + vdev->start_irq_fn = vfio_start_eventfd_injection; + + trace_vfio_platform_realize(vbasedev->name, vdev->compat); + + ret = vfio_base_device_init(vbasedev); + if (ret) { + error_setg(errp, "vfio: vfio_base_device_init failed for %s", + vbasedev->name); + return; + } + + for (i = 0; i < vbasedev->num_regions; i++) { + vfio_map_region(vdev, i); + sysbus_init_mmio(sbdev, &vdev->regions[i]->mem); + } +} + +/* + * Mechanics to program/start irq injection on machine init done notifier: + * this is needed since at finalize time, the device IRQ are not yet + * bound to the platform bus IRQ. It is assumed here dynamic instantiation + * always is used. Binding to the platform bus IRQ happens on a machine + * init done notifier registered by the machine file. After its execution + * we execute a new notifier that actually starts the injection. When using + * irqfd, programming the injection consists in associating eventfds to + * GSI number,ie. 
virtual IRQ number + */ + +typedef struct VfioIrqStarterNotifierParams { + unsigned int platform_bus_first_irq; + Notifier notifier; +} VfioIrqStarterNotifierParams; + +typedef struct VfioIrqStartParams { + PlatformBusDevice *pbus; + int platform_bus_first_irq; +} VfioIrqStartParams; + +/* Start injection of IRQ for a specific VFIO device */ +static int vfio_irq_starter(SysBusDevice *sbdev, void *opaque) +{ + int i; + VfioIrqStartParams *p = opaque; + VFIOPlatformDevice *vdev; + VFIODevice *vbasedev; + uint64_t irq_number; + PlatformBusDevice *pbus = p->pbus; + int platform_bus_first_irq = p->platform_bus_first_irq; + + if (object_dynamic_cast(OBJECT(sbdev), TYPE_VFIO_PLATFORM)) { + vdev = VFIO_PLATFORM_DEVICE(sbdev); + vbasedev = &vdev->vbasedev; + for (i = 0; i < vbasedev->num_irqs; i++) { + irq_number = platform_bus_get_irqn(pbus, sbdev, i) + + platform_bus_first_irq; + vfio_start_irq_injection(sbdev, i, irq_number); + } + } + return 0; +} + +/* loop on all VFIO platform devices and start their IRQ injection */ +static void vfio_irq_starter_notify(Notifier *notifier, void *data) +{ + VfioIrqStarterNotifierParams *p = + container_of(notifier, VfioIrqStarterNotifierParams, notifier); + DeviceState *dev = + qdev_find_recursive(sysbus_get_default(), TYPE_PLATFORM_BUS_DEVICE); + PlatformBusDevice *pbus = PLATFORM_BUS_DEVICE(dev); + + if (pbus->done_gathering) { + VfioIrqStartParams data = { + .pbus = pbus, + .platform_bus_first_irq = p->platform_bus_first_irq, + }; + + foreach_dynamic_sysbus_device(vfio_irq_starter, &data); + } +} + +/* registers the machine init done notifier that will start VFIO IRQ */ +void vfio_register_irq_starter(int platform_bus_first_irq) +{ + VfioIrqStarterNotifierParams *p = g_new(VfioIrqStarterNotifierParams, 1); + + p->platform_bus_first_irq = platform_bus_first_irq; + p->notifier.notify = vfio_irq_starter_notify; + qemu_add_machine_init_done_notifier(&p->notifier); +} + +static const VMStateDescription vfio_platform_vmstate = { + .name = 
TYPE_VFIO_PLATFORM, + .unmigratable = 1, +}; + +static Property vfio_platform_dev_properties[] = { + DEFINE_PROP_STRING("host", VFIOPlatformDevice, vbasedev.name), + DEFINE_PROP_UINT32("mmap-timeout-ms", VFIOPlatformDevice, + mmap_timeout, 1100), + DEFINE_PROP_END_OF_LIST(), +}; + +static void vfio_platform_class_init(ObjectClass *klass, void *data) +{ + DeviceClass *dc = DEVICE_CLASS(klass); + + dc->realize = vfio_platform_realize; + dc->props = vfio_platform_dev_properties; + dc->vmsd = &vfio_platform_vmstate; + dc->desc = "VFIO-based platform device assignment"; + set_bit(DEVICE_CATEGORY_MISC, dc->categories); +} + +static const TypeInfo vfio_platform_dev_info = { + .name = TYPE_VFIO_PLATFORM, + .parent = TYPE_SYS_BUS_DEVICE, + .instance_size = sizeof(VFIOPlatformDevice), + .class_init = vfio_platform_class_init, + .class_size = sizeof(VFIOPlatformDeviceClass), + .abstract = true, +}; + +static void register_vfio_platform_dev_type(void) +{ + type_register_static(&vfio_platform_dev_info); +} + +type_init(register_vfio_platform_dev_type) diff --git a/include/hw/vfio/vfio-common.h b/include/hw/vfio/vfio-common.h index e7fc280..83c7876 100644 --- a/include/hw/vfio/vfio-common.h +++ b/include/hw/vfio/vfio-common.h @@ -43,6 +43,7 @@ enum { VFIO_DEVICE_TYPE_PCI = 0, + VFIO_DEVICE_TYPE_PLATFORM = 1, }; typedef struct VFIORegion { diff --git a/include/hw/vfio/vfio-platform.h b/include/hw/vfio/vfio-platform.h new file mode 100644 index 0000000..18e6807 --- /dev/null +++ b/include/hw/vfio/vfio-platform.h @@ -0,0 +1,87 @@ +/* + * vfio based device assignment support - platform devices + * + * Copyright Linaro Limited, 2014 + * + * Authors: + * Kim Phillips + * + * This work is licensed under the terms of the GNU GPL, version 2. See + * the COPYING file in the top-level directory. + * + * Based on vfio based PCI device assignment support: + * Copyright Red Hat, Inc. 
2012 + */ + +#ifndef HW_VFIO_VFIO_PLATFORM_H +#define HW_VFIO_VFIO_PLATFORM_H + +#include "hw/sysbus.h" +#include "hw/vfio/vfio-common.h" +#include "qemu/event_notifier.h" +#include "qemu/queue.h" +#include "hw/irq.h" + +#define TYPE_VFIO_PLATFORM "vfio-platform" + +enum { + VFIO_IRQ_INACTIVE = 0, + VFIO_IRQ_PENDING = 1, + VFIO_IRQ_ACTIVE = 2, + /* VFIO_IRQ_ACTIVE_AND_PENDING cannot happen with VFIO */ +}; + +typedef struct VFIOINTp { + QLIST_ENTRY(VFIOINTp) next; /* entry for IRQ list */ + QSIMPLEQ_ENTRY(VFIOINTp) pqnext; /* entry for pending IRQ queue */ + EventNotifier interrupt; /* eventfd triggered on interrupt */ + EventNotifier unmask; /* eventfd for unmask on QEMU bypass */ + qemu_irq qemuirq; + struct VFIOPlatformDevice *vdev; /* back pointer to device */ + int state; /* inactive, pending, active */ + bool kvm_accel; /* set when QEMU bypass through KVM enabled */ + uint8_t pin; /* index */ + uint8_t virtualID; /* virtual IRQ */ +} VFIOINTp; + +typedef int (*start_irq_fn_t)(VFIOINTp *intp); + +typedef struct VFIOPlatformDevice { + SysBusDevice sbdev; + VFIODevice vbasedev; /* not a QOM object */ + VFIORegion **regions; + QLIST_HEAD(, VFIOINTp) intp_list; /* list of IRQ */ + /* queue of pending IRQ */ + QSIMPLEQ_HEAD(pending_intp_queue, VFIOINTp) pending_intp_queue; + char *compat; /* compatibility string */ + uint32_t mmap_timeout; /* delay to re-enable mmaps after interrupt */ + QEMUTimer *mmap_timer; /* enable mmaps after periods w/o interrupts */ + start_irq_fn_t start_irq_fn; + QemuMutex intp_mutex; +} VFIOPlatformDevice; + + +typedef struct VFIOPlatformDeviceClass { + /*< private >*/ + SysBusDeviceClass parent_class; + /*< public >*/ +} VFIOPlatformDeviceClass; + +#define VFIO_PLATFORM_DEVICE(obj) \ + OBJECT_CHECK(VFIOPlatformDevice, (obj), TYPE_VFIO_PLATFORM) +#define VFIO_PLATFORM_DEVICE_CLASS(klass) \ + OBJECT_CLASS_CHECK(VFIOPlatformDeviceClass, (klass), TYPE_VFIO_PLATFORM) +#define VFIO_PLATFORM_DEVICE_GET_CLASS(obj) \ + 
OBJECT_GET_CLASS(VFIOPlatformDeviceClass, (obj), TYPE_VFIO_PLATFORM) + +/** + * vfio_register_irq_starter - registers a machine init done notifier that + * starts IRQ injection for VFIO dynamic sysbus devices attached to the + * platform bus. + * + * @platform_bus_first_irq: the number of the first irq assigned to the + * platform bus (index in machine file global qemu_irq array) + */ +void vfio_register_irq_starter(int platform_bus_first_irq); + +#endif /*HW_VFIO_VFIO_PLATFORM_H*/ diff --git a/trace-events b/trace-events index 255971a..54d998c 100644 --- a/trace-events +++ b/trace-events @@ -1428,6 +1428,18 @@ vfio_put_group(int fd) "close group->fd=%d" vfio_get_device(const char * name, unsigned int flags, unsigned int num_regions, unsigned int num_irqs) "Device %s flags: %u, regions: %u, irqs: %u" vfio_put_base_device(int fd) "close vdev->fd=%d" +# hw/vfio/platform.c +vfio_platform_eoi(int pin, int fd) "EOI IRQ pin %d (fd=%d)" +vfio_platform_mmap_set_enabled(bool enabled) "fast path = %d" +vfio_platform_intp_mmap_enable(int pin) "IRQ #%d still active, stay in slow path" +vfio_platform_intp_interrupt(int pin, int fd) "Handle IRQ #%d (fd = %d)" +vfio_platform_populate_interrupts(int pin, int count, int flags) "- IRQ index %d: count %d, flags=0x%x" +vfio_platform_populate_regions(int region_index, unsigned long flag, unsigned long size, int fd, unsigned long offset) "- region %d flags = 0x%lx, size = 0x%lx, fd= %d, offset = 0x%lx" +vfio_platform_base_device_init(char *name, int groupid) "%s belongs to group #%d" +vfio_platform_realize(char *name, char *compat) "vfio device %s, compat = %s" +vfio_intp_interrupt_set_pending(int index) "irq %d is set PENDING" +vfio_platform_eoi_handle_pending(int index) "handle PENDING IRQ %d" + #hw/acpi/memory_hotplug.c mhp_acpi_invalid_slot_selected(uint32_t slot) "0x%"PRIx32 mhp_acpi_read_addr_lo(uint32_t slot, uint32_t addr) "slot[0x%"PRIx32"] addr lo: 0x%"PRIx32