From patchwork Fri Sep 11 12:46:56 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Greg KH X-Patchwork-Id: 264063 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-12.8 required=3.0 tests=BAYES_00,DKIMWL_WL_HIGH, DKIM_SIGNED, DKIM_VALID, HEADER_FROM_DIFFERENT_DOMAINS, INCLUDES_PATCH, MAILING_LIST_MULTI, SIGNED_OFF_BY, SPF_HELO_NONE, SPF_PASS, URIBL_BLOCKED, USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id E73B8C433E2 for ; Fri, 11 Sep 2020 16:46:36 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 8026F221ED for ; Fri, 11 Sep 2020 16:46:36 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=default; t=1599842796; bh=+GANnE8x2AjIJ0n2g8xz0D6nOtdB9ZiU2pb8fbn4/8o=; h=From:To:Cc:Subject:Date:In-Reply-To:References:List-ID:From; b=N+HZxNWdMfPTzeXRcgasixILw+Cef9iLdoI3L5KY1yFx2t2LsjNbYXRwSs4nLsQy6 Sdi5INZMVy5tHuEPym6PiBD+16rD8xWprCJiHk8ltllQ4LEkU9ujGu3VGGabtDApXL IqORCVMygQ7r1a6CF1emARSa5hhL53tfKlWnCwTE= Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1726383AbgIKQq3 (ORCPT ); Fri, 11 Sep 2020 12:46:29 -0400 Received: from mail.kernel.org ([198.145.29.99]:49580 "EHLO mail.kernel.org" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1726370AbgIKPIE (ORCPT ); Fri, 11 Sep 2020 11:08:04 -0400 Received: from localhost (83-86-74-64.cable.dynamic.v4.ziggo.nl [83.86.74.64]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPSA id 8E462223FB; Fri, 11 Sep 2020 12:59:04 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=default; t=1599829145; bh=+GANnE8x2AjIJ0n2g8xz0D6nOtdB9ZiU2pb8fbn4/8o=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=JtrIUVBq2ENAJHFb9J2EwbhirDMcATTn/K49jwUsjF/BchIPFnTPrwHWF0FEUvyCx yFv9Y23uFs0LQeJVtqQhAAkv4UMuwc7s8jRtB7bdsay5AwsLDdDthxuRlhY7h+tl5k RAN0tQKITBNE2jxRPXCl94/AtPoxKxPnqvhuOpGo= From: Greg Kroah-Hartman To: linux-kernel@vger.kernel.org Cc: Greg Kroah-Hartman , stable@vger.kernel.org, OGAWA Hirofumi , Christoph Hellwig , Jens Axboe , Sasha Levin Subject: [PATCH 4.14 02/12] block: ensure bdi->io_pages is always initialized Date: Fri, 11 Sep 2020 14:46:56 +0200 Message-Id: <20200911122458.540739890@linuxfoundation.org> X-Mailer: git-send-email 2.28.0 In-Reply-To: <20200911122458.413137406@linuxfoundation.org> References: <20200911122458.413137406@linuxfoundation.org> User-Agent: quilt/0.66 MIME-Version: 1.0 Sender: stable-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: stable@vger.kernel.org From: Jens Axboe [ Upstream commit de1b0ee490eafdf65fac9eef9925391a8369f2dc ] If a driver leaves the limit settings as the defaults, then we don't initialize bdi->io_pages. This means that file systems may need to work around bdi->io_pages == 0, which is somewhat messy. Initialize the default value just like we do for ->ra_pages. Cc: stable@vger.kernel.org Fixes: 9491ae4aade6 ("mm: don't cap request size based on read-ahead setting") Reported-by: OGAWA Hirofumi Reviewed-by: Christoph Hellwig Signed-off-by: Jens Axboe Signed-off-by: Sasha Levin --- block/blk-core.c | 2 ++ 1 file changed, 2 insertions(+) diff --git a/block/blk-core.c b/block/blk-core.c index 4e04c79aa2c27..2407c898ba7d8 100644 --- a/block/blk-core.c +++ b/block/blk-core.c @@ -852,6 +852,8 @@ struct request_queue *blk_alloc_queue_node(gfp_t gfp_mask, int node_id) q->backing_dev_info->ra_pages = (VM_MAX_READAHEAD * 1024) / PAGE_SIZE; + q->backing_dev_info->io_pages = + (VM_MAX_READAHEAD * 1024) / PAGE_SIZE; q->backing_dev_info->capabilities = BDI_CAP_CGROUP_WRITEBACK; q->backing_dev_info->name = "block"; q->node = node_id; From patchwork Fri Sep 11 12:46:57 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Greg KH X-Patchwork-Id: 264095 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-9.8 required=3.0 tests=BAYES_00,DKIMWL_WL_HIGH, DKIM_SIGNED, DKIM_VALID, HEADER_FROM_DIFFERENT_DOMAINS, MAILING_LIST_MULTI, SIGNED_OFF_BY,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED,USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 3B284C433E2 for ; Fri, 11 Sep 2020 13:45:49 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 06A14206CA for ; Fri, 11 Sep 2020 13:45:48 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=default; t=1599831949; bh=m9tXc2nE7B4HdiF9Mv8gCgkhJ7eDPB67iMoOyQxPMuk=; h=From:To:Cc:Subject:Date:In-Reply-To:References:List-ID:From; b=PZnDUxvcRPrbqcrYRr6Go/XxrI/T4zj6TDsIzcHpJFTkIbjyxENABa7eA88loYEct kM+A4dPswskmsM/gCyZLA5gu2UQ3CaMCe+HwSwjEtKgkcuUYrXWGot3aVm3bisT6l+ BcgLjjyPMa+VW4Q+Y/R4/vf+frRJezFEIReEoIdM= Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1725830AbgIKNpk (ORCPT ); Fri, 11 Sep 2020 09:45:40 -0400 Received: from mail.kernel.org ([198.145.29.99]:34932 "EHLO mail.kernel.org" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1726281AbgIKN2X (ORCPT ); Fri, 11 Sep 2020 09:28:23 -0400 Received: from localhost (83-86-74-64.cable.dynamic.v4.ziggo.nl [83.86.74.64]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPSA id 4A6C0223FD; Fri, 11 Sep 2020 12:59:07 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=default; t=1599829147; bh=m9tXc2nE7B4HdiF9Mv8gCgkhJ7eDPB67iMoOyQxPMuk=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=DLJXxoo1piSDFfg60UCIm0QIm9HQMd8EFwho2MmH8ThN3MzphB0TTWgWietHftFy+ EHzzxYKLeB/7DVAvjRM/2uxrJLXFJ6X4TyWGXBcJPDuKFtB0bE4My5EzcjRMwc0Xcx bX8mAwtiP0m6ThA6Q+GGxuusCR5MTnz+/moSa1KA= From: Greg Kroah-Hartman To: linux-kernel@vger.kernel.org Cc: Greg Kroah-Hartman , stable@vger.kernel.org, Peter Xu , Alex Williamson , Ajay Kaher Subject: [PATCH 4.14 03/12] vfio/type1: Support faulting PFNMAP vmas Date: Fri, 11 Sep 2020 14:46:57 +0200 Message-Id: <20200911122458.588557479@linuxfoundation.org> X-Mailer: git-send-email 2.28.0 In-Reply-To: <20200911122458.413137406@linuxfoundation.org> References: <20200911122458.413137406@linuxfoundation.org> User-Agent: quilt/0.66 MIME-Version: 1.0 Sender: stable-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: stable@vger.kernel.org From: Alex Williamson commit 41311242221e3482b20bfed10fa4d9db98d87016 upstream. With conversion to follow_pfn(), DMA mapping a PFNMAP range depends on the range being faulted into the vma. Add support to manually provide that, in the same way as done on KVM with hva_to_pfn_remapped(). Reviewed-by: Peter Xu Signed-off-by: Alex Williamson [Ajay: Regenerated the patch for v4.14] Signed-off-by: Ajay Kaher Signed-off-by: Greg Kroah-Hartman --- drivers/vfio/vfio_iommu_type1.c | 36 +++++++++++++++++++++++++++++++++--- 1 file changed, 33 insertions(+), 3 deletions(-) --- a/drivers/vfio/vfio_iommu_type1.c +++ b/drivers/vfio/vfio_iommu_type1.c @@ -336,6 +336,32 @@ static int put_pfn(unsigned long pfn, in return 0; } +static int follow_fault_pfn(struct vm_area_struct *vma, struct mm_struct *mm, + unsigned long vaddr, unsigned long *pfn, + bool write_fault) +{ + int ret; + + ret = follow_pfn(vma, vaddr, pfn); + if (ret) { + bool unlocked = false; + + ret = fixup_user_fault(NULL, mm, vaddr, + FAULT_FLAG_REMOTE | + (write_fault ? FAULT_FLAG_WRITE : 0), + &unlocked); + if (unlocked) + return -EAGAIN; + + if (ret) + return ret; + + ret = follow_pfn(vma, vaddr, pfn); + } + + return ret; +} + static int vaddr_get_pfn(struct mm_struct *mm, unsigned long vaddr, int prot, unsigned long *pfn) { @@ -375,12 +401,16 @@ static int vaddr_get_pfn(struct mm_struc down_read(&mm->mmap_sem); +retry: vma = find_vma_intersection(mm, vaddr, vaddr + 1); if (vma && vma->vm_flags & VM_PFNMAP) { - if (!follow_pfn(vma, vaddr, pfn) && - is_invalid_reserved_pfn(*pfn)) - ret = 0; + ret = follow_fault_pfn(vma, mm, vaddr, pfn, prot & IOMMU_WRITE); + if (ret == -EAGAIN) + goto retry; + + if (!ret && !is_invalid_reserved_pfn(*pfn)) + ret = -EFAULT; } up_read(&mm->mmap_sem); From patchwork Fri Sep 11 12:46:58 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Greg KH X-Patchwork-Id: 264058 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-9.8 required=3.0 tests=BAYES_00,DKIMWL_WL_HIGH, DKIM_SIGNED, DKIM_VALID, HEADER_FROM_DIFFERENT_DOMAINS, MAILING_LIST_MULTI, SIGNED_OFF_BY,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED,USER_AGENT_GIT autolearn=unavailable autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id BF3CFC43461 for ; Fri, 11 Sep 2020 16:55:24 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 8470F221E7 for ; Fri, 11 Sep 2020 16:55:24 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=default; t=1599843324; bh=UoDf2Uwl0Yq8QLtFZRUy0JZ7qxXffEpxG3NsTmdAtDo=; h=From:To:Cc:Subject:Date:In-Reply-To:References:List-ID:From; b=bsWbgTYWf3lVxHtnMyuQd/XLNCvScvKgA6IBlI2QqI2wUvYJXImoHRpLliE4nz/Xp eowrGaJYCb9En7pgdXZ/qvfNX1KM2hZhmE2t0Ne2NeUPp4fLNMqm9pxwArbXLyItbD LZQJAr/5oX9jDGnS4jhQiOdMt6J3ChWfSS6y6xG8= Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1726352AbgIKQzU (ORCPT ); Fri, 11 Sep 2020 12:55:20 -0400 Received: from mail.kernel.org ([198.145.29.99]:46954 "EHLO mail.kernel.org" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1726323AbgIKPFa (ORCPT ); Fri, 11 Sep 2020 11:05:30 -0400 Received: from localhost (83-86-74-64.cable.dynamic.v4.ziggo.nl [83.86.74.64]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPSA id D239622403; Fri, 11 Sep 2020 12:59:09 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=default; t=1599829150; bh=UoDf2Uwl0Yq8QLtFZRUy0JZ7qxXffEpxG3NsTmdAtDo=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=nSU5ge54SG1KJ3couPPLdM8Tk+UWUa2wrEmqwWOR4sH2Ki2jAxazGowAa47bkZI0B +uVoc4xTAuHDyc6YuF7Ee7R3kIM0JDFv4O8IshUQrp59XmEOsqws5KVmnIOMfdMr3M I8CxE+1EIneWsrIqCu9kIDtLwQQpqyMIpSdauf3U= From: Greg Kroah-Hartman To: linux-kernel@vger.kernel.org Cc: Greg Kroah-Hartman , stable@vger.kernel.org, Peter Xu , Alex Williamson , Ajay Kaher Subject: [PATCH 4.14 04/12] vfio-pci: Fault mmaps to enable vma tracking Date: Fri, 11 Sep 2020 14:46:58 +0200 Message-Id: <20200911122458.637277425@linuxfoundation.org> X-Mailer: git-send-email 2.28.0 In-Reply-To: <20200911122458.413137406@linuxfoundation.org> References: <20200911122458.413137406@linuxfoundation.org> User-Agent: quilt/0.66 MIME-Version: 1.0 Sender: stable-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: stable@vger.kernel.org From: Alex Williamson commit 11c4cd07ba111a09f49625f9e4c851d83daf0a22 upstream. Rather than calling remap_pfn_range() when a region is mmap'd, setup a vm_ops handler to support dynamic faulting of the range on access. This allows us to manage a list of vmas actively mapping the area that we can later use to invalidate those mappings. The open callback invalidates the vma range so that all tracking is inserted in the fault handler and removed in the close handler. Reviewed-by: Peter Xu Signed-off-by: Alex Williamson [Ajay: Regenerated the patch for v4.14] Signed-off-by: Ajay Kaher Signed-off-by: Greg Kroah-Hartman --- drivers/vfio/pci/vfio_pci.c | 76 +++++++++++++++++++++++++++++++++++- drivers/vfio/pci/vfio_pci_private.h | 7 +++ 2 files changed, 81 insertions(+), 2 deletions(-) --- a/drivers/vfio/pci/vfio_pci.c +++ b/drivers/vfio/pci/vfio_pci.c @@ -1120,6 +1120,70 @@ static ssize_t vfio_pci_write(void *devi return vfio_pci_rw(device_data, (char __user *)buf, count, ppos, true); } +static int vfio_pci_add_vma(struct vfio_pci_device *vdev, + struct vm_area_struct *vma) +{ + struct vfio_pci_mmap_vma *mmap_vma; + + mmap_vma = kmalloc(sizeof(*mmap_vma), GFP_KERNEL); + if (!mmap_vma) + return -ENOMEM; + + mmap_vma->vma = vma; + + mutex_lock(&vdev->vma_lock); + list_add(&mmap_vma->vma_next, &vdev->vma_list); + mutex_unlock(&vdev->vma_lock); + + return 0; +} + +/* + * Zap mmaps on open so that we can fault them in on access and therefore + * our vma_list only tracks mappings accessed since last zap. + */ +static void vfio_pci_mmap_open(struct vm_area_struct *vma) +{ + zap_vma_ptes(vma, vma->vm_start, vma->vm_end - vma->vm_start); +} + +static void vfio_pci_mmap_close(struct vm_area_struct *vma) +{ + struct vfio_pci_device *vdev = vma->vm_private_data; + struct vfio_pci_mmap_vma *mmap_vma; + + mutex_lock(&vdev->vma_lock); + list_for_each_entry(mmap_vma, &vdev->vma_list, vma_next) { + if (mmap_vma->vma == vma) { + list_del(&mmap_vma->vma_next); + kfree(mmap_vma); + break; + } + } + mutex_unlock(&vdev->vma_lock); +} + +static int vfio_pci_mmap_fault(struct vm_fault *vmf) +{ + struct vm_area_struct *vma = vmf->vma; + struct vfio_pci_device *vdev = vma->vm_private_data; + + if (vfio_pci_add_vma(vdev, vma)) + return VM_FAULT_OOM; + + if (remap_pfn_range(vma, vma->vm_start, vma->vm_pgoff, + vma->vm_end - vma->vm_start, vma->vm_page_prot)) + return VM_FAULT_SIGBUS; + + return VM_FAULT_NOPAGE; +} + +static const struct vm_operations_struct vfio_pci_mmap_ops = { + .open = vfio_pci_mmap_open, + .close = vfio_pci_mmap_close, + .fault = vfio_pci_mmap_fault, +}; + static int vfio_pci_mmap(void *device_data, struct vm_area_struct *vma) { struct vfio_pci_device *vdev = device_data; @@ -1185,8 +1249,14 @@ static int vfio_pci_mmap(void *device_da vma->vm_page_prot = pgprot_noncached(vma->vm_page_prot); vma->vm_pgoff = (pci_resource_start(pdev, index) >> PAGE_SHIFT) + pgoff; - return remap_pfn_range(vma, vma->vm_start, vma->vm_pgoff, - req_len, vma->vm_page_prot); + /* + * See remap_pfn_range(), called from vfio_pci_fault() but we can't + * change vm_flags within the fault handler. Set them now. + */ + vma->vm_flags |= VM_IO | VM_PFNMAP | VM_DONTEXPAND | VM_DONTDUMP; + vma->vm_ops = &vfio_pci_mmap_ops; + + return 0; } static void vfio_pci_request(void *device_data, unsigned int count) @@ -1243,6 +1313,8 @@ static int vfio_pci_probe(struct pci_dev vdev->irq_type = VFIO_PCI_NUM_IRQS; mutex_init(&vdev->igate); spin_lock_init(&vdev->irqlock); + mutex_init(&vdev->vma_lock); + INIT_LIST_HEAD(&vdev->vma_list); ret = vfio_add_group_dev(&pdev->dev, &vfio_pci_ops, vdev); if (ret) { --- a/drivers/vfio/pci/vfio_pci_private.h +++ b/drivers/vfio/pci/vfio_pci_private.h @@ -63,6 +63,11 @@ struct vfio_pci_dummy_resource { struct list_head res_next; }; +struct vfio_pci_mmap_vma { + struct vm_area_struct *vma; + struct list_head vma_next; +}; + struct vfio_pci_device { struct pci_dev *pdev; void __iomem *barmap[PCI_STD_RESOURCE_END + 1]; @@ -95,6 +100,8 @@ struct vfio_pci_device { struct eventfd_ctx *err_trigger; struct eventfd_ctx *req_trigger; struct list_head dummy_resources_list; + struct mutex vma_lock; + struct list_head vma_list; }; #define is_intx(vdev) (vdev->irq_type == VFIO_PCI_INTX_IRQ_INDEX) From patchwork Fri Sep 11 12:46:59 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Greg KH X-Patchwork-Id: 264059 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-9.8 required=3.0 tests=BAYES_00,DKIMWL_WL_HIGH, DKIM_SIGNED, DKIM_VALID, HEADER_FROM_DIFFERENT_DOMAINS, MAILING_LIST_MULTI, SIGNED_OFF_BY,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED,USER_AGENT_GIT autolearn=unavailable autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id AE45FC433E2 for ; Fri, 11 Sep 2020 16:49:39 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 5BB8D221EB for ; Fri, 11 Sep 2020 16:49:39 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=default; t=1599842979; bh=UKYVEVaimp0SfSK4RgeMjo8yFt7Z+scShmhY1tEgFhw=; h=From:To:Cc:Subject:Date:In-Reply-To:References:List-ID:From; b=sEURWMHVhy1SKXPvEROTszkZBkF6ONq/PwJAbeUInuTmJO71pdiRj5aGooKobzFzH yd+ZsCq6VOurm5q9vhZwyKmJVmPJWEDosY1ziL+7oBPD8mwAoh0gFBrtckvEr/SLYo ZlZEcmAhosVHdkqQcS8KdrD4S4MjSq6TzUto1VTA= Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1726158AbgIKQtC (ORCPT ); Fri, 11 Sep 2020 12:49:02 -0400 Received: from mail.kernel.org ([198.145.29.99]:47882 "EHLO mail.kernel.org" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1726353AbgIKPG2 (ORCPT ); Fri, 11 Sep 2020 11:06:28 -0400 Received: from localhost (83-86-74-64.cable.dynamic.v4.ziggo.nl [83.86.74.64]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPSA id 736A422404; Fri, 11 Sep 2020 12:59:12 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=default; t=1599829153; bh=UKYVEVaimp0SfSK4RgeMjo8yFt7Z+scShmhY1tEgFhw=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=l45VyBEZ33uzdu+qI5zwgaePXCoPkVY++CEd91OENfMbyp/flh0AlBBMKu5APy/GP xZl88X482xuXgazTO+Gkx7gp66VBEekQYc2ksw46tmEnIDTn4/ulV0WBeQ6jrkqKSx mZzLzH0fzl40AwLke8C+4ezHKuJMqsqKP7j43xHE= From: Greg Kroah-Hartman To: linux-kernel@vger.kernel.org Cc: Greg Kroah-Hartman , stable@vger.kernel.org, Peter Xu , Alex Williamson , Ajay Kaher Subject: [PATCH 4.14 05/12] vfio-pci: Invalidate mmaps and block MMIO access on disabled memory Date: Fri, 11 Sep 2020 14:46:59 +0200 Message-Id: <20200911122458.688827871@linuxfoundation.org> X-Mailer: git-send-email 2.28.0 In-Reply-To: <20200911122458.413137406@linuxfoundation.org> References: <20200911122458.413137406@linuxfoundation.org> User-Agent: quilt/0.66 MIME-Version: 1.0 Sender: stable-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: stable@vger.kernel.org From: Alex Williamson commit abafbc551fddede3e0a08dee1dcde08fc0eb8476 upstream. Accessing the disabled memory space of a PCI device would typically result in a master abort response on conventional PCI, or an unsupported request on PCI express. The user would generally see these as a -1 response for the read return data and the write would be silently discarded, possibly with an uncorrected, non-fatal AER error triggered on the host. Some systems however take it upon themselves to bring down the entire system when they see something that might indicate a loss of data, such as this discarded write to a disabled memory space. To avoid this, we want to try to block the user from accessing memory spaces while they're disabled. We start with a semaphore around the memory enable bit, where writers modify the memory enable state and must be serialized, while readers make use of the memory region and can access in parallel. Writers include both direct manipulation via the command register, as well as any reset path where the internal mechanics of the reset may both explicitly and implicitly disable memory access, and manipulation of the MSI-X configuration, where the MSI-X vector table resides in MMIO space of the device. Readers include the read and write file ops to access the vfio device fd offsets as well as memory mapped access. In the latter case, we make use of our new vma list support to zap, or invalidate, those memory mappings in order to force them to be faulted back in on access. Our semaphore usage will stall user access to MMIO spaces across internal operations like reset, but the user might experience new behavior when trying to access the MMIO space while disabled via the PCI command register. Access via read or write while disabled will return -EIO and access via memory maps will result in a SIGBUS. This is expected to be compatible with known use cases and potentially provides better error handling capabilities than present in the hardware, while avoiding the more readily accessible and severe platform error responses that might otherwise occur. Fixes: CVE-2020-12888 Reviewed-by: Peter Xu Signed-off-by: Alex Williamson [Ajay: Regenerated the patch for v4.14] Signed-off-by: Ajay Kaher Signed-off-by: Greg Kroah-Hartman --- drivers/vfio/pci/vfio_pci.c | 294 +++++++++++++++++++++++++++++++----- drivers/vfio/pci/vfio_pci_config.c | 36 +++- drivers/vfio/pci/vfio_pci_intrs.c | 14 + drivers/vfio/pci/vfio_pci_private.h | 9 + drivers/vfio/pci/vfio_pci_rdwr.c | 29 ++- 5 files changed, 334 insertions(+), 48 deletions(-) --- a/drivers/vfio/pci/vfio_pci.c +++ b/drivers/vfio/pci/vfio_pci.c @@ -29,6 +29,7 @@ #include #include #include +#include #include "vfio_pci_private.h" @@ -181,6 +182,7 @@ no_mmap: static void vfio_pci_try_bus_reset(struct vfio_pci_device *vdev); static void vfio_pci_disable(struct vfio_pci_device *vdev); +static int vfio_pci_try_zap_and_vma_lock_cb(struct pci_dev *pdev, void *data); /* * INTx masking requires the ability to disable INTx signaling via PCI_COMMAND @@ -644,6 +646,12 @@ int vfio_pci_register_dev_region(struct return 0; } +struct vfio_devices { + struct vfio_device **devices; + int cur_index; + int max_index; +}; + static long vfio_pci_ioctl(void *device_data, unsigned int cmd, unsigned long arg) { @@ -717,7 +725,7 @@ static long vfio_pci_ioctl(void *device_ { void __iomem *io; size_t size; - u16 orig_cmd; + u16 cmd; info.offset = VFIO_PCI_INDEX_TO_OFFSET(info.index); info.flags = 0; @@ -737,10 +745,7 @@ static long vfio_pci_ioctl(void *device_ * Is it really there? Enable memory decode for * implicit access in pci_map_rom(). */ - pci_read_config_word(pdev, PCI_COMMAND, &orig_cmd); - pci_write_config_word(pdev, PCI_COMMAND, - orig_cmd | PCI_COMMAND_MEMORY); - + cmd = vfio_pci_memory_lock_and_enable(vdev); io = pci_map_rom(pdev, &size); if (io) { info.flags = VFIO_REGION_INFO_FLAG_READ; @@ -748,8 +753,8 @@ static long vfio_pci_ioctl(void *device_ } else { info.size = 0; } + vfio_pci_memory_unlock_and_restore(vdev, cmd); - pci_write_config_word(pdev, PCI_COMMAND, orig_cmd); break; } case VFIO_PCI_VGA_REGION_INDEX: @@ -885,8 +890,16 @@ static long vfio_pci_ioctl(void *device_ return ret; } else if (cmd == VFIO_DEVICE_RESET) { - return vdev->reset_works ? - pci_try_reset_function(vdev->pdev) : -EINVAL; + int ret; + + if (!vdev->reset_works) + return -EINVAL; + + vfio_pci_zap_and_down_write_memory_lock(vdev); + ret = pci_try_reset_function(vdev->pdev); + up_write(&vdev->memory_lock); + + return ret; } else if (cmd == VFIO_DEVICE_GET_PCI_HOT_RESET_INFO) { struct vfio_pci_hot_reset_info hdr; @@ -966,8 +979,9 @@ reset_info_exit: int32_t *group_fds; struct vfio_pci_group_entry *groups; struct vfio_pci_group_info info; + struct vfio_devices devs = { .cur_index = 0 }; bool slot = false; - int i, count = 0, ret = 0; + int i, group_idx, mem_idx = 0, count = 0, ret = 0; minsz = offsetofend(struct vfio_pci_hot_reset, count); @@ -1019,9 +1033,9 @@ reset_info_exit: * user interface and store the group and iommu ID. This * ensures the group is held across the reset. */ - for (i = 0; i < hdr.count; i++) { + for (group_idx = 0; group_idx < hdr.count; group_idx++) { struct vfio_group *group; - struct fd f = fdget(group_fds[i]); + struct fd f = fdget(group_fds[group_idx]); if (!f.file) { ret = -EBADF; break; @@ -1034,8 +1048,9 @@ reset_info_exit: break; } - groups[i].group = group; - groups[i].id = vfio_external_user_iommu_id(group); + groups[group_idx].group = group; + groups[group_idx].id = + vfio_external_user_iommu_id(group); } kfree(group_fds); @@ -1054,14 +1069,65 @@ reset_info_exit: ret = vfio_pci_for_each_slot_or_bus(vdev->pdev, vfio_pci_validate_devs, &info, slot); - if (!ret) - /* User has access, do the reset */ - ret = slot ? pci_try_reset_slot(vdev->pdev->slot) : - pci_try_reset_bus(vdev->pdev->bus); + + if (ret) + goto hot_reset_release; + + devs.max_index = count; + devs.devices = kcalloc(count, sizeof(struct vfio_device *), + GFP_KERNEL); + if (!devs.devices) { + ret = -ENOMEM; + goto hot_reset_release; + } + + /* + * We need to get memory_lock for each device, but devices + * can share mmap_sem, therefore we need to zap and hold + * the vma_lock for each device, and only then get each + * memory_lock. + */ + ret = vfio_pci_for_each_slot_or_bus(vdev->pdev, + vfio_pci_try_zap_and_vma_lock_cb, + &devs, slot); + if (ret) + goto hot_reset_release; + + for (; mem_idx < devs.cur_index; mem_idx++) { + struct vfio_pci_device *tmp; + + tmp = vfio_device_data(devs.devices[mem_idx]); + + ret = down_write_trylock(&tmp->memory_lock); + if (!ret) { + ret = -EBUSY; + goto hot_reset_release; + } + mutex_unlock(&tmp->vma_lock); + } + + /* User has access, do the reset */ + ret = slot ? pci_try_reset_slot(vdev->pdev->slot) : + pci_try_reset_bus(vdev->pdev->bus); hot_reset_release: - for (i--; i >= 0; i--) - vfio_group_put_external_user(groups[i].group); + for (i = 0; i < devs.cur_index; i++) { + struct vfio_device *device; + struct vfio_pci_device *tmp; + + device = devs.devices[i]; + tmp = vfio_device_data(device); + + if (i < mem_idx) + up_write(&tmp->memory_lock); + else + mutex_unlock(&tmp->vma_lock); + vfio_device_put(device); + } + kfree(devs.devices); + + for (group_idx--; group_idx >= 0; group_idx--) + vfio_group_put_external_user(groups[group_idx].group); kfree(groups); return ret; @@ -1120,8 +1186,126 @@ static ssize_t vfio_pci_write(void *devi return vfio_pci_rw(device_data, (char __user *)buf, count, ppos, true); } -static int vfio_pci_add_vma(struct vfio_pci_device *vdev, - struct vm_area_struct *vma) +/* Return 1 on zap and vma_lock acquired, 0 on contention (only with @try) */ +static int vfio_pci_zap_and_vma_lock(struct vfio_pci_device *vdev, bool try) +{ + struct vfio_pci_mmap_vma *mmap_vma, *tmp; + + /* + * Lock ordering: + * vma_lock is nested under mmap_sem for vm_ops callback paths. + * The memory_lock semaphore is used by both code paths calling + * into this function to zap vmas and the vm_ops.fault callback + * to protect the memory enable state of the device. + * + * When zapping vmas we need to maintain the mmap_sem => vma_lock + * ordering, which requires using vma_lock to walk vma_list to + * acquire an mm, then dropping vma_lock to get the mmap_sem and + * reacquiring vma_lock. This logic is derived from similar + * requirements in uverbs_user_mmap_disassociate(). + * + * mmap_sem must always be the top-level lock when it is taken. + * Therefore we can only hold the memory_lock write lock when + * vma_list is empty, as we'd need to take mmap_sem to clear + * entries. vma_list can only be guaranteed empty when holding + * vma_lock, thus memory_lock is nested under vma_lock. + * + * This enables the vm_ops.fault callback to acquire vma_lock, + * followed by memory_lock read lock, while already holding + * mmap_sem without risk of deadlock. + */ + while (1) { + struct mm_struct *mm = NULL; + + if (try) { + if (!mutex_trylock(&vdev->vma_lock)) + return 0; + } else { + mutex_lock(&vdev->vma_lock); + } + while (!list_empty(&vdev->vma_list)) { + mmap_vma = list_first_entry(&vdev->vma_list, + struct vfio_pci_mmap_vma, + vma_next); + mm = mmap_vma->vma->vm_mm; + if (mmget_not_zero(mm)) + break; + + list_del(&mmap_vma->vma_next); + kfree(mmap_vma); + mm = NULL; + } + if (!mm) + return 1; + mutex_unlock(&vdev->vma_lock); + + if (try) { + if (!down_read_trylock(&mm->mmap_sem)) { + mmput(mm); + return 0; + } + } else { + down_read(&mm->mmap_sem); + } + if (mmget_still_valid(mm)) { + if (try) { + if (!mutex_trylock(&vdev->vma_lock)) { + up_read(&mm->mmap_sem); + mmput(mm); + return 0; + } + } else { + mutex_lock(&vdev->vma_lock); + } + list_for_each_entry_safe(mmap_vma, tmp, + &vdev->vma_list, vma_next) { + struct vm_area_struct *vma = mmap_vma->vma; + + if (vma->vm_mm != mm) + continue; + + list_del(&mmap_vma->vma_next); + kfree(mmap_vma); + + zap_vma_ptes(vma, vma->vm_start, + vma->vm_end - vma->vm_start); + } + mutex_unlock(&vdev->vma_lock); + } + up_read(&mm->mmap_sem); + mmput(mm); + } +} + +void vfio_pci_zap_and_down_write_memory_lock(struct vfio_pci_device *vdev) +{ + vfio_pci_zap_and_vma_lock(vdev, false); + down_write(&vdev->memory_lock); + mutex_unlock(&vdev->vma_lock); +} + +u16 vfio_pci_memory_lock_and_enable(struct vfio_pci_device *vdev) +{ + u16 cmd; + + down_write(&vdev->memory_lock); + pci_read_config_word(vdev->pdev, PCI_COMMAND, &cmd); + if (!(cmd & PCI_COMMAND_MEMORY)) + pci_write_config_word(vdev->pdev, PCI_COMMAND, + cmd | PCI_COMMAND_MEMORY); + + return cmd; +} + +void vfio_pci_memory_unlock_and_restore(struct vfio_pci_device *vdev, u16 cmd) +{ + pci_write_config_word(vdev->pdev, PCI_COMMAND, cmd); + up_write(&vdev->memory_lock); +} + +/* Caller holds vma_lock */ +static int __vfio_pci_add_vma(struct vfio_pci_device *vdev, + struct vm_area_struct *vma) { struct vfio_pci_mmap_vma *mmap_vma; @@ -1130,10 +1314,7 @@ static int vfio_pci_add_vma(struct vfio_ return -ENOMEM; mmap_vma->vma = vma; - - mutex_lock(&vdev->vma_lock); list_add(&mmap_vma->vma_next, &vdev->vma_list); - mutex_unlock(&vdev->vma_lock); return 0; } @@ -1167,15 +1348,32 @@ static int vfio_pci_mmap_fault(struct vm { struct vm_area_struct *vma = vmf->vma; struct vfio_pci_device *vdev = vma->vm_private_data; + int ret = VM_FAULT_NOPAGE; + + mutex_lock(&vdev->vma_lock); + down_read(&vdev->memory_lock); + + if (!__vfio_pci_memory_enabled(vdev)) { + ret = VM_FAULT_SIGBUS; + mutex_unlock(&vdev->vma_lock); + goto up_out; + } + + if (__vfio_pci_add_vma(vdev, vma)) { + ret = VM_FAULT_OOM; + mutex_unlock(&vdev->vma_lock); + goto up_out; + } - if (vfio_pci_add_vma(vdev, vma)) - return VM_FAULT_OOM; + mutex_unlock(&vdev->vma_lock); if (remap_pfn_range(vma, vma->vm_start, vma->vm_pgoff, vma->vm_end - vma->vm_start, vma->vm_page_prot)) - return VM_FAULT_SIGBUS; + ret = VM_FAULT_SIGBUS; - return VM_FAULT_NOPAGE; +up_out: + up_read(&vdev->memory_lock); + return ret; } static const struct vm_operations_struct vfio_pci_mmap_ops = { @@ -1315,6 +1513,7 @@ static int vfio_pci_probe(struct pci_dev spin_lock_init(&vdev->irqlock); mutex_init(&vdev->vma_lock); INIT_LIST_HEAD(&vdev->vma_list); + init_rwsem(&vdev->memory_lock); ret = vfio_add_group_dev(&pdev->dev, &vfio_pci_ops, vdev); if (ret) { @@ -1409,12 +1608,6 @@ static struct pci_driver vfio_pci_driver .err_handler = &vfio_err_handlers, }; -struct vfio_devices { - struct vfio_device **devices; - int cur_index; - int max_index; -}; - static int vfio_pci_get_devs(struct pci_dev *pdev, void *data) { struct vfio_devices *devs = data; @@ -1431,6 +1624,39 @@ static int vfio_pci_get_devs(struct pci_ vfio_device_put(device); return -EBUSY; } + + devs->devices[devs->cur_index++] = device; + return 0; +} + +static int vfio_pci_try_zap_and_vma_lock_cb(struct pci_dev *pdev, void *data) +{ + struct vfio_devices *devs = data; + struct vfio_device *device; + struct vfio_pci_device *vdev; + + if (devs->cur_index == devs->max_index) + return -ENOSPC; + + device = vfio_device_get_from_dev(&pdev->dev); + if (!device) + return -EINVAL; + + if (pci_dev_driver(pdev) != &vfio_pci_driver) { + vfio_device_put(device); + return -EBUSY; + } + + vdev = vfio_device_data(device); + + /* + * Locking multiple devices is prone to deadlock, runaway and + * unwind if we hit contention. + */ + if (!vfio_pci_zap_and_vma_lock(vdev, true)) { + vfio_device_put(device); + return -EBUSY; + } devs->devices[devs->cur_index++] = device; return 0; --- a/drivers/vfio/pci/vfio_pci_config.c +++ b/drivers/vfio/pci/vfio_pci_config.c @@ -398,6 +398,14 @@ static inline void p_setd(struct perm_bi *(__le32 *)(&p->write[off]) = cpu_to_le32(write); } +/* Caller should hold memory_lock semaphore */ +bool __vfio_pci_memory_enabled(struct vfio_pci_device *vdev) +{ + u16 cmd = le16_to_cpu(*(__le16 *)&vdev->vconfig[PCI_COMMAND]); + + return cmd & PCI_COMMAND_MEMORY; +} + /* * Restore the *real* BARs after we detect a FLR or backdoor reset. * (backdoor = some device specific technique that we didn't catch) @@ -558,13 +566,18 @@ static int vfio_basic_config_write(struc new_cmd = le32_to_cpu(val); + phys_io = !!(phys_cmd & PCI_COMMAND_IO); + virt_io = !!(le16_to_cpu(*virt_cmd) & PCI_COMMAND_IO); + new_io = !!(new_cmd & PCI_COMMAND_IO); + phys_mem = !!(phys_cmd & PCI_COMMAND_MEMORY); virt_mem = !!(le16_to_cpu(*virt_cmd) & PCI_COMMAND_MEMORY); new_mem = !!(new_cmd & PCI_COMMAND_MEMORY); - phys_io = !!(phys_cmd & PCI_COMMAND_IO); - virt_io = !!(le16_to_cpu(*virt_cmd) & PCI_COMMAND_IO); - new_io = !!(new_cmd & PCI_COMMAND_IO); + if (!new_mem) + vfio_pci_zap_and_down_write_memory_lock(vdev); + else + down_write(&vdev->memory_lock); /* * If the user is writing mem/io enable (new_mem/io) and we @@ -581,8 +594,11 @@ static int vfio_basic_config_write(struc } count = vfio_default_config_write(vdev, pos, count, perm, offset, val); - if (count < 0) + if (count < 0) { + if (offset == PCI_COMMAND) + up_write(&vdev->memory_lock); return count; + } /* * Save current memory/io enable bits in vconfig to allow for @@ -593,6 +609,8 @@ static int vfio_basic_config_write(struc *virt_cmd &= cpu_to_le16(~mask); *virt_cmd |= cpu_to_le16(new_cmd & mask); + + up_write(&vdev->memory_lock); } /* Emulate INTx disable */ @@ -830,8 +848,11 @@ static int vfio_exp_config_write(struct pos - offset + PCI_EXP_DEVCAP, &cap); - if (!ret && (cap & PCI_EXP_DEVCAP_FLR)) + if (!ret && (cap & PCI_EXP_DEVCAP_FLR)) { + vfio_pci_zap_and_down_write_memory_lock(vdev); pci_try_reset_function(vdev->pdev); + up_write(&vdev->memory_lock); + } } /* @@ -909,8 +930,11 @@ static int vfio_af_config_write(struct v pos - offset + PCI_AF_CAP, &cap); - if (!ret && (cap & PCI_AF_CAP_FLR) && (cap & PCI_AF_CAP_TP)) + if (!ret && (cap & PCI_AF_CAP_FLR) && (cap & PCI_AF_CAP_TP)) { + vfio_pci_zap_and_down_write_memory_lock(vdev); pci_try_reset_function(vdev->pdev); + up_write(&vdev->memory_lock); + } } return count; --- a/drivers/vfio/pci/vfio_pci_intrs.c +++ b/drivers/vfio/pci/vfio_pci_intrs.c @@ -252,6 +252,7 @@ static int vfio_msi_enable(struct vfio_p struct pci_dev *pdev = vdev->pdev; unsigned int flag = msix ? PCI_IRQ_MSIX : PCI_IRQ_MSI; int ret; + u16 cmd; if (!is_irq_none(vdev)) return -EINVAL; @@ -261,13 +262,16 @@ static int vfio_msi_enable(struct vfio_p return -ENOMEM; /* return the number of supported vectors if we can't get all: */ + cmd = vfio_pci_memory_lock_and_enable(vdev); ret = pci_alloc_irq_vectors(pdev, 1, nvec, flag); if (ret < nvec) { if (ret > 0) pci_free_irq_vectors(pdev); + vfio_pci_memory_unlock_and_restore(vdev, cmd); kfree(vdev->ctx); return ret; } + vfio_pci_memory_unlock_and_restore(vdev, cmd); vdev->num_ctx = nvec; vdev->irq_type = msix ? VFIO_PCI_MSIX_IRQ_INDEX : @@ -290,6 +294,7 @@ static int vfio_msi_set_vector_signal(st struct pci_dev *pdev = vdev->pdev; struct eventfd_ctx *trigger; int irq, ret; + u16 cmd; if (vector < 0 || vector >= vdev->num_ctx) return -EINVAL; @@ -298,7 +303,11 @@ static int vfio_msi_set_vector_signal(st if (vdev->ctx[vector].trigger) { irq_bypass_unregister_producer(&vdev->ctx[vector].producer); + + cmd = vfio_pci_memory_lock_and_enable(vdev); free_irq(irq, vdev->ctx[vector].trigger); + vfio_pci_memory_unlock_and_restore(vdev, cmd); + kfree(vdev->ctx[vector].name); eventfd_ctx_put(vdev->ctx[vector].trigger); vdev->ctx[vector].trigger = NULL; @@ -326,6 +335,7 @@ static int vfio_msi_set_vector_signal(st * such a reset it would be unsuccessful. To avoid this, restore the * cached value of the message prior to enabling. */ + cmd = vfio_pci_memory_lock_and_enable(vdev); if (msix) { struct msi_msg msg; @@ -335,6 +345,7 @@ static int vfio_msi_set_vector_signal(st ret = request_irq(irq, vfio_msihandler, 0, vdev->ctx[vector].name, trigger); + vfio_pci_memory_unlock_and_restore(vdev, cmd); if (ret) { kfree(vdev->ctx[vector].name); eventfd_ctx_put(trigger); @@ -379,6 +390,7 @@ static void vfio_msi_disable(struct vfio { struct pci_dev *pdev = vdev->pdev; int i; + u16 cmd; for (i = 0; i < vdev->num_ctx; i++) { vfio_virqfd_disable(&vdev->ctx[i].unmask); @@ -387,7 +399,9 @@ static void vfio_msi_disable(struct vfio vfio_msi_set_block(vdev, 0, vdev->num_ctx, NULL, msix); + cmd = vfio_pci_memory_lock_and_enable(vdev); pci_free_irq_vectors(pdev); + vfio_pci_memory_unlock_and_restore(vdev, cmd); /* * Both disable paths above use pci_intx_for_msi() to clear DisINTx --- a/drivers/vfio/pci/vfio_pci_private.h +++ b/drivers/vfio/pci/vfio_pci_private.h @@ -102,6 +102,7 @@ struct vfio_pci_device { struct list_head dummy_resources_list; struct mutex vma_lock; struct list_head vma_list; + struct rw_semaphore memory_lock; }; #define is_intx(vdev) (vdev->irq_type == VFIO_PCI_INTX_IRQ_INDEX) @@ -137,6 +138,14 @@ extern int vfio_pci_register_dev_region( unsigned int type, unsigned int subtype, const struct vfio_pci_regops *ops, size_t size, u32 flags, void *data); + +extern bool __vfio_pci_memory_enabled(struct vfio_pci_device *vdev); +extern void vfio_pci_zap_and_down_write_memory_lock(struct vfio_pci_device + *vdev); +extern u16 vfio_pci_memory_lock_and_enable(struct vfio_pci_device *vdev); +extern void vfio_pci_memory_unlock_and_restore(struct vfio_pci_device *vdev, + u16 cmd); + #ifdef CONFIG_VFIO_PCI_IGD extern int vfio_pci_igd_init(struct vfio_pci_device *vdev); #else --- a/drivers/vfio/pci/vfio_pci_rdwr.c +++ b/drivers/vfio/pci/vfio_pci_rdwr.c @@ -122,6 +122,7 @@ ssize_t vfio_pci_bar_rw(struct vfio_pci_ size_t x_start = 0, x_end = 0; resource_size_t end; void __iomem *io; + struct resource *res = &vdev->pdev->resource[bar]; ssize_t done; if (pci_resource_start(pdev, bar)) @@ -137,6 +138,14 @@ ssize_t vfio_pci_bar_rw(struct vfio_pci_ count = min(count, (size_t)(end - pos)); + if (res->flags & IORESOURCE_MEM) { + down_read(&vdev->memory_lock); + if (!__vfio_pci_memory_enabled(vdev)) { + up_read(&vdev->memory_lock); + return -EIO; + } + } + if (bar == PCI_ROM_RESOURCE) { /* * The ROM can fill less space than the BAR, so we start the @@ -144,20 +153,21 @@ ssize_t vfio_pci_bar_rw(struct vfio_pci_ * filling large ROM BARs much faster. */ io = pci_map_rom(pdev, &x_start); - if (!io) - return -ENOMEM; + if (!io) { + done = -ENOMEM; + goto out; + } x_end = end; } else if (!vdev->barmap[bar]) { - int ret; - - ret = pci_request_selected_regions(pdev, 1 << bar, "vfio"); - if (ret) - return ret; + done = pci_request_selected_regions(pdev, 1 << bar, "vfio"); + if (done) + goto out; io = pci_iomap(pdev, bar, 0); if (!io) { pci_release_selected_regions(pdev, 1 << bar); - return -ENOMEM; + done = -ENOMEM; + goto out; } vdev->barmap[bar] = io; @@ -176,6 +186,9 @@ ssize_t vfio_pci_bar_rw(struct vfio_pci_ if (bar == PCI_ROM_RESOURCE) pci_unmap_rom(pdev, io); +out: + if (res->flags & IORESOURCE_MEM) + up_read(&vdev->memory_lock); return done; } From patchwork Fri Sep 11 12:47:00 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Greg KH X-Patchwork-Id: 264057 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-9.8 required=3.0 tests=BAYES_00,DKIMWL_WL_HIGH, DKIM_SIGNED, DKIM_VALID, HEADER_FROM_DIFFERENT_DOMAINS, MAILING_LIST_MULTI, SIGNED_OFF_BY,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED,USER_AGENT_GIT autolearn=unavailable autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id CF93FC43461 for ; Fri, 11 Sep 2020 16:57:42 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 961A22074B for ; Fri, 11 Sep 2020 16:57:42 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=default; t=1599843462; bh=Z9D407PuPDjKdFeYg8ua1Pb572B3W8AuPKajpC6UwCM=; h=From:To:Cc:Subject:Date:In-Reply-To:References:List-ID:From; b=Ti+FlCo5JeJOqy14cC3mmwtUHzsci80UvAnoRtbk47NcjMjEJV+AtQMb38rlWqKn2 BrylirvtElZrv0o6aud01p20ZYpQlZJ3IexlFZlcotrd6wLVBTGE6Wfl6PFEHXpqBs 0DyepZyHXE6iGPirtxcL/6S8JtuvQ7T9aIz2OsDs= Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1725913AbgIKQ5j (ORCPT ); Fri, 11 Sep 2020 12:57:39 -0400 Received: from mail.kernel.org ([198.145.29.99]:46952 "EHLO mail.kernel.org" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1726322AbgIKPFa (ORCPT ); Fri, 11 Sep 2020 11:05:30 -0400 Received: from localhost (83-86-74-64.cable.dynamic.v4.ziggo.nl [83.86.74.64]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPSA id 2AA482242C; Fri, 11 Sep 2020 12:59:15 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=default; t=1599829155; bh=Z9D407PuPDjKdFeYg8ua1Pb572B3W8AuPKajpC6UwCM=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=huEQ0yWxgXA5dl/DDwnfHdO5mx6prQbaoJ86tBN947N07ETgcsAJBMaNrWhJDqC5L 5ksTwvay2SEz2M7zYWQXu06XAjJLhz9bJhvZqh1khA7zKJcGqvTVVmoB53qW/8QSt0 werPm0bvy1KCaC8PEWyWAE/VMaBwASoVNsj52Jjk= From: Greg Kroah-Hartman To: linux-kernel@vger.kernel.org Cc: Greg Kroah-Hartman , stable@vger.kernel.org, Alex Williamson Subject: [PATCH 4.14 06/12] vfio/pci: Fix SR-IOV VF handling with MMIO blocking Date: Fri, 11 Sep 2020 14:47:00 +0200 Message-Id: <20200911122458.729118580@linuxfoundation.org> X-Mailer: git-send-email 2.28.0 In-Reply-To: <20200911122458.413137406@linuxfoundation.org> References: <20200911122458.413137406@linuxfoundation.org> User-Agent: quilt/0.66 MIME-Version: 1.0 Sender: stable-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: stable@vger.kernel.org From: Alex Williamson commit ebfa440ce38b7e2e04c3124aa89c8a9f4094cf21 upstream. SR-IOV VFs do not implement the memory enable bit of the command register, therefore this bit is not set in config space after pci_enable_device(). This leads to an unintended difference between PF and VF in hand-off state to the user. We can correct this by setting the initial value of the memory enable bit in our virtualized config space. There's really no need however to ever fault a user on a VF though as this would only indicate an error in the user's management of the enable bit, versus a PF where the same access could trigger hardware faults. Fixes: abafbc551fdd ("vfio-pci: Invalidate mmaps and block MMIO access on disabled memory") Signed-off-by: Alex Williamson Signed-off-by: Greg Kroah-Hartman --- drivers/vfio/pci/vfio_pci_config.c | 17 ++++++++++++++++- 1 file changed, 16 insertions(+), 1 deletion(-) --- a/drivers/vfio/pci/vfio_pci_config.c +++ b/drivers/vfio/pci/vfio_pci_config.c @@ -401,9 +401,15 @@ static inline void p_setd(struct perm_bi /* Caller should hold memory_lock semaphore */ bool __vfio_pci_memory_enabled(struct vfio_pci_device *vdev) { + struct pci_dev *pdev = vdev->pdev; u16 cmd = le16_to_cpu(*(__le16 *)&vdev->vconfig[PCI_COMMAND]); - return cmd & PCI_COMMAND_MEMORY; + /* + * SR-IOV VF memory enable is handled by the MSE bit in the + * PF SR-IOV capability, there's therefore no need to trigger + * faults based on the virtual value. + */ + return pdev->is_virtfn || (cmd & PCI_COMMAND_MEMORY); } /* @@ -1732,6 +1738,15 @@ int vfio_config_init(struct vfio_pci_dev vconfig[PCI_INTERRUPT_PIN]); vconfig[PCI_INTERRUPT_PIN] = 0; /* Gratuitous for good VFs */ + + /* + * VFs do no implement the memory enable bit of the COMMAND + * register therefore we'll not have it set in our initial + * copy of config space after pci_enable_device(). For + * consistency with PFs, set the virtual enable bit here. + */ + *(__le16 *)&vconfig[PCI_COMMAND] |= + cpu_to_le16(PCI_COMMAND_MEMORY); } if (!IS_ENABLED(CONFIG_VFIO_PCI_INTX) || vdev->nointx) From patchwork Fri Sep 11 12:47:01 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Greg KH X-Patchwork-Id: 264065 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-9.8 required=3.0 tests=BAYES_00,DKIMWL_WL_HIGH, DKIM_SIGNED, DKIM_VALID, HEADER_FROM_DIFFERENT_DOMAINS, MAILING_LIST_MULTI, SIGNED_OFF_BY,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED,USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 306DFC43461 for ; Fri, 11 Sep 2020 16:43:22 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id D1475221EB for ; Fri, 11 Sep 2020 16:43:21 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=default; t=1599842601; bh=ruSye/Xa0SxJovHShAZlpdR9ElD2kkZvl22+l4KTbsk=; h=From:To:Cc:Subject:Date:In-Reply-To:References:List-ID:From; b=OiMFwCboC4L1ymZ8PEXlHq/YjIhlfNvWyeOUbEp38XrxBU7/sSc0DX0/vEIrgBRa1 df/9zVP+HEuv9JRjenaCxkXyEEnA1mIKYKEMMjo2yAvdxn9K1SqeaFbn0dVSBL3zfY xipEXqtFNqpf9GT4M1FiWqdZJaT0qS2WiTDWZkAg= Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1726394AbgIKQnT (ORCPT ); Fri, 11 Sep 2020 12:43:19 -0400 Received: from mail.kernel.org ([198.145.29.99]:47884 "EHLO mail.kernel.org" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1726392AbgIKPI2 (ORCPT ); Fri, 11 Sep 2020 11:08:28 -0400 Received: from localhost (83-86-74-64.cable.dynamic.v4.ziggo.nl [83.86.74.64]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPSA id 896222242E; Fri, 11 Sep 2020 12:59:17 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=default; t=1599829158; bh=ruSye/Xa0SxJovHShAZlpdR9ElD2kkZvl22+l4KTbsk=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=HPqx8QNuxcROx0RNbeWT5Jj6Dt05mZtao/0Ba5CAcxuGxwxULRfmI2vznE8DX8A3y nqJUF21D5HjtI7X9tUI0XXs6ZakUfdTCluF8dVwvft2upnMQbla4P7rXBMrUAkdyDr 5HmIKXmhvG8kIlpGJf3W7Scqn0ILwz/3xrJev2dY= From: Greg Kroah-Hartman To: linux-kernel@vger.kernel.org Cc: Greg Kroah-Hartman , stable@vger.kernel.org, Rob Sherwood , Jakub Kicinski , Michael Chan , "David S. Miller" Subject: [PATCH 4.14 07/12] bnxt: dont enable NAPI until rings are ready Date: Fri, 11 Sep 2020 14:47:01 +0200 Message-Id: <20200911122458.776047210@linuxfoundation.org> X-Mailer: git-send-email 2.28.0 In-Reply-To: <20200911122458.413137406@linuxfoundation.org> References: <20200911122458.413137406@linuxfoundation.org> User-Agent: quilt/0.66 MIME-Version: 1.0 Sender: stable-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: stable@vger.kernel.org From: Jakub Kicinski commit 96ecdcc992eb7f468b2cf829b0f5408a1fad4668 upstream. Netpoll can try to poll napi as soon as napi_enable() is called. It crashes trying to access a doorbell which is still NULL: BUG: kernel NULL pointer dereference, address: 0000000000000000 CPU: 59 PID: 6039 Comm: ethtool Kdump: loaded Tainted: G S 5.9.0-rc1-00469-g5fd99b5d9950-dirty #26 RIP: 0010:bnxt_poll+0x121/0x1c0 Code: c4 20 44 89 e0 5b 5d 41 5c 41 5d 41 5e 41 5f c3 41 8b 86 a0 01 00 00 41 23 85 18 01 00 00 49 8b 96 a8 01 00 00 0d 00 00 00 24 <89> 02 41 f6 45 77 02 74 cb 49 8b ae d8 01 00 00 31 c0 c7 44 24 1a netpoll_poll_dev+0xbd/0x1a0 __netpoll_send_skb+0x1b2/0x210 netpoll_send_udp+0x2c9/0x406 write_ext_msg+0x1d7/0x1f0 console_unlock+0x23c/0x520 vprintk_emit+0xe0/0x1d0 printk+0x58/0x6f x86_vector_activate.cold+0xf/0x46 __irq_domain_activate_irq+0x50/0x80 __irq_domain_activate_irq+0x32/0x80 __irq_domain_activate_irq+0x32/0x80 irq_domain_activate_irq+0x25/0x40 __setup_irq+0x2d2/0x700 request_threaded_irq+0xfb/0x160 __bnxt_open_nic+0x3b1/0x750 bnxt_open_nic+0x19/0x30 ethtool_set_channels+0x1ac/0x220 dev_ethtool+0x11ba/0x2240 dev_ioctl+0x1cf/0x390 sock_do_ioctl+0x95/0x130 Reported-by: Rob Sherwood Fixes: c0c050c58d84 ("bnxt_en: New Broadcom ethernet driver.") Signed-off-by: Jakub Kicinski Reviewed-by: Michael Chan Signed-off-by: David S. Miller Signed-off-by: Greg Kroah-Hartman --- drivers/net/ethernet/broadcom/bnxt/bnxt.c | 9 +++------ 1 file changed, 3 insertions(+), 6 deletions(-) --- a/drivers/net/ethernet/broadcom/bnxt/bnxt.c +++ b/drivers/net/ethernet/broadcom/bnxt/bnxt.c @@ -6378,14 +6378,14 @@ static int __bnxt_open_nic(struct bnxt * } } - bnxt_enable_napi(bp); - rc = bnxt_init_nic(bp, irq_re_init); if (rc) { netdev_err(bp->dev, "bnxt_init_nic err: %x\n", rc); - goto open_err; + goto open_err_irq; } + bnxt_enable_napi(bp); + if (link_re_init) { mutex_lock(&bp->link_lock); rc = bnxt_update_phy_setting(bp); @@ -6410,9 +6410,6 @@ static int __bnxt_open_nic(struct bnxt * bnxt_vf_reps_open(bp); return 0; -open_err: - bnxt_disable_napi(bp); - open_err_irq: bnxt_del_napi(bp); From patchwork Fri Sep 11 12:47:02 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Greg KH X-Patchwork-Id: 309807 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-9.8 required=3.0 tests=BAYES_00,DKIMWL_WL_HIGH, DKIM_SIGNED, DKIM_VALID, HEADER_FROM_DIFFERENT_DOMAINS, MAILING_LIST_MULTI, SIGNED_OFF_BY,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED,USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 46D07C433E2 for ; Fri, 11 Sep 2020 16:46:30 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id EE01C2220A for ; Fri, 11 Sep 2020 16:46:29 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=default; t=1599842790; bh=QLxLHnsqBMRfkyabux2IZ0Ir8juf+/6Ln1bN5JXHj00=; h=From:To:Cc:Subject:Date:In-Reply-To:References:List-ID:From; b=Ao5Zvw1CMtkbvJVZwJQgQBzhrgxjw5ohW3lpaOouNi9fXJs+g5afe114RhpPqjiVg jysRTKJaatRlb+TeLpNnpmNXzxI56GmQWDkT82ZvVttLfI3oFr/J3Va0sCgoTDYzpq Dc3x2trKhZw/y4xXiChyql8bPd/54E1QjHxKE6Ps= Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1726369AbgIKQq0 (ORCPT ); Fri, 11 Sep 2020 12:46:26 -0400 Received: from mail.kernel.org ([198.145.29.99]:49230 "EHLO mail.kernel.org" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1725793AbgIKPIE (ORCPT ); Fri, 11 Sep 2020 11:08:04 -0400 Received: from localhost (83-86-74-64.cable.dynamic.v4.ziggo.nl [83.86.74.64]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPSA id 0A4F022447; Fri, 11 Sep 2020 12:59:19 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=default; t=1599829160; bh=QLxLHnsqBMRfkyabux2IZ0Ir8juf+/6Ln1bN5JXHj00=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=Fjk/v3QzVJF+G3z8jALmIbaaQ27UD/LRSnuvSe4K9bepTCnS0lBvTmwzi2t42+MEd OVwDbBu+fqhhMbEzQ+IqUGJImf5q+yZasWEWaEZeWMIQ5a8q2znmp90Rd98RTUPIn5 Dd9p2pGQ44NJm7L1H1rP8KPA9bTKh5fD6lzM6GeI= From: Greg Kroah-Hartman To: linux-kernel@vger.kernel.org Cc: Greg Kroah-Hartman , stable@vger.kernel.org, Stephen Smalley , Paul Moore , "David S. Miller" Subject: [PATCH 4.14 08/12] netlabel: fix problems with mapping removal Date: Fri, 11 Sep 2020 14:47:02 +0200 Message-Id: <20200911122458.825825948@linuxfoundation.org> X-Mailer: git-send-email 2.28.0 In-Reply-To: <20200911122458.413137406@linuxfoundation.org> References: <20200911122458.413137406@linuxfoundation.org> User-Agent: quilt/0.66 MIME-Version: 1.0 Sender: stable-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: stable@vger.kernel.org From: Paul Moore [ Upstream commit d3b990b7f327e2afa98006e7666fb8ada8ed8683 ] This patch fixes two main problems seen when removing NetLabel mappings: memory leaks and potentially extra audit noise. The memory leaks are caused by not properly free'ing the mapping's address selector struct when free'ing the entire entry as well as not properly cleaning up a temporary mapping entry when adding new address selectors to an existing entry. This patch fixes both these problems such that kmemleak reports no NetLabel associated leaks after running the SELinux test suite. The potentially extra audit noise was caused by the auditing code in netlbl_domhsh_remove_entry() being called regardless of the entry's validity. If another thread had already marked the entry as invalid, but not removed/free'd it from the list of mappings, then it was possible that an additional mapping removal audit record would be generated. This patch fixes this by returning early from the removal function when the entry was previously marked invalid. This change also had the side benefit of improving the code by decreasing the indentation level of large chunk of code by one (accounting for most of the diffstat). Fixes: 63c416887437 ("netlabel: Add network address selectors to the NetLabel/LSM domain mapping") Reported-by: Stephen Smalley Signed-off-by: Paul Moore Signed-off-by: David S. Miller Signed-off-by: Greg Kroah-Hartman --- net/netlabel/netlabel_domainhash.c | 59 ++++++++++++++++++------------------- 1 file changed, 30 insertions(+), 29 deletions(-) --- a/net/netlabel/netlabel_domainhash.c +++ b/net/netlabel/netlabel_domainhash.c @@ -99,6 +99,7 @@ static void netlbl_domhsh_free_entry(str kfree(netlbl_domhsh_addr6_entry(iter6)); } #endif /* IPv6 */ + kfree(ptr->def.addrsel); } kfree(ptr->domain); kfree(ptr); @@ -550,6 +551,8 @@ int netlbl_domhsh_add(struct netlbl_dom_ goto add_return; } #endif /* IPv6 */ + /* cleanup the new entry since we've moved everything over */ + netlbl_domhsh_free_entry(&entry->rcu); } else ret_val = -EINVAL; @@ -593,6 +596,12 @@ int netlbl_domhsh_remove_entry(struct ne { int ret_val = 0; struct audit_buffer *audit_buf; + struct netlbl_af4list *iter4; + struct netlbl_domaddr4_map *map4; +#if IS_ENABLED(CONFIG_IPV6) + struct netlbl_af6list *iter6; + struct netlbl_domaddr6_map *map6; +#endif /* IPv6 */ if (entry == NULL) return -ENOENT; @@ -610,6 +619,9 @@ int netlbl_domhsh_remove_entry(struct ne ret_val = -ENOENT; spin_unlock(&netlbl_domhsh_lock); + if (ret_val) + return ret_val; + audit_buf = netlbl_audit_start_common(AUDIT_MAC_MAP_DEL, audit_info); if (audit_buf != NULL) { audit_log_format(audit_buf, @@ -619,40 +631,29 @@ int netlbl_domhsh_remove_entry(struct ne audit_log_end(audit_buf); } - if (ret_val == 0) { - struct netlbl_af4list *iter4; - struct netlbl_domaddr4_map *map4; -#if IS_ENABLED(CONFIG_IPV6) - struct netlbl_af6list *iter6; - struct netlbl_domaddr6_map *map6; -#endif /* IPv6 */ - - switch (entry->def.type) { - case NETLBL_NLTYPE_ADDRSELECT: - netlbl_af4list_foreach_rcu(iter4, - &entry->def.addrsel->list4) { - map4 = netlbl_domhsh_addr4_entry(iter4); - cipso_v4_doi_putdef(map4->def.cipso); - } + switch (entry->def.type) { + case NETLBL_NLTYPE_ADDRSELECT: + netlbl_af4list_foreach_rcu(iter4, &entry->def.addrsel->list4) { + map4 = netlbl_domhsh_addr4_entry(iter4); + cipso_v4_doi_putdef(map4->def.cipso); + } #if IS_ENABLED(CONFIG_IPV6) - netlbl_af6list_foreach_rcu(iter6, - &entry->def.addrsel->list6) { - map6 = netlbl_domhsh_addr6_entry(iter6); - calipso_doi_putdef(map6->def.calipso); - } + netlbl_af6list_foreach_rcu(iter6, &entry->def.addrsel->list6) { + map6 = netlbl_domhsh_addr6_entry(iter6); + calipso_doi_putdef(map6->def.calipso); + } #endif /* IPv6 */ - break; - case NETLBL_NLTYPE_CIPSOV4: - cipso_v4_doi_putdef(entry->def.cipso); - break; + break; + case NETLBL_NLTYPE_CIPSOV4: + cipso_v4_doi_putdef(entry->def.cipso); + break; #if IS_ENABLED(CONFIG_IPV6) - case NETLBL_NLTYPE_CALIPSO: - calipso_doi_putdef(entry->def.calipso); - break; + case NETLBL_NLTYPE_CALIPSO: + calipso_doi_putdef(entry->def.calipso); + break; #endif /* IPv6 */ - } - call_rcu(&entry->rcu, netlbl_domhsh_free_entry); } + call_rcu(&entry->rcu, netlbl_domhsh_free_entry); return ret_val; } From patchwork Fri Sep 11 12:47:03 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Greg KH X-Patchwork-Id: 309803 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-9.8 required=3.0 tests=BAYES_00,DKIMWL_WL_HIGH, DKIM_SIGNED, DKIM_VALID, HEADER_FROM_DIFFERENT_DOMAINS, MAILING_LIST_MULTI, SIGNED_OFF_BY,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED,USER_AGENT_GIT autolearn=unavailable autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id C1582C433E2 for ; Fri, 11 Sep 2020 16:49:03 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 6BD0A221E7 for ; Fri, 11 Sep 2020 16:49:03 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=default; t=1599842943; bh=EHkMk8FZTwQ6Tit8Vxcyh+qja6txSu8AoOFPongUJ5E=; h=From:To:Cc:Subject:Date:In-Reply-To:References:List-ID:From; b=WnAbQXVNFjV6/oRlzbtHGRmMfeyyYJ3I7lB3GXJepHgWQ+Zgf6vXn74BzpyttoK7H ljRnNOF7CLi/TBhZSMPg5X6T7CECx3QKtDV0iOO6v0ZaO70Tdk8DmT/63lnfAKz3eE vcvXLwDZcZWsX1skMsq/6Si8h42AArezIEHJ63MI= Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1726271AbgIKQtC (ORCPT ); Fri, 11 Sep 2020 12:49:02 -0400 Received: from mail.kernel.org ([198.145.29.99]:47884 "EHLO mail.kernel.org" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1726352AbgIKPG2 (ORCPT ); Fri, 11 Sep 2020 11:06:28 -0400 Received: from localhost (83-86-74-64.cable.dynamic.v4.ziggo.nl [83.86.74.64]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPSA id 5626F2242F; Fri, 11 Sep 2020 12:59:22 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=default; t=1599829162; bh=EHkMk8FZTwQ6Tit8Vxcyh+qja6txSu8AoOFPongUJ5E=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=T/3vDmqfCX4DEyPIm3oLH/FicVDvw6MQPJvm0rnZfYTAOdx4ec2/r4krQB4HUfaiR IfmTCxX65S16NlBuki4pPM03D0wsp7hgavGxHz61QWsVn1fXjKtZcfelz/ianagyn5 xew6eG90JFZacfS1L+bhAzls1/qpgf5YcWPe7W3I= From: Greg Kroah-Hartman To: linux-kernel@vger.kernel.org Cc: Greg Kroah-Hartman , stable@vger.kernel.org, Kamil Lorenc , "David S. Miller" Subject: [PATCH 4.14 09/12] net: usb: dm9601: Add USB ID of Keenetic Plus DSL Date: Fri, 11 Sep 2020 14:47:03 +0200 Message-Id: <20200911122458.866898739@linuxfoundation.org> X-Mailer: git-send-email 2.28.0 In-Reply-To: <20200911122458.413137406@linuxfoundation.org> References: <20200911122458.413137406@linuxfoundation.org> User-Agent: quilt/0.66 MIME-Version: 1.0 Sender: stable-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: stable@vger.kernel.org From: Kamil Lorenc [ Upstream commit a609d0259183a841621f252e067f40f8cc25d6f6 ] Keenetic Plus DSL is a xDSL modem that uses dm9620 as its USB interface. Signed-off-by: Kamil Lorenc Signed-off-by: David S. Miller Signed-off-by: Greg Kroah-Hartman --- drivers/net/usb/dm9601.c | 4 ++++ 1 file changed, 4 insertions(+) --- a/drivers/net/usb/dm9601.c +++ b/drivers/net/usb/dm9601.c @@ -625,6 +625,10 @@ static const struct usb_device_id produc USB_DEVICE(0x0a46, 0x1269), /* DM9621A USB to Fast Ethernet Adapter */ .driver_info = (unsigned long)&dm9601_info, }, + { + USB_DEVICE(0x0586, 0x3427), /* ZyXEL Keenetic Plus DSL xDSL modem */ + .driver_info = (unsigned long)&dm9601_info, + }, {}, // END }; From patchwork Fri Sep 11 12:47:04 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Greg KH X-Patchwork-Id: 264074 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-9.8 required=3.0 tests=BAYES_00,DKIMWL_WL_HIGH, DKIM_SIGNED, DKIM_VALID, HEADER_FROM_DIFFERENT_DOMAINS, MAILING_LIST_MULTI, SIGNED_OFF_BY,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED,USER_AGENT_GIT autolearn=unavailable autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 99BB1C43461 for ; Fri, 11 Sep 2020 15:21:17 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 5D63D221E7 for ; Fri, 11 Sep 2020 15:21:17 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=default; t=1599837677; bh=8+aKGES8phPY0yHlwA43c9QPgG1sGj0rjTH2xEYGUaA=; h=From:To:Cc:Subject:Date:In-Reply-To:References:List-ID:From; b=CzrGJlTeys66PIWv2s3MHRV5iUqaB7iv7LtSsTAlzoJW6UZNnAeLSE4dajLQWpCYa ofTCK9ZX39q3G82mCBGtny4+SOCQSLrv2RTIjxG2lpEFqECfBWZqBxCHHYGGgcw01e wu2ciV8GBv1UKtXUTaPZxhaDrEoH0LiGcMw5bBCI= Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1726292AbgIKPUt (ORCPT ); Fri, 11 Sep 2020 11:20:49 -0400 Received: from mail.kernel.org ([198.145.29.99]:53392 "EHLO mail.kernel.org" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1726014AbgIKPTH (ORCPT ); Fri, 11 Sep 2020 11:19:07 -0400 Received: from localhost (83-86-74-64.cable.dynamic.v4.ziggo.nl [83.86.74.64]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPSA id 31B0A223E4; Fri, 11 Sep 2020 12:58:57 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=default; t=1599829137; bh=8+aKGES8phPY0yHlwA43c9QPgG1sGj0rjTH2xEYGUaA=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=i9/4L3maF+HyS/5x1tFwwvucGeq5sxl5Piq8OjbXvQZanfroCstyz4TW+FybVl6LK Og64YkAyDOxx51gKlIzwwvSwoe7xiUZ2O8ytF+tVoWIFDT9Wgwgo4r4J3U00K+pCWk JyzLdGlmFLLcmW4cQ0407L3hlOeph2Bxvu0yNgvA= From: Greg Kroah-Hartman To: linux-kernel@vger.kernel.org Cc: Greg Kroah-Hartman , stable@vger.kernel.org, Ying Xu , Xin Long , Marcelo Ricardo Leitner , "David S. Miller" Subject: [PATCH 4.14 10/12] sctp: not disable bh in the whole sctp_get_port_local() Date: Fri, 11 Sep 2020 14:47:04 +0200 Message-Id: <20200911122458.915150962@linuxfoundation.org> X-Mailer: git-send-email 2.28.0 In-Reply-To: <20200911122458.413137406@linuxfoundation.org> References: <20200911122458.413137406@linuxfoundation.org> User-Agent: quilt/0.66 MIME-Version: 1.0 Sender: stable-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: stable@vger.kernel.org From: Xin Long [ Upstream commit 3106ecb43a05dc3e009779764b9da245a5d082de ] With disabling bh in the whole sctp_get_port_local(), when snum == 0 and too many ports have been used, the do-while loop will take the cpu for a long time and cause cpu stuck: [ ] watchdog: BUG: soft lockup - CPU#11 stuck for 22s! [ ] RIP: 0010:native_queued_spin_lock_slowpath+0x4de/0x940 [ ] Call Trace: [ ] _raw_spin_lock+0xc1/0xd0 [ ] sctp_get_port_local+0x527/0x650 [sctp] [ ] sctp_do_bind+0x208/0x5e0 [sctp] [ ] sctp_autobind+0x165/0x1e0 [sctp] [ ] sctp_connect_new_asoc+0x355/0x480 [sctp] [ ] __sctp_connect+0x360/0xb10 [sctp] There's no need to disable bh in the whole function of sctp_get_port_local. So fix this cpu stuck by removing local_bh_disable() called at the beginning, and using spin_lock_bh() instead. The same thing was actually done for inet_csk_get_port() in Commit ea8add2b1903 ("tcp/dccp: better use of ephemeral ports in bind()"). Thanks to Marcelo for pointing the buggy code out. v1->v2: - use cond_resched() to yield cpu to other tasks if needed, as Eric noticed. Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2") Reported-by: Ying Xu Signed-off-by: Xin Long Acked-by: Marcelo Ricardo Leitner Signed-off-by: David S. Miller Signed-off-by: Greg Kroah-Hartman --- net/sctp/socket.c | 16 ++++++---------- 1 file changed, 6 insertions(+), 10 deletions(-) --- a/net/sctp/socket.c +++ b/net/sctp/socket.c @@ -7086,8 +7086,6 @@ static long sctp_get_port_local(struct s pr_debug("%s: begins, snum:%d\n", __func__, snum); - local_bh_disable(); - if (snum == 0) { /* Search for an available port. */ int low, high, remaining, index; @@ -7106,20 +7104,21 @@ static long sctp_get_port_local(struct s continue; index = sctp_phashfn(sock_net(sk), rover); head = &sctp_port_hashtable[index]; - spin_lock(&head->lock); + spin_lock_bh(&head->lock); sctp_for_each_hentry(pp, &head->chain) if ((pp->port == rover) && net_eq(sock_net(sk), pp->net)) goto next; break; next: - spin_unlock(&head->lock); + spin_unlock_bh(&head->lock); + cond_resched(); } while (--remaining > 0); /* Exhausted local port range during search? */ ret = 1; if (remaining <= 0) - goto fail; + return ret; /* OK, here is the one we will use. HEAD (the port * hash table list entry) is non-NULL and we hold it's @@ -7134,7 +7133,7 @@ static long sctp_get_port_local(struct s * port iterator, pp being NULL. */ head = &sctp_port_hashtable[sctp_phashfn(sock_net(sk), snum)]; - spin_lock(&head->lock); + spin_lock_bh(&head->lock); sctp_for_each_hentry(pp, &head->chain) { if ((pp->port == snum) && net_eq(pp->net, sock_net(sk))) goto pp_found; @@ -7218,10 +7217,7 @@ success: ret = 0; fail_unlock: - spin_unlock(&head->lock); - -fail: - local_bh_enable(); + spin_unlock_bh(&head->lock); return ret; } From patchwork Fri Sep 11 12:47:05 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Greg KH X-Patchwork-Id: 309838 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-9.8 required=3.0 tests=BAYES_00,DKIMWL_WL_HIGH, DKIM_SIGNED, DKIM_VALID, HEADER_FROM_DIFFERENT_DOMAINS, MAILING_LIST_MULTI, SIGNED_OFF_BY,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED,USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id A1D14C43461 for ; Fri, 11 Sep 2020 13:43:47 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 58DA222273 for ; Fri, 11 Sep 2020 13:43:47 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=default; t=1599831827; bh=pOqmTT5z1pxK3xTG04T+GA+KPQhbWQMJzXgVzu1idOM=; h=From:To:Cc:Subject:Date:In-Reply-To:References:List-ID:From; b=kwVA8FkhXwUcIJiqgmvrSdezrsWJHiTiXAISAFRL37HnOitvHCldTN3Ds/eeP3G4i vMFnmqhTg2ioJ0o+XwiKJ6rxukuih6bD6ERK6FFJ/KdlGqfQp5W9NIGFmpT07HxxPX 4MrBJvk6N77MiFjC2E2l/ji/2wxpG+3K7Vyy3EQA= Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1725845AbgIKNnd (ORCPT ); Fri, 11 Sep 2020 09:43:33 -0400 Received: from mail.kernel.org ([198.145.29.99]:36824 "EHLO mail.kernel.org" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1726284AbgIKN2Y (ORCPT ); Fri, 11 Sep 2020 09:28:24 -0400 Received: from localhost (83-86-74-64.cable.dynamic.v4.ziggo.nl [83.86.74.64]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPSA id 9AE0E223E8; Fri, 11 Sep 2020 12:58:59 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=default; t=1599829140; bh=pOqmTT5z1pxK3xTG04T+GA+KPQhbWQMJzXgVzu1idOM=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=YdZaKN6AhUKv4f6lLUHIOCNqcYm8butgvf7ySdvNwemoRkHE4rJMUwxiW64QMszY3 M6DKh8q604S/LDF8/YCvYdMGIFUbAZPlYU7r2jJbhVSuq8KECvpCe/OOEAAqx7aROB d3/g8axR05r/Ik8sgAqJXi8ZNL3xBbOcGk2EXc6E= From: Greg Kroah-Hartman To: linux-kernel@vger.kernel.org Cc: Greg Kroah-Hartman , stable@vger.kernel.org, syzbot , Tetsuo Handa , "David S. Miller" Subject: [PATCH 4.14 11/12] tipc: fix shutdown() of connectionless socket Date: Fri, 11 Sep 2020 14:47:05 +0200 Message-Id: <20200911122458.967159042@linuxfoundation.org> X-Mailer: git-send-email 2.28.0 In-Reply-To: <20200911122458.413137406@linuxfoundation.org> References: <20200911122458.413137406@linuxfoundation.org> User-Agent: quilt/0.66 MIME-Version: 1.0 Sender: stable-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: stable@vger.kernel.org From: Tetsuo Handa [ Upstream commit 2a63866c8b51a3f72cea388dfac259d0e14c4ba6 ] syzbot is reporting hung task at nbd_ioctl() [1], for there are two problems regarding TIPC's connectionless socket's shutdown() operation. ---------- #include #include #include #include #include int main(int argc, char *argv[]) { const int fd = open("/dev/nbd0", 3); alarm(5); ioctl(fd, NBD_SET_SOCK, socket(PF_TIPC, SOCK_DGRAM, 0)); ioctl(fd, NBD_DO_IT, 0); /* To be interrupted by SIGALRM. */ return 0; } ---------- One problem is that wait_for_completion() from flush_workqueue() from nbd_start_device_ioctl() from nbd_ioctl() cannot be completed when nbd_start_device_ioctl() received a signal at wait_event_interruptible(), for tipc_shutdown() from kernel_sock_shutdown(SHUT_RDWR) from nbd_mark_nsock_dead() from sock_shutdown() from nbd_start_device_ioctl() is failing to wake up a WQ thread sleeping at wait_woken() from tipc_wait_for_rcvmsg() from sock_recvmsg() from sock_xmit() from nbd_read_stat() from recv_work() scheduled by nbd_start_device() from nbd_start_device_ioctl(). Fix this problem by always invoking sk->sk_state_change() (like inet_shutdown() does) when tipc_shutdown() is called. The other problem is that tipc_wait_for_rcvmsg() cannot return when tipc_shutdown() is called, for tipc_shutdown() sets sk->sk_shutdown to SEND_SHUTDOWN (despite "how" is SHUT_RDWR) while tipc_wait_for_rcvmsg() needs sk->sk_shutdown set to RCV_SHUTDOWN or SHUTDOWN_MASK. Fix this problem by setting sk->sk_shutdown to SHUTDOWN_MASK (like inet_shutdown() does) when the socket is connectionless. [1] https://syzkaller.appspot.com/bug?id=3fe51d307c1f0a845485cf1798aa059d12bf18b2 Reported-by: syzbot Signed-off-by: Tetsuo Handa Signed-off-by: David S. Miller Signed-off-by: Greg Kroah-Hartman --- net/tipc/socket.c | 9 ++++++--- 1 file changed, 6 insertions(+), 3 deletions(-) --- a/net/tipc/socket.c +++ b/net/tipc/socket.c @@ -2126,18 +2126,21 @@ static int tipc_shutdown(struct socket * lock_sock(sk); __tipc_shutdown(sock, TIPC_CONN_SHUTDOWN); - sk->sk_shutdown = SEND_SHUTDOWN; + if (tipc_sk_type_connectionless(sk)) + sk->sk_shutdown = SHUTDOWN_MASK; + else + sk->sk_shutdown = SEND_SHUTDOWN; if (sk->sk_state == TIPC_DISCONNECTING) { /* Discard any unreceived messages */ __skb_queue_purge(&sk->sk_receive_queue); - /* Wake up anyone sleeping in poll */ - sk->sk_state_change(sk); res = 0; } else { res = -ENOTCONN; } + /* Wake up anyone sleeping in poll. */ + sk->sk_state_change(sk); release_sock(sk); return res; From patchwork Fri Sep 11 12:47:06 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Greg KH X-Patchwork-Id: 309834 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-9.8 required=3.0 tests=BAYES_00,DKIMWL_WL_HIGH, DKIM_SIGNED, DKIM_VALID, HEADER_FROM_DIFFERENT_DOMAINS, MAILING_LIST_MULTI, SIGNED_OFF_BY,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED,USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id A689DC433E2 for ; Fri, 11 Sep 2020 14:02:58 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 589FE21D81 for ; Fri, 11 Sep 2020 14:02:58 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=default; t=1599832978; bh=UAs3Q/Zz9F5+9bZ42uUtir1voEsU5TfKbTq5cMt4HEs=; h=From:To:Cc:Subject:Date:In-Reply-To:References:List-ID:From; b=xqV1I1PxYd1il1MljlGY26XQcaIQOcDAbYE9xh9/V7cFW7Vt3oiqMcKpMdB/PiohH pm/rc7HWIuL+GIch1zZhcuSWOvouNQCEv2pIDOWnG+y5T8imh+xhHYir6RyB6sv7EE LR/kn7WuX39kcJz7WRWlVgOJ3PpYZ32PfOeuxUBY= Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1726310AbgIKOCo (ORCPT ); Fri, 11 Sep 2020 10:02:44 -0400 Received: from mail.kernel.org ([198.145.29.99]:32772 "EHLO mail.kernel.org" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1726108AbgIKNVa (ORCPT ); Fri, 11 Sep 2020 09:21:30 -0400 Received: from localhost (83-86-74-64.cable.dynamic.v4.ziggo.nl [83.86.74.64]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPSA id 139CA223EA; Fri, 11 Sep 2020 12:59:01 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=default; t=1599829142; bh=UAs3Q/Zz9F5+9bZ42uUtir1voEsU5TfKbTq5cMt4HEs=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=tZz12LH5l7veTMDSJjDe3roefZRkQpr+qBcPF+3wWb4Ao4pZRsFdm3WS9oW7nypoW wiJpJU2wE/68XTJivVsTOz+jDtLMEGsdp6/x44JdgGquTKpLqncZu2Fh3WSTBas3bK 1TnE8+Fu0fAtoVKl/e1SeKXK7dqMvAjJlsxUrKyw= From: Greg Kroah-Hartman To: linux-kernel@vger.kernel.org Cc: Greg Kroah-Hartman , stable@vger.kernel.org, Rob Sherwood , Jakub Kicinski , "David S. Miller" Subject: [PATCH 4.14 12/12] net: disable netpoll on fresh napis Date: Fri, 11 Sep 2020 14:47:06 +0200 Message-Id: <20200911122459.015758062@linuxfoundation.org> X-Mailer: git-send-email 2.28.0 In-Reply-To: <20200911122458.413137406@linuxfoundation.org> References: <20200911122458.413137406@linuxfoundation.org> User-Agent: quilt/0.66 MIME-Version: 1.0 Sender: stable-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: stable@vger.kernel.org From: Jakub Kicinski [ Upstream commit 96e97bc07e90f175a8980a22827faf702ca4cb30 ] napi_disable() makes sure to set the NAPI_STATE_NPSVC bit to prevent netpoll from accessing rings before init is complete. However, the same is not done for fresh napi instances in netif_napi_add(), even though we expect NAPI instances to be added as disabled. This causes crashes during driver reconfiguration (enabling XDP, changing the channel count) - if there is any printk() after netif_napi_add() but before napi_enable(). To ensure memory ordering is correct we need to use RCU accessors. Reported-by: Rob Sherwood Fixes: 2d8bff12699a ("netpoll: Close race condition between poll_one_napi and napi_disable") Signed-off-by: Jakub Kicinski Signed-off-by: David S. Miller Signed-off-by: Greg Kroah-Hartman --- net/core/dev.c | 3 ++- net/core/netpoll.c | 2 +- 2 files changed, 3 insertions(+), 2 deletions(-) --- a/net/core/dev.c +++ b/net/core/dev.c @@ -5532,12 +5532,13 @@ void netif_napi_add(struct net_device *d pr_err_once("netif_napi_add() called with weight %d on device %s\n", weight, dev->name); napi->weight = weight; - list_add(&napi->dev_list, &dev->napi_list); napi->dev = dev; #ifdef CONFIG_NETPOLL napi->poll_owner = -1; #endif set_bit(NAPI_STATE_SCHED, &napi->state); + set_bit(NAPI_STATE_NPSVC, &napi->state); + list_add_rcu(&napi->dev_list, &dev->napi_list); napi_hash_add(napi); } EXPORT_SYMBOL(netif_napi_add); --- a/net/core/netpoll.c +++ b/net/core/netpoll.c @@ -179,7 +179,7 @@ static void poll_napi(struct net_device struct napi_struct *napi; int cpu = smp_processor_id(); - list_for_each_entry(napi, &dev->napi_list, dev_list) { + list_for_each_entry_rcu(napi, &dev->napi_list, dev_list) { if (cmpxchg(&napi->poll_owner, -1, cpu) == -1) { poll_one_napi(napi); smp_store_release(&napi->poll_owner, -1);