From patchwork Wed Nov 25 01:10:07 2015 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Jin Qian X-Patchwork-Id: 57284 Delivered-To: patch@linaro.org Received: by 10.112.155.196 with SMTP id vy4csp2444789lbb; Tue, 24 Nov 2015 17:10:49 -0800 (PST) X-Received: by 10.98.15.76 with SMTP id x73mr26719361pfi.81.1448413849440; Tue, 24 Nov 2015 17:10:49 -0800 (PST) Return-Path: Received: from vger.kernel.org (vger.kernel.org. [209.132.180.67]) by mx.google.com with ESMTP id ua10si30030670pab.236.2015.11.24.17.10.49; Tue, 24 Nov 2015 17:10:49 -0800 (PST) Received-SPF: pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) client-ip=209.132.180.67; Authentication-Results: mx.google.com; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dkim=neutral (body hash did not verify) header.i=@android.com; dmarc=fail (p=NONE dis=NONE) header.from=android.com Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1753698AbbKYBKr (ORCPT + 28 others); Tue, 24 Nov 2015 20:10:47 -0500 Received: from mail-pa0-f46.google.com ([209.85.220.46]:33482 "EHLO mail-pa0-f46.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1755256AbbKYBKk (ORCPT ); Tue, 24 Nov 2015 20:10:40 -0500 Received: by pabfh17 with SMTP id fh17so40256196pab.0 for ; Tue, 24 Nov 2015 17:10:39 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=android.com; s=20120917; h=from:to:cc:subject:date:message-id:in-reply-to:references; bh=0eZpFqCTa3h0/l8okcGwm5fYUdMER47FWVvtUyg1AxM=; b=M49FNsjikwIklBZ0TldG9vfENA/4FerTWSN2+Jxj3efsCRiZK9JTTbIAwJk/sgT/gB R7QhaqdyacGfFNlaGbZXyf2BzGVbW3Lax7S/EsS8Ylf+jiEJtxsxRjTd2Nazqfy1sZCs z67KfZq+O/8TIs/pnaiAESbI/j2u+/NX53kLmgReqhwjRXh86mJQQtPWjwjDQjVgXrC0 bcJaedtZ6/XL59/DL/m7a3kyXZyphZgYS5uYTXbhKlGjhaVpPUm9wwZA6ZexN8o3J1v8 KZ5kS0GsJZVcxjzPQIpjoZZO8iVadabMZ7frwn1GEeQ0Bon1WODB/R9diduSI5CWA/Ro eEJw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20130820; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references; bh=0eZpFqCTa3h0/l8okcGwm5fYUdMER47FWVvtUyg1AxM=; b=E+XOKFIgA9eExDigY0dPDTGc56DaC0pYeSkTju+HQ1ImpPctWOqr/wBqB7Nuxn2SLy CmjBVo+zSuS/VIibJ1Yu2hTHt8b6/S60o5EUYjcvW+dhNIRuKlwPnJ+2Lr1Qwj/TyA/+ 0soNOE8NyR32QMLI10a8hv3RQpaTZCYtGXxvEyJyj7lRMO8ZsTDJ+zVul2dVzShpGm8Z jZc0EH6XlrE44r+yi5eWHVTX4BfZm+WjFrDkDyvUxPjw0pUfi5hUNnKjFpWTxej2Upqb JtYMw0LDq9URS05ZffEBKDUxBCOQrg0UQBCeVerV1MO2260HKAKt88xLHYF+EHMk4Bbs 6oRQ== X-Gm-Message-State: ALoCoQn2q1oG3vDBJjxhRZpgTYtnmZgMejWEgym0yzlt3QKhPBMH2hy/0dw/sMcn+cKwyYn+8s0k X-Received: by 10.98.89.28 with SMTP id n28mr27406766pfb.32.1448413839342; Tue, 24 Nov 2015 17:10:39 -0800 (PST) Received: from jinqian.mtv.corp.google.com ([172.22.126.125]) by smtp.gmail.com with ESMTPSA id jj5sm17357386pac.4.2015.11.24.17.10.37 (version=TLS1_2 cipher=ECDHE-RSA-AES128-SHA bits=128/128); Tue, 24 Nov 2015 17:10:37 -0800 (PST) From: Jin Qian To: Greg Hackmann , Greg Kroah-Hartman , Jason Hu , Joe Perches , Yu Ning , Peter Senna Tschudin , Christoffer Dall , =?UTF-8?q?Alex=20Benn=C3=A9e?= , linux-kernel@vger.kernel.org Cc: Jin Qian Subject: [PATCH 3/8] android_pipe: Pin pages to memory while copying and other cleanups Date: Tue, 24 Nov 2015 17:10:07 -0800 Message-Id: <1448413812-24289-4-git-send-email-jinqian@android.com> X-Mailer: git-send-email 2.6.0.rc2.230.g3dd15c0 In-Reply-To: <1448413812-24289-1-git-send-email-jinqian@android.com> References: <1448413812-24289-1-git-send-email-jinqian@android.com> Sender: linux-kernel-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org From: Christoffer Dall The existing code had a troubling TODO statement concerning the fact that it just did a check if the page that the QEMU backend was going to read from / write to was there before the call to the QEMU backend and then relying on the fact that the page stayed around, even in a preemptible SMP kernel. Obviously the page could go away or be reassigned, and strange things may happen. Further, writes were not tracked, so any use of COW or KSM-like features would break completely. Probably that was never used by adbd (the only current active user of the pipe), but could prove much more dangerous for the GPU passthrough mechanism. Instead, use get_user_pages() as the comment suggested and cleanup the error path and add the set_page_dirt() call on a successful read operation. Also clarify the count used to return from successful read/write calls and use Linux style commentary in various places of the file. Note: The "just ignore error and return whatever we read so far" error handling is really quite horrific. I cannot change it without a more careful study of all user space ABIs reliance on this 'feature'. Signed-off-by: Christoffer Dall (cherry picked from commit ca8dafc623c6956fad77df80540a1e31f5f8fe2b) Signed-off-by: Jin Qian --- drivers/platform/goldfish/goldfish_pipe.c | 129 +++++++++++++++++------------- 1 file changed, 72 insertions(+), 57 deletions(-) -- 2.6.0.rc2.230.g3dd15c0 -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/ diff --git a/drivers/platform/goldfish/goldfish_pipe.c b/drivers/platform/goldfish/goldfish_pipe.c index 0fb3a34..20a9337 100644 --- a/drivers/platform/goldfish/goldfish_pipe.c +++ b/drivers/platform/goldfish/goldfish_pipe.c @@ -2,6 +2,7 @@ * Copyright (C) 2011 Google, Inc. * Copyright (C) 2012 Intel, Inc. * Copyright (C) 2013 Intel, Inc. + * Copyright (C) 2014 Linaro Limited * * This software is licensed under the terms of the GNU General Public * License version 2, as published by the Free Software Foundation, and @@ -57,6 +58,7 @@ #include #include #include +#include /* * IMPORTANT: The following constants must match the ones used and defined @@ -257,17 +259,14 @@ static int access_with_param(struct goldfish_pipe_dev *dev, const int cmd, return 0; } -/* This function is used for both reading from and writing to a given - * pipe. - */ static ssize_t goldfish_pipe_read_write(struct file *filp, char __user *buffer, - size_t bufflen, int is_write) + size_t bufflen, int is_write) { unsigned long irq_flags; struct goldfish_pipe *pipe = filp->private_data; struct goldfish_pipe_dev *dev = pipe->dev; unsigned long address, address_end; - int ret = 0; + int count = 0, ret = -EINVAL; /* If the emulator already closed the pipe, no need to go further */ if (test_bit(BIT_CLOSED_ON_HOST, &pipe->flags)) @@ -290,30 +289,23 @@ static ssize_t goldfish_pipe_read_write(struct file *filp, char __user *buffer, address_end = address + bufflen; while (address < address_end) { - unsigned long page_end = (address & PAGE_MASK) + PAGE_SIZE; - unsigned long next = page_end < address_end ? page_end - : address_end; - unsigned long avail = next - address; + unsigned long page_end = (address & PAGE_MASK) + PAGE_SIZE; + unsigned long next = page_end < address_end ? page_end + : address_end; + unsigned long avail = next - address; int status, wakeBit; - - /* Ensure that the corresponding page is properly mapped */ - /* FIXME: this isn't safe or sufficient - use get_user_pages */ - if (is_write) { - char c; - /* Ensure that the page is mapped and readable */ - if (__get_user(c, (char __user *)address)) { - if (!ret) - ret = -EFAULT; - break; - } - } else { - /* Ensure that the page is mapped and writable */ - if (__put_user(0, (char __user *)address)) { - if (!ret) - ret = -EFAULT; - break; - } - } + struct page *page; + + /* + * We grab the pages on a page-by-page basis in case user + * space gives us a potentially huge buffer but the read only + * returns a small amount, then there's no need to pin that + * much memory to the process. + */ + ret = get_user_pages(current, current->mm, address, 1, + !is_write, 0, &page, NULL); + if (ret < 0) + return ret; /* Now, try to transfer the bytes in the current page */ spin_lock_irqsave(&dev->lock, irq_flags); @@ -332,33 +324,48 @@ static ssize_t goldfish_pipe_read_write(struct file *filp, char __user *buffer, } spin_unlock_irqrestore(&dev->lock, irq_flags); + if (status > 0 && !is_write) + set_page_dirty(page); + put_page(page); + if (status > 0) { /* Correct transfer */ - ret += status; + count += status; address += status; continue; - } - - if (status == 0) /* EOF */ + } else if (status == 0) { /* EOF */ + ret = 0; break; - - /* An error occured. If we already transfered stuff, just - * return with its count. We expect the next call to return - * an error code */ - if (ret > 0) + } else if (status < 0 && count > 0) { + /* + * An error occurred and we already transferred + * something on one of the previous pages. + * Just return what we already copied and log this + * err. + * + * Note: This seems like an incorrect approach but + * cannot change it until we check if any user space + * ABI relies on this behavior. + */ + pr_info_ratelimited("android_pipe: backend returned error %d on %s\n", + status, is_write ? "write" : "read"); + ret = 0; break; + } - /* If the error is not PIPE_ERROR_AGAIN, or if we are not in - * non-blocking mode, just return the error code. - */ + /* + * If the error is not PIPE_ERROR_AGAIN, or if we are not in + * non-blocking mode, just return the error code. + */ if (status != PIPE_ERROR_AGAIN || (filp->f_flags & O_NONBLOCK) != 0) { ret = goldfish_pipe_error_convert(status); break; } - /* We will have to wait until more data/space is available. - * First, mark the pipe as waiting for a specific wake signal. - */ + /* + * The backend blocked the read/write, wait until the backend + * tells us it's ready to process more data. + */ wakeBit = is_write ? BIT_WAKE_ON_WRITE : BIT_WAKE_ON_READ; set_bit(wakeBit, &pipe->flags); @@ -372,22 +379,29 @@ static ssize_t goldfish_pipe_read_write(struct file *filp, char __user *buffer, while (test_bit(wakeBit, &pipe->flags)) { if (wait_event_interruptible( pipe->wake_queue, - !test_bit(wakeBit, &pipe->flags))) - return -ERESTARTSYS; + !test_bit(wakeBit, &pipe->flags))) { + ret = -ERESTARTSYS; + break; + } - if (test_bit(BIT_CLOSED_ON_HOST, &pipe->flags)) - return -EIO; + if (test_bit(BIT_CLOSED_ON_HOST, &pipe->flags)) { + ret = -EIO; + break; + } } /* Try to re-acquire the lock */ - if (mutex_lock_interruptible(&pipe->lock)) - return -ERESTARTSYS; - - /* Try the transfer again */ - continue; + if (mutex_lock_interruptible(&pipe->lock)) { + ret = -ERESTARTSYS; + break; + } } mutex_unlock(&pipe->lock); - return ret; + + if (ret < 0) + return ret; + else + return count; } static ssize_t goldfish_pipe_read(struct file *filp, char __user *buffer, @@ -440,10 +454,11 @@ static irqreturn_t goldfish_pipe_interrupt(int irq, void *dev_id) unsigned long irq_flags; int count = 0; - /* We're going to read from the emulator a list of (channel,flags) - * pairs corresponding to the wake events that occured on each - * blocked pipe (i.e. channel). - */ + /* + * We're going to read from the emulator a list of (channel,flags) + * pairs corresponding to the wake events that occurred on each + * blocked pipe (i.e. channel). + */ spin_lock_irqsave(&dev->lock, irq_flags); for (;;) { /* First read the channel, 0 means the end of the list */