From patchwork Thu Oct  3 00:51:31 2013
X-Patchwork-Submitter: John Stultz
X-Patchwork-Id: 20752
From: John Stultz
To: LKML
Cc: Minchan Kim, Andrew Morton, Android Kernel Team, Robert Love,
    Mel Gorman, Hugh Dickins, Dave Hansen, Rik van Riel,
    Dmitry Adamushko, Dave Chinner, Neil Brown, Andrea Righi,
    Andrea Arcangeli, "Aneesh Kumar K.V", Mike Hommey, Taras Glek,
    Dhaval Giani, Jan Kara, KOSAKI Motohiro, Michel Lespinasse,
    Rob Clark, linux-mm@kvack.org, John Stultz
Subject: [PATCH 02/14] vrange: Add vrange support to mm_structs
Date: Wed, 2 Oct 2013 17:51:31 -0700
Message-Id: <1380761503-14509-3-git-send-email-john.stultz@linaro.org>
X-Mailer: git-send-email 1.8.1.2
In-Reply-To: <1380761503-14509-1-git-send-email-john.stultz@linaro.org>
References: <1380761503-14509-1-git-send-email-john.stultz@linaro.org>

From: Minchan Kim

This patch adds vroot to mm_struct so a process can set volatile ranges
on anonymous memory. This is somewhat wasteful, as it grows mm_struct
even if the process never uses the vrange syscall, so a later patch will
provide dynamically allocated vroots.

One thing of note in this patch is vrange_fork. Since we do allocations
while holding a lock on the vrange, it's possible to deadlock with
direct reclaim's purging logic.
For this reason, vrange_fork uses GFP_NOIO for its allocations.

If vrange_fork fails, it isn't a critical problem: the result is only
that the child process's pages won't be volatile/purgable, which could
cause additional memory pressure but won't cause problematic application
behavior (since volatile pages are only purged at the kernel's
discretion). This is thought to be more desirable than having fork fail.

Cc: Andrew Morton
Cc: Android Kernel Team
Cc: Robert Love
Cc: Mel Gorman
Cc: Hugh Dickins
Cc: Dave Hansen
Cc: Rik van Riel
Cc: Dmitry Adamushko
Cc: Dave Chinner
Cc: Neil Brown
Cc: Andrea Righi
Cc: Andrea Arcangeli
Cc: Aneesh Kumar K.V
Cc: Mike Hommey
Cc: Taras Glek
Cc: Dhaval Giani
Cc: Jan Kara
Cc: KOSAKI Motohiro
Cc: Michel Lespinasse
Cc: Rob Clark
Cc: Minchan Kim
Cc: linux-mm@kvack.org
Signed-off-by: Minchan Kim
[jstultz: Bit of refactoring. Comment cleanups]
Signed-off-by: John Stultz
---
 include/linux/mm_types.h |  4 ++++
 include/linux/vrange.h   |  7 ++++++-
 kernel/fork.c            | 11 +++++++++++
 mm/vrange.c              | 40 ++++++++++++++++++++++++++++++++++++++++
 4 files changed, 61 insertions(+), 1 deletion(-)

diff --git a/include/linux/mm_types.h b/include/linux/mm_types.h
index faf4b7c..5d8cdc3 100644
--- a/include/linux/mm_types.h
+++ b/include/linux/mm_types.h
@@ -13,6 +13,7 @@
 #include <...>
 #include <...>
 #include <...>
+#include <linux/vrange_types.h>
 #include <...>
 #include <...>
@@ -349,6 +350,9 @@ struct mm_struct {
 					 */
+#ifdef CONFIG_MMU
+	struct vrange_root vroot;
+#endif
 	unsigned long hiwater_rss;	/* High-watermark of RSS usage */
 	unsigned long hiwater_vm;	/* High-water virtual memory usage */
diff --git a/include/linux/vrange.h b/include/linux/vrange.h
index 0d378a5..2b96ee1 100644
--- a/include/linux/vrange.h
+++ b/include/linux/vrange.h
@@ -37,12 +37,17 @@ static inline int vrange_type(struct vrange *vrange)
 }
 
 extern void vrange_root_cleanup(struct vrange_root *vroot);
-
+extern int vrange_fork(struct mm_struct *new,
+			struct mm_struct *old);
 #else
 
 static inline void vrange_root_init(struct vrange_root *vroot,
 				int type, void *object) {};
 static inline void vrange_root_cleanup(struct vrange_root *vroot) {};
+static inline int vrange_fork(struct mm_struct *new, struct mm_struct *old)
+{
+	return 0;
+}
 #endif
 
 #endif /* _LINIUX_VRANGE_H */
diff --git a/kernel/fork.c b/kernel/fork.c
index bf46287..ceb38bf 100644
--- a/kernel/fork.c
+++ b/kernel/fork.c
@@ -71,6 +71,7 @@
 #include <...>
 #include <...>
 #include <...>
+#include <linux/vrange.h>
 #include <...>
 #include <...>
@@ -377,6 +378,14 @@ static int dup_mmap(struct mm_struct *mm, struct mm_struct *oldmm)
 	retval = khugepaged_fork(mm, oldmm);
 	if (retval)
 		goto out;
+	/*
+	 * Note: vrange_fork can fail in the case of ENOMEM, but
+	 * this only results in the child not having any active
+	 * volatile ranges. This is not harmful. In this case the
+	 * child will not see any pages purged unless it remarks
+	 * them as volatile.
+	 */
+	vrange_fork(mm, oldmm);
 
 	prev = NULL;
 	for (mpnt = oldmm->mmap; mpnt; mpnt = mpnt->vm_next) {
@@ -538,6 +547,7 @@ static struct mm_struct *mm_init(struct mm_struct *mm, struct task_struct *p)
 	mm->nr_ptes = 0;
 	memset(&mm->rss_stat, 0, sizeof(mm->rss_stat));
 	spin_lock_init(&mm->page_table_lock);
+	vrange_root_init(&mm->vroot, VRANGE_MM, mm);
 	mm_init_aio(mm);
 	mm_init_owner(mm, p);
@@ -609,6 +619,7 @@ void mmput(struct mm_struct *mm)
 	if (atomic_dec_and_test(&mm->mm_users)) {
 		uprobe_clear_state(mm);
+		vrange_root_cleanup(&mm->vroot);
 		exit_aio(mm);
 		ksm_exit(mm);
 		khugepaged_exit(mm); /* must run before exit_mmap */
diff --git a/mm/vrange.c b/mm/vrange.c
index 866566c..4ddcc3e9 100644
--- a/mm/vrange.c
+++ b/mm/vrange.c
@@ -181,3 +181,43 @@ void vrange_root_cleanup(struct vrange_root *vroot)
 	vrange_unlock(vroot);
 }
+
+/*
+ * It's okay for vrange_fork to fail: in the worst case the child
+ * process simply doesn't get a copy of the parent's vrange data
+ * structures, so pages in those ranges can't be purged. That is
+ * better than failing fork.
+ */
+int vrange_fork(struct mm_struct *new_mm, struct mm_struct *old_mm)
+{
+	struct vrange_root *new, *old;
+	struct vrange *range, *new_range;
+	struct rb_node *next;
+
+	new = &new_mm->vroot;
+	old = &old_mm->vroot;
+
+	vrange_lock(old);
+	next = rb_first(&old->v_rb);
+	while (next) {
+		range = vrange_entry(next);
+		next = rb_next(next);
+		/*
+		 * We can't use GFP_KERNEL, because direct reclaim's
+		 * purging logic could deadlock with us on vrange_lock.
+		 */
+		new_range = __vrange_alloc(GFP_NOIO);
+		if (!new_range)
+			goto fail;
+		__vrange_set(new_range, range->node.start,
+				range->node.last, range->purged);
+		__vrange_add(new_range, new);
+
+	}
+	vrange_unlock(old);
+	return 0;
+fail:
+	vrange_unlock(old);
+	vrange_root_cleanup(new);
+	return -ENOMEM;
+}