From patchwork Wed Apr 3 23:52:22 2013
X-Patchwork-Submitter: John Stultz
X-Patchwork-Id: 15888
From: John Stultz
To: linux-kernel@vger.kernel.org
Cc: John Stultz, linux-mm@kvack.org, Michael Kerrisk, Arun Sharma,
	Mel Gorman, Hugh Dickins, Dave Hansen, Rik van Riel, Neil Brown,
	Mike Hommey, Taras Glek, KOSAKI Motohiro, KAMEZAWA Hiroyuki,
	Jason Evans, sanjay@google.com, Paul Turner, Johannes Weiner,
	Michel Lespinasse, Andrew Morton, Minchan Kim
Subject: [RFC PATCH 3/4] vrange: Support fvrange() syscall for file based volatile ranges
Date: Wed, 3 Apr 2013 16:52:22 -0700
Message-Id: <1365033144-15156-4-git-send-email-john.stultz@linaro.org>
X-Mailer: git-send-email 1.7.10.4
In-Reply-To: <1365033144-15156-1-git-send-email-john.stultz@linaro.org>
References: <1365033144-15156-1-git-send-email-john.stultz@linaro.org>

Add vrange support to address_space structures, and add the fvrange()
syscall for creating volatile ranges on address_space structures.
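To illustrate the intended use from userspace, here is a minimal sketch of
driving the new call; it is not part of the patch. The __NR_fvrange number
matches the x86_64 table slot added below, but the VRANGE_VOLATILE and
VRANGE_NOVOLATILE values and the file name are placeholder assumptions (the
real mode values come from the series' vrange uapi header):

#define _GNU_SOURCE
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>
#include <sys/syscall.h>

#ifndef __NR_fvrange
#define __NR_fvrange 315	/* slot added to syscall_64.tbl in this patch */
#endif

#define VRANGE_VOLATILE   0	/* placeholder value */
#define VRANGE_NOVOLATILE 1	/* placeholder value */

/* Thin wrapper; there is no libc entry point for a proposed syscall. */
static long fvrange(int fd, size_t offset, size_t len, int mode, int behavior)
{
	return syscall(__NR_fvrange, fd, offset, len, mode, behavior);
}

int main(void)
{
	size_t page = (size_t)sysconf(_SC_PAGESIZE);
	int fd = open("cache.dat", O_RDWR);	/* placeholder file */

	if (fd < 0)
		return 1;

	/* offset must be page aligned; len is rounded down to whole pages */
	if (fvrange(fd, 0, 16 * page, VRANGE_VOLATILE, 0) < 0)
		perror("fvrange(VRANGE_VOLATILE)");

	/* later, before reusing the data, make the range non-volatile again */
	if (fvrange(fd, 0, 16 * page, VRANGE_NOVOLATILE, 0) < 0)
		perror("fvrange(VRANGE_NOVOLATILE)");

	close(fd);
	return 0;
}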
Cc: linux-mm@kvack.org
Cc: Michael Kerrisk
Cc: Arun Sharma
Cc: Mel Gorman
Cc: Hugh Dickins
Cc: Dave Hansen
Cc: Rik van Riel
Cc: Neil Brown
Cc: Mike Hommey
Cc: Taras Glek
Cc: KOSAKI Motohiro
Cc: KAMEZAWA Hiroyuki
Cc: Jason Evans
Cc: sanjay@google.com
Cc: Paul Turner
Cc: Johannes Weiner
Cc: Michel Lespinasse
Cc: Andrew Morton
Cc: Minchan Kim
Signed-off-by: John Stultz
---
 arch/x86/syscalls/syscall_64.tbl |    1 +
 fs/file_table.c                  |    5 +++
 fs/inode.c                       |    2 ++
 include/linux/fs.h               |    2 ++
 include/linux/vrange.h           |   19 +++++++++-
 include/linux/vrange_types.h     |    1 +
 mm/vrange.c                      |   72 +++++++++++++++++++++++++++++++++++++-
 7 files changed, 100 insertions(+), 2 deletions(-)

diff --git a/arch/x86/syscalls/syscall_64.tbl b/arch/x86/syscalls/syscall_64.tbl
index dc332bd..910d9f3 100644
--- a/arch/x86/syscalls/syscall_64.tbl
+++ b/arch/x86/syscalls/syscall_64.tbl
@@ -321,6 +321,7 @@
 312     common  kcmp                    sys_kcmp
 313     common  finit_module            sys_finit_module
 314     common  vrange                  sys_vrange
+315     common  fvrange                 sys_fvrange
 
 #
 # x32-specific system call numbers start at 512 to avoid cache impact
diff --git a/fs/file_table.c b/fs/file_table.c
index cd4d87a..61c8aaa 100644
--- a/fs/file_table.c
+++ b/fs/file_table.c
@@ -26,6 +26,7 @@
 #include
 #include
 #include
+#include
 
 #include
 
@@ -244,6 +245,10 @@ static void __fput(struct file *file)
                        file->f_op->fasync(-1, file, 0);
        }
        ima_file_free(file);
+
+       /* drop all vranges on last close */
+       mapping_exit_vrange(inode->i_mapping);
+
        if (file->f_op && file->f_op->release)
                file->f_op->release(inode, file);
        security_file_free(file);
diff --git a/fs/inode.c b/fs/inode.c
index f5f7c06..4707c95 100644
--- a/fs/inode.c
+++ b/fs/inode.c
@@ -17,6 +17,7 @@
 #include
 #include        /* for inode_has_buffers */
 #include
+#include
 #include "internal.h"
 
 /*
@@ -350,6 +351,7 @@ void address_space_init_once(struct address_space *mapping)
        spin_lock_init(&mapping->private_lock);
        mapping->i_mmap = RB_ROOT;
        INIT_LIST_HEAD(&mapping->i_mmap_nonlinear);
+       mapping_init_vrange(mapping);
 }
 EXPORT_SYMBOL(address_space_init_once);
 
diff --git a/include/linux/fs.h b/include/linux/fs.h
index 2c28271..6f86c7c 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -27,6 +27,7 @@
 #include
 #include
 #include
+#include
 #include
 #include
 
@@ -411,6 +412,7 @@ struct address_space {
        struct rb_root          i_mmap;         /* tree of private and shared mappings */
        struct list_head        i_mmap_nonlinear;/*list VM_NONLINEAR mappings */
        struct mutex            i_mmap_mutex;   /* protect tree, count, list */
+       struct vrange_root      vroot;
        /* Protected by tree_lock together with the radix tree */
        unsigned long           nrpages;        /* number of total pages */
        pgoff_t                 writeback_index;/* writeback starts here */
diff --git a/include/linux/vrange.h b/include/linux/vrange.h
index b9b219c..91960eb 100644
--- a/include/linux/vrange.h
+++ b/include/linux/vrange.h
@@ -3,6 +3,7 @@
 
 #include
 #include
+#include
 
 #define vrange_entry(ptr) \
        container_of(ptr, struct vrange, node.rb)
@@ -11,10 +12,19 @@
 
 static inline void mm_init_vrange(struct mm_struct *mm)
 {
+       mm->vroot.type = VRANGE_ANON;
        mm->vroot.v_rb = RB_ROOT;
        mutex_init(&mm->vroot.v_lock);
 }
 
+static inline void mapping_init_vrange(struct address_space *mapping)
+{
+       mapping->vroot.type = VRANGE_FILE;
+       mapping->vroot.v_rb = RB_ROOT;
+       mutex_init(&mapping->vroot.v_lock);
+}
+
+
 static inline void vrange_lock(struct vrange_root *vroot)
 {
        mutex_lock(&vroot->v_lock);
@@ -25,15 +35,22 @@ static inline void vrange_unlock(struct vrange_root *vroot)
 {
        mutex_unlock(&vroot->v_lock);
 }
 
-static inline struct mm_struct *vrange_get_owner_mm(struct vrange *vrange)
+static inline int vrange_type(struct vrange *vrange)
 {
+       return vrange->owner->type;
+}
+
+static inline struct mm_struct *vrange_get_owner_mm(struct vrange *vrange)
+{
+       if (vrange_type(vrange) != VRANGE_ANON)
+               return NULL;
        return container_of(vrange->owner, struct mm_struct, vroot);
 }
 
 void vrange_init(void);
 extern void mm_exit_vrange(struct mm_struct *mm);
+extern void mapping_exit_vrange(struct address_space *mapping);
 int discard_vpage(struct page *page);
 bool vrange_address(struct mm_struct *mm, unsigned long start,
                        unsigned long end);
diff --git a/include/linux/vrange_types.h b/include/linux/vrange_types.h
index bede336..c7154e4 100644
--- a/include/linux/vrange_types.h
+++ b/include/linux/vrange_types.h
@@ -7,6 +7,7 @@
 struct vrange_root {
        struct rb_root v_rb;            /* vrange rb tree */
        struct mutex v_lock;            /* Protect v_rb */
+       enum {VRANGE_ANON, VRANGE_FILE} type; /* range root type */
 };
 
diff --git a/mm/vrange.c b/mm/vrange.c
index 9facbbc..671909c 100644
--- a/mm/vrange.c
+++ b/mm/vrange.c
@@ -14,6 +14,7 @@
 #include
 #include
 #include
+#include
 
 struct vrange_walker_private {
        struct zone *zone;
@@ -234,6 +235,20 @@ void mm_exit_vrange(struct mm_struct *mm)
        }
 }
 
+void mapping_exit_vrange(struct address_space *mapping)
+{
+       struct vrange *range;
+       struct rb_node *next;
+
+       next = rb_first(&mapping->vroot.v_rb);
+       while (next) {
+               range = vrange_entry(next);
+               next = rb_next(next);
+               __remove_range(range);
+               put_vrange(range);
+       }
+}
+
 /*
  * The vrange(2) system call.
  *
@@ -291,6 +306,51 @@
 out:
 }
 
+SYSCALL_DEFINE5(fvrange, int, fd, size_t, offset,
+               size_t, len, int, mode, int, behavior)
+{
+       struct fd f = fdget(fd);
+       struct address_space *mapping;
+       u64 start = offset;
+       u64 end;
+       int ret = -EINVAL;
+
+       if (!f.file)
+               return -EBADF;
+
+       if (S_ISFIFO(file_inode(f.file)->i_mode)) {
+               ret = -ESPIPE;
+               goto out;
+       }
+
+       mapping = f.file->f_mapping;
+       if (!mapping || len < 0) {
+               ret = -EINVAL;
+               goto out;
+       }
+
+       if (start & ~PAGE_MASK)
+               goto out;
+
+
+       len &= PAGE_MASK;
+       if (!len)
+               goto out;
+
+       end = start + len;
+       if (end < start)
+               goto out;
+
+       if (mode == VRANGE_VOLATILE)
+               ret = add_vrange(&mapping->vroot, start, end - 1);
+       else if (mode == VRANGE_NOVOLATILE)
+               ret = remove_vrange(&mapping->vroot, start, end - 1);
+out:
+       fdput(f);
+       return ret;
+}
+
+
 static bool __vrange_address(struct mm_struct *mm,
                        unsigned long start, unsigned long end)
 {
@@ -641,6 +701,9 @@ unsigned int discard_vrange(struct zone *zone, struct vrange *vrange,
 
        mm = vrange_get_owner_mm(vrange);
 
+       if (!mm)
+               goto out;
+
        if (!down_read_trylock(&mm->mmap_sem))
                goto out;
 
@@ -683,6 +746,12 @@ static struct vrange *get_victim_vrange(void)
        list_for_each_prev_safe(cur, tmp, &lru_vrange) {
                vrange = list_entry(cur, struct vrange, lru);
                mm = vrange_get_owner_mm(vrange);
+
+               if (!mm) {
+                       vrange = NULL;
+                       continue;
+               }
+
                /* the process is exiting so pass it */
                if (atomic_read(&mm->mm_users) == 0) {
                        list_del_init(&vrange->lru);
@@ -720,7 +789,8 @@ static void put_victim_range(struct vrange *vrange)
        struct mm_struct *mm = vrange_get_owner_mm(vrange);
 
        put_vrange(vrange);
-       mmdrop(mm);
+       if (mm)
+               mmdrop(mm);
 }
 
 unsigned int discard_vrange_pages(struct zone *zone, int nr_to_discard)