From patchwork Sat Jul 28 03:57:09 2012
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: John Stultz
X-Patchwork-Id: 10347
From: John Stultz
To: LKML
Cc: John Stultz, Andrew Morton, Android Kernel Team, Robert Love,
	Mel Gorman, Hugh Dickins, Dave Hansen, Rik van Riel,
	Dmitry Adamushko, Dave Chinner, Neil Brown, Andrea Righi,
	"Aneesh Kumar K.V", Mike Hommey, Jan Kara, KOSAKI Motohiro,
	Michel Lespinasse, Minchan Kim,
	"linux-mm@kvack.org"
Subject: [PATCH 2/5] [RFC] tmpfs: Add FALLOC_FL_MARK_VOLATILE/UNMARK_VOLATILE handlers
Date: Fri, 27 Jul 2012 23:57:09 -0400
Message-Id: <1343447832-7182-3-git-send-email-john.stultz@linaro.org>
X-Mailer: git-send-email 1.7.9.5
In-Reply-To: <1343447832-7182-1-git-send-email-john.stultz@linaro.org>
References: <1343447832-7182-1-git-send-email-john.stultz@linaro.org>

This patch enables FALLOC_FL_MARK_VOLATILE/UNMARK_VOLATILE
functionality for tmpfs, making use of the volatile range management
code.

Conceptually, FALLOC_FL_MARK_VOLATILE is like a delayed
FALLOC_FL_PUNCH_HOLE. It allows applications with re-creatable data
caches to tell the kernel that some memory contains data that is
useful in the future but can be recreated if needed, so that if the
kernel needs the memory, it can zap it without having to swap it out.

In use, applications mark page ranges volatile with
FALLOC_FL_MARK_VOLATILE when the ranges are not in use. Later, if they
want to reuse the data, they use FALLOC_FL_UNMARK_VOLATILE, which
returns an error if the data has been purged.

This is very much influenced by the Android ashmem interface by
Robert Love, so credit to him and the Android developers. In many
cases the code & logic come directly from the ashmem patch. The intent
of this patch is to allow for ashmem-like behavior, but to embed the
idea a little deeper into the VM code.

This is a reworked version of the fadvise volatile idea submitted
earlier to the list. Thanks to Dave Chinner for suggesting to rework
the idea in this fashion. Also thanks to Dmitry Adamushko for
continued review and bug reporting, and Dave Hansen for help with the
original design and mentoring me in the VM code.
v3:
* Fix off-by-one issue when truncating page ranges
* Use Dave Hansen's suggestion to use shmem_writepage to trigger
  range purging instead of using a shrinker.
v4:
* Revert the shrinker removal, since writepage won't get called if
  we don't have swap.
v5:
* Cleanups

CC: Andrew Morton
CC: Android Kernel Team
CC: Robert Love
CC: Mel Gorman
CC: Hugh Dickins
CC: Dave Hansen
CC: Rik van Riel
CC: Dmitry Adamushko
CC: Dave Chinner
CC: Neil Brown
CC: Andrea Righi
CC: Aneesh Kumar K.V
CC: Mike Hommey
CC: Jan Kara
CC: KOSAKI Motohiro
CC: Michel Lespinasse
CC: Minchan Kim
CC: linux-mm@kvack.org
Signed-off-by: John Stultz
---
 fs/open.c              |    3 +-
 include/linux/falloc.h |    7 +--
 mm/shmem.c             |  113 ++++++++++++++++++++++++++++++++++++++++++++++++
 3 files changed, 119 insertions(+), 4 deletions(-)

diff --git a/fs/open.c b/fs/open.c
index 1e914b3..421a97c 100644
--- a/fs/open.c
+++ b/fs/open.c
@@ -223,7 +223,8 @@ int do_fallocate(struct file *file, int mode, loff_t offset, loff_t len)
 		return -EINVAL;
 
 	/* Return error if mode is not supported */
-	if (mode & ~(FALLOC_FL_KEEP_SIZE | FALLOC_FL_PUNCH_HOLE))
+	if (mode & ~(FALLOC_FL_KEEP_SIZE | FALLOC_FL_PUNCH_HOLE |
+		     FALLOC_FL_MARK_VOLATILE | FALLOC_FL_UNMARK_VOLATILE))
 		return -EOPNOTSUPP;
 
 	/* Punch hole must have keep size set */
diff --git a/include/linux/falloc.h b/include/linux/falloc.h
index 73e0b62..3e47ad5 100644
--- a/include/linux/falloc.h
+++ b/include/linux/falloc.h
@@ -1,9 +1,10 @@
 #ifndef _FALLOC_H_
 #define _FALLOC_H_
 
-#define FALLOC_FL_KEEP_SIZE	0x01 /* default is extend size */
-#define FALLOC_FL_PUNCH_HOLE	0x02 /* de-allocates range */
-
+#define FALLOC_FL_KEEP_SIZE	0x01 /* default is extend size */
+#define FALLOC_FL_PUNCH_HOLE	0x02 /* de-allocates range */
+#define FALLOC_FL_MARK_VOLATILE	0x04 /* mark range volatile */
+#define FALLOC_FL_UNMARK_VOLATILE	0x08 /* mark range non-volatile */
 
 #ifdef __KERNEL__
 /*
diff --git a/mm/shmem.c b/mm/shmem.c
index c15b998..e5ce04c 100644
--- a/mm/shmem.c
+++ b/mm/shmem.c
@@ -64,6 +64,7 @@ static struct vfsmount *shm_mnt;
 #include
 #include
 #include
+#include
 #include
 #include
@@ -633,6 +634,103 @@ static int shmem_setattr(struct dentry *dentry, struct iattr *attr)
 	return error;
 }
 
+static DEFINE_VOLATILE_FS_HEAD(shmem_volatile_head);
+
+static int shmem_mark_volatile(struct inode *inode, loff_t offset, loff_t len)
+{
+	pgoff_t start, end;
+	int ret;
+
+	start = offset >> PAGE_CACHE_SHIFT;
+	end = (offset+len) >> PAGE_CACHE_SHIFT;
+
+	volatile_range_lock(&shmem_volatile_head);
+	ret = volatile_range_add(&shmem_volatile_head, &inode->i_data,
+							start, end);
+	if (ret > 0) { /* immediately purge */
+		shmem_truncate_range(inode,
+			((loff_t) start << PAGE_CACHE_SHIFT),
+			((loff_t) end << PAGE_CACHE_SHIFT)-1);
+		ret = 0;
+	}
+	volatile_range_unlock(&shmem_volatile_head);
+
+	return ret;
+}
+
+static int shmem_unmark_volatile(struct inode *inode, loff_t offset, loff_t len)
+{
+	pgoff_t start, end;
+	int ret;
+
+	start = offset >> PAGE_CACHE_SHIFT;
+	end = (offset+len) >> PAGE_CACHE_SHIFT;
+
+	volatile_range_lock(&shmem_volatile_head);
+	ret = volatile_range_remove(&shmem_volatile_head, &inode->i_data,
+							start, end);
+	volatile_range_unlock(&shmem_volatile_head);
+
+	return ret;
+}
+
+static void shmem_clear_volatile(struct inode *inode)
+{
+	volatile_range_lock(&shmem_volatile_head);
+	volatile_range_clear(&shmem_volatile_head, &inode->i_data);
+	volatile_range_unlock(&shmem_volatile_head);
+}
+
+static
+int shmem_volatile_shrink(struct shrinker *ignored, struct shrink_control *sc)
+{
+	s64 nr_to_scan = sc->nr_to_scan;
+	const gfp_t gfp_mask = sc->gfp_mask;
+	struct address_space *mapping;
+	pgoff_t start, end;
+	int ret;
+	s64 page_count;
+
+	if (nr_to_scan && !(gfp_mask & __GFP_FS))
+		return -1;
+
+	volatile_range_lock(&shmem_volatile_head);
+	page_count = volatile_range_lru_size(&shmem_volatile_head);
+	if (!nr_to_scan)
+		goto out;
+
+	do {
+		ret = volatile_ranges_pluck_lru(&shmem_volatile_head,
+						&mapping, &start, &end);
+		if (ret) {
+			shmem_truncate_range(mapping->host,
+				((loff_t) start << PAGE_CACHE_SHIFT),
+				((loff_t) end << PAGE_CACHE_SHIFT)-1);
+
+			nr_to_scan -= end-start;
+			page_count -= end-start;
+		}
+	} while (ret && (nr_to_scan > 0));
+
+out:
+	volatile_range_unlock(&shmem_volatile_head);
+
+	return page_count;
+}
+
+static struct shrinker shmem_volatile_shrinker = {
+	.shrink = shmem_volatile_shrink,
+	.seeks = DEFAULT_SEEKS,
+};
+
+static int __init shmem_shrinker_init(void)
+{
+	register_shrinker(&shmem_volatile_shrinker);
+	return 0;
+}
+arch_initcall(shmem_shrinker_init);
+
+
 static void shmem_evict_inode(struct inode *inode)
 {
 	struct shmem_inode_info *info = SHMEM_I(inode);
@@ -1730,6 +1828,14 @@ static long shmem_fallocate(struct file *file, int mode, loff_t offset,
 		/* No need to unmap again: hole-punching leaves COWed pages */
 		error = 0;
 		goto out;
+	} else if (mode & FALLOC_FL_MARK_VOLATILE) {
+		/* Mark pages volatile, sort of delayed hole punching */
+		error = shmem_mark_volatile(inode, offset, len);
+		goto out;
+	} else if (mode & FALLOC_FL_UNMARK_VOLATILE) {
+		/* Mark pages non-volatile, return error if pages were purged */
+		error = shmem_unmark_volatile(inode, offset, len);
+		goto out;
 	}
 
 	/* We need to check rlimit even when FALLOC_FL_KEEP_SIZE */
@@ -1808,6 +1914,12 @@ out:
 	return error;
 }
 
+static int shmem_release(struct inode *inode, struct file *file)
+{
+	shmem_clear_volatile(inode);
+	return 0;
+}
+
 static int shmem_statfs(struct dentry *dentry, struct kstatfs *buf)
 {
 	struct shmem_sb_info *sbinfo = SHMEM_SB(dentry->d_sb);
@@ -2719,6 +2831,7 @@ static const struct file_operations shmem_file_operations = {
 	.splice_read	= shmem_file_splice_read,
 	.splice_write	= generic_file_splice_write,
 	.fallocate	= shmem_fallocate,
+	.release	= shmem_release,
 #endif
 };