| Message ID | 20151104123640.GK7637@e104818-lin.cambridge.arm.com |
| --- | --- |
| State | New |
On Wed, Nov 04, 2015 at 07:53:50AM -0600, Christoph Lameter wrote:
> On Wed, 4 Nov 2015, Catalin Marinas wrote:
>
> > The simplest option would be to make sure that off slab isn't allowed
> > for caches of KMALLOC_MIN_SIZE or smaller, with the drawback that not
> > only "kmalloc-128" but any other such caches will be on slab.
>
> The reason for an off slab configuration is denser object packing.
>
> > I think a better option would be to first check that there is a
> > kmalloc_caches[] entry for freelist_size before deciding to go off-slab.
>
> Hmmm.. Yes seems to be an option.
>
> Maybe we simply revert commit 8fc9cf420b36 instead?

I'm fine with this. Also note that the arm64 commit changing
L1_CACHE_BYTES to 128 hasn't been pushed yet (it's queued for 4.4).

> That does not seem to make too much sense to me and the goal of the
> commit cannot be accomplished on ARM. Your patch essentially reverts
> the effect anyways.

In theory it only reverts the effect for the first kmalloc cache
("kmalloc-128" in the arm64 case). Any other, bigger cache which would
not be mergeable with an existing one still has the potential of
off-slab management.

> Smaller slabs really do not need off slab management anyways since they
> will only lose a few objects per slab page.

IIUC, starting with a 128-byte object size and a 4KB page, you have 32
objects per page. The freelist takes 32 bytes (or 31), so you waste a
single slab object. However, only a quarter of that object is actually
used for the freelist, and the waste grows with a 256-byte object size,
hence the original commit.

BTW, assuming L1_CACHE_BYTES is 512 (I don't ever see this happening,
but just in theory), we potentially have the same issue. What would
save us is that INDEX_NODE would match the first "kmalloc-512" cache,
so we have it pre-populated.

--
Catalin
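To make the freelist arithmetic above concrete, here is a minimal user-space sketch (not kernel code); it assumes a 4KB page and approximates calculate_freelist_size() as one index byte per object, which holds whenever a slab holds fewer than 256 objects:

```c
#include <stdio.h>

#define PAGE_SIZE 4096UL

int main(void)
{
	unsigned long sizes[] = { 128, 256 };

	for (unsigned i = 0; i < sizeof(sizes) / sizeof(sizes[0]); i++) {
		unsigned long size = sizes[i];
		unsigned long num = PAGE_SIZE / size;	/* objects per slab page */
		unsigned long freelist = num;		/* one index byte per object */

		/* On-slab management sacrifices one object to hold the freelist. */
		printf("size %4lu: %2lu objects/page, freelist %2lu bytes, "
		       "%3lu%% of the sacrificed object used\n",
		       size, num, freelist, 100 * freelist / size);
	}
	return 0;
}
```

This prints 25% usage for 128-byte objects (96 of 128 bytes wasted) but only 6% for 256-byte objects (240 of 256 bytes wasted), which is the growing waste that motivated commit 8fc9cf420b36 to move such freelists off-slab.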
diff --git a/mm/slab.c b/mm/slab.c
index 4fcc5dd8d5a6..d4a21736eb5d 100644
--- a/mm/slab.c
+++ b/mm/slab.c
@@ -2246,16 +2246,33 @@ __kmem_cache_create (struct kmem_cache *cachep, unsigned long flags)
 
 	if (flags & CFLGS_OFF_SLAB) {
 		/* really off slab. No need for manual alignment */
-		freelist_size = calculate_freelist_size(cachep->num, 0);
+		size_t off_freelist_size = calculate_freelist_size(cachep->num, 0);
+
+		cachep->freelist_cache = kmalloc_slab(off_freelist_size, 0u);
+		if (ZERO_OR_NULL_PTR(cachep->freelist_cache)) {
+			/*
+			 * We don't have kmalloc_caches[] populated for
+			 * off_freelist_size yet. This can happen during
+			 * create_kmalloc_caches() when KMALLOC_MIN_SIZE >=
+			 * (PAGE_SIZE >> 5) and CFLGS_OFF_SLAB is set. Move
+			 * the cache on-slab.
+			 */
+			flags &= ~CFLGS_OFF_SLAB;
+			left_over = calculate_slab_order(cachep, size,
+							 cachep->align, flags);
+		} else {
+			freelist_size = off_freelist_size;
 
 #ifdef CONFIG_PAGE_POISONING
-		/* If we're going to use the generic kernel_map_pages()
-		 * poisoning, then it's going to smash the contents of
-		 * the redzone and userword anyhow, so switch them off.
-		 */
-		if (size % PAGE_SIZE == 0 && flags & SLAB_POISON)
-			flags &= ~(SLAB_RED_ZONE | SLAB_STORE_USER);
+			/*
+			 * If we're going to use the generic kernel_map_pages()
+			 * poisoning, then it's going to smash the contents of
+			 * the redzone and userword anyhow, so switch them off.
+			 */
+			if (size % PAGE_SIZE == 0 && flags & SLAB_POISON)
+				flags &= ~(SLAB_RED_ZONE | SLAB_STORE_USER);
 #endif
+		}
 	}
 
 	cachep->colour_off = cache_line_size();
@@ -2271,18 +2288,6 @@ __kmem_cache_create (struct kmem_cache *cachep, unsigned long flags)
 	cachep->size = size;
 	cachep->reciprocal_buffer_size = reciprocal_value(size);
 
-	if (flags & CFLGS_OFF_SLAB) {
-		cachep->freelist_cache = kmalloc_slab(freelist_size, 0u);
-		/*
-		 * This is a possibility for one of the kmalloc_{dma,}_caches.
-		 * But since we go off slab only for object size greater than
-		 * PAGE_SIZE/8, and kmalloc_{dma,}_caches get created
-		 * in ascending order,this should not happen at all.
-		 * But leave a BUG_ON for some lucky dude.
-		 */
-		BUG_ON(ZERO_OR_NULL_PTR(cachep->freelist_cache));
-	}
-
 	err = setup_cpu_cache(cachep, gfp);
 	if (err) {
 		__kmem_cache_shutdown(cachep);
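For readers following along outside the kernel tree, here is a hedged user-space sketch of the failure mode the patch guards against; the values are assumptions taken from the discussion above (4KB pages, the size >= PAGE_SIZE >> 5 off-slab test from commit 8fc9cf420b36, KMALLOC_MIN_SIZE == 128 on arm64, one freelist index byte per object), not a copy of the mm/slab.c logic:

```c
#include <stdio.h>

#define PAGE_SIZE		4096UL
#define KMALLOC_MIN_SIZE	128UL	/* arm64 with L1_CACHE_BYTES == 128 */

int main(void)
{
	/* First kmalloc cache created during bootstrap: "kmalloc-128". */
	unsigned long size = KMALLOC_MIN_SIZE;
	unsigned long num = PAGE_SIZE / size;	/* 32 objects per slab page */
	unsigned long freelist_size = num;	/* 32 bytes, one byte per object */

	if (size >= (PAGE_SIZE >> 5)) {
		/*
		 * Off-slab is selected, so the freelist must come from a
		 * kmalloc cache. A 32-byte request rounds up to the smallest
		 * cache, "kmalloc-128" -- the very cache being created, whose
		 * kmalloc_caches[] slot is still NULL. The old code hit the
		 * BUG_ON() here; the patch falls back to on-slab instead.
		 */
		printf("%lu-byte freelist -> kmalloc-%lu, not populated yet\n",
		       freelist_size, KMALLOC_MIN_SIZE);
	}
	return 0;
}
```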