[PATCH v5 00/18] slab accounting for memcg
Re: [PATCH v5 06/18] consider a memcg parameter in kmem_create_cache [message #48622 is a reply to message #48576] Thu, 25 October 2012 13:42
Glauber Costa
On 10/23/2012 09:50 PM, JoonSoo Kim wrote:
>> -struct kmem_cache *__kmem_cache_alias(const char *name, size_t size,
>> -			size_t align, unsigned long flags, void (*ctor)(void *))
>> +struct kmem_cache *
>> +__kmem_cache_alias(struct mem_cgroup *memcg, const char *name, size_t size,
>> +			size_t align, unsigned long flags, void (*ctor)(void *))
>>  {
>>  	struct kmem_cache *s;
>>
>> -	s = find_mergeable(size, align, flags, name, ctor);
>> +	s = find_mergeable(memcg, size, align, flags, name, ctor);
>>  	if (s) {
>>  		s->refcount++;
>>  		/*
> If your intention is that find_mergeable() should work properly for memcg slab caches,
> it cannot work properly with this code.
> When memcg is not NULL, the slab cache is only added to the memcg's slab cache list,
> and find_mergeable() only iterates over the original slab cache list.
> So a memcg slab cache can never be merged.

Actually, recent results made me reconsider this.

I split this into multiple lists so we could traverse the lists faster
for /proc/slabinfo.

It turns out there are many places that rely on the ability to scan
through *all* caches in the system (root or not). This is one (easily
fixable) example, but there are others, like the hotplug handlers.

That said, I don't think /proc/slabinfo is *that* performance
sensitive, so it is better to skip the non-root caches there and
keep all caches in the global list.

We might still benefit from a memcg-side list, for example when
we're destroying a memcg, so I'll consider keeping that (with a list
field in memcg_params). But even for that case, it is still doable to
traverse the whole list...
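
For illustration, a minimal sketch of that direction: one global list, with
the /proc/slabinfo walk skipping the memcg clones. is_root_cache() follows
the is_root_cache flag in memcg_params from this series;
print_slabinfo_entry() is a hypothetical stand-in for the actual printing
code.

static bool is_root_cache(struct kmem_cache *s)
{
	/* Assumption: caches without memcg_params, or with the flag
	 * set, are root caches; everything else is a memcg clone. */
	return !s->memcg_params || s->memcg_params->is_root_cache;
}

static int s_show(struct seq_file *m, void *p)
{
	struct kmem_cache *s = list_entry(p, struct kmem_cache, list);

	/* memcg clones stay on the global list but are not shown. */
	if (!is_root_cache(s))
		return 0;

	return print_slabinfo_entry(m, s);	/* hypothetical helper */
}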
Re: [PATCH v5 08/18] memcg: infrastructure to match an allocation to the right cache [message #48629 is a reply to message #48617] Thu, 25 October 2012 18:06
Tejun Heo
Hello, Glauber.

On Thu, Oct 25, 2012 at 03:05:22PM +0400, Glauber Costa wrote:
> > Is there any rmb() pair?
> > As far as I know, without rmb(), wmb() doesn't guarantee anything.
> >
>
> There should be. But it seems I missed it. Speaking of which, I should

You probably can use read_barrier_depends().

> add a wmb() after the NULL assignment in the cache release path as well.

And you probably don't need that. The dangling pointer is protected
by RCU, and there aren't any memory accesses that could pair with it
anyway.

Thanks.

--
tejun
Re: [PATCH v5 08/18] memcg: infrastructure to match an allocation to the right cache [message #48630 is a reply to message #48629] Thu, 25 October 2012 18:08
Tejun Heo
On Thu, Oct 25, 2012 at 11:06:52AM -0700, Tejun Heo wrote:
> Hello, Glauber.
>
> On Thu, Oct 25, 2012 at 03:05:22PM +0400, Glauber Costa wrote:
> > > Is there any rmb() pair?
> > > As far as I know, without rmb(), wmb() doesn't guarantee anything.
> > >
> >
> > There should be. But it seems I missed it. Speaking of which, I should
>
> You probably can use read_barrier_depends().

Ooh, and when you use memory barrier pairs, please make sure to
comment on both sides explaining how they're paired and why. There's
nothing more frustrating than barriers without clear annotation.
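
For example, a minimal sketch of such an annotated pair, using the
memcg_caches array from this series (the surrounding setup is invented
for illustration):

/* Publisher (e.g. when a new memcg cache comes online): */
	init_memcg_cache(new_cachep);	/* hypothetical setup */
	/*
	 * Ensure the cache is fully initialized before the pointer is
	 * published. Pairs with read_barrier_depends() in the reader.
	 */
	smp_wmb();
	s->memcg_params->memcg_caches[idx] = new_cachep;

/* Reader (e.g. memcg_kmem_get_cache()): */
	cachep = s->memcg_params->memcg_caches[idx];
	/*
	 * Pairs with smp_wmb() in the publisher: dependent loads
	 * through cachep must see its initialized contents.
	 */
	read_barrier_depends();
	if (cachep)
		return cachep;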

Thanks.

--
tejun
Re: [PATCH v5 11/18] sl[au]b: Allocate objects from memcg cache [message #48673 is a reply to message #48523] Mon, 29 October 2012 15:14
JoonSoo Kim
Hi, Glauber.

2012/10/19 Glauber Costa <glommer@parallels.com>:
> We are able to match a cache allocation to a particular memcg. If the
> task doesn't change groups during the allocation itself - a rare event -
> this will give us a good picture of which group is the first to touch a
> cache page.
>
> This patch uses the now available infrastructure by calling
> memcg_kmem_get_cache() before all the cache allocations.
>
> Signed-off-by: Glauber Costa <glommer@parallels.com>
> CC: Christoph Lameter <cl@linux.com>
> CC: Pekka Enberg <penberg@cs.helsinki.fi>
> CC: Michal Hocko <mhocko@suse.cz>
> CC: Kamezawa Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
> CC: Johannes Weiner <hannes@cmpxchg.org>
> CC: Suleiman Souhlal <suleiman@google.com>
> CC: Tejun Heo <tj@kernel.org>
> ---
>  include/linux/slub_def.h | 15 ++++++++++-----
>  mm/memcontrol.c          |  3 +++
>  mm/slab.c                |  6 +++++-
>  mm/slub.c                |  5 +++--
>  4 files changed, 21 insertions(+), 8 deletions(-)
>
> diff --git a/include/linux/slub_def.h b/include/linux/slub_def.h
> index 961e72e..ed330df 100644
> --- a/include/linux/slub_def.h
> +++ b/include/linux/slub_def.h
> @@ -13,6 +13,8 @@
>  #include <linux/kobject.h>
>
>  #include <linux/kmemleak.h>
> +#include <linux/memcontrol.h>
> +#include <linux/mm.h>
>
>  enum stat_item {
>  	ALLOC_FASTPATH,		/* Allocation from cpu slab */
> @@ -209,14 +211,14 @@ static __always_inline int kmalloc_index(size_t size)
>   * This ought to end up with a global pointer to the right cache
>   * in kmalloc_caches.
>   */
> -static __always_inline struct kmem_cache *kmalloc_slab(size_t size)
> +static __always_inline struct kmem_cache *kmalloc_slab(gfp_t flags, size_t size)
>  {
>  	int index = kmalloc_index(size);
>
>  	if (index == 0)
>  		return NULL;
>
> -	return kmalloc_caches[index];
> +	return memcg_kmem_get_cache(kmalloc_caches[index], flags);
>  }

You don't need this hunk,
because memcg_kmem_get_cache() is already invoked in both slab_alloc() and
__cache_alloc_node().
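
For reference, a simplified sketch of the fast path as modified by this
series, showing why a second lookup in kmalloc_slab() would be redundant
(do_slab_alloc() is a hypothetical stand-in for the rest of mm/slub.c's
allocation path):

static __always_inline void *slab_alloc(struct kmem_cache *s,
					gfp_t gfpflags, unsigned long addr)
{
	/*
	 * Redirect to the allocating memcg's clone of this cache.
	 * Every kmalloc()/kmem_cache_alloc() funnels through here,
	 * so the translation already happens once per allocation.
	 */
	s = memcg_kmem_get_cache(s, gfpflags);

	return do_slab_alloc(s, gfpflags, addr);
}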
Re: [PATCH v5 11/18] sl[au]b: Allocate objects from memcg cache [message #48674 is a reply to message #48673] Mon, 29 October 2012 15:19
Glauber Costa
On 10/29/2012 07:14 PM, JoonSoo Kim wrote:
> Hi, Glauber.
>
> 2012/10/19 Glauber Costa <glommer@parallels.com>:
>> We are able to match a cache allocation to a particular memcg. If the
>> task doesn't change groups during the allocation itself - a rare event -
>> this will give us a good picture of which group is the first to touch a
>> cache page.
>>
>> This patch uses the now available infrastructure by calling
>> memcg_kmem_get_cache() before all the cache allocations.
>>
>> Signed-off-by: Glauber Costa <glommer@parallels.com>
>> CC: Christoph Lameter <cl@linux.com>
>> CC: Pekka Enberg <penberg@cs.helsinki.fi>
>> CC: Michal Hocko <mhocko@suse.cz>
>> CC: Kamezawa Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
>> CC: Johannes Weiner <hannes@cmpxchg.org>
>> CC: Suleiman Souhlal <suleiman@google.com>
>> CC: Tejun Heo <tj@kernel.org>
>> ---
>>  include/linux/slub_def.h | 15 ++++++++++-----
>>  mm/memcontrol.c          |  3 +++
>>  mm/slab.c                |  6 +++++-
>>  mm/slub.c                |  5 +++--
>>  4 files changed, 21 insertions(+), 8 deletions(-)
>>
>> diff --git a/include/linux/slub_def.h b/include/linux/slub_def.h
>> index 961e72e..ed330df 100644
>> --- a/include/linux/slub_def.h
>> +++ b/include/linux/slub_def.h
>> @@ -13,6 +13,8 @@
>>  #include <linux/kobject.h>
>>
>>  #include <linux/kmemleak.h>
>> +#include <linux/memcontrol.h>
>> +#include <linux/mm.h>
>>
>>  enum stat_item {
>>  	ALLOC_FASTPATH,		/* Allocation from cpu slab */
>> @@ -209,14 +211,14 @@ static __always_inline int kmalloc_index(size_t size)
>>   * This ought to end up with a global pointer to the right cache
>>   * in kmalloc_caches.
>>   */
>> -static __always_inline struct kmem_cache *kmalloc_slab(size_t size)
>> +static __always_inline struct kmem_cache *kmalloc_slab(gfp_t flags, size_t size)
>>  {
>>  	int index = kmalloc_index(size);
>>
>>  	if (index == 0)
>>  		return NULL;
>>
>> -	return kmalloc_caches[index];
>> +	return memcg_kmem_get_cache(kmalloc_caches[index], flags);
>>  }
>
> You don't need this hunk,
> because memcg_kmem_get_cache() is already invoked in both slab_alloc() and
> __cache_alloc_node().
>
Indeed, I had already noticed this and fixed it myself - the fix will be in
the next version, which I intend to get out in the open tonight or tomorrow.
Re: [PATCH v5 13/18] memcg/sl[au]b Track all the memcg children of a kmem_cache. [message #48675 is a reply to message #48528] Mon, 29 October 2012 15:26
JoonSoo Kim
2012/10/19 Glauber Costa <glommer@parallels.com>:
> +void kmem_cache_destroy_memcg_children(struct kmem_cache *s)
> +{
> +	struct kmem_cache *c;
> +	int i;
> +
> +	if (!s->memcg_params)
> +		return;
> +	if (!s->memcg_params->is_root_cache)
> +		return;
> +
> +	/*
> +	 * If the cache is being destroyed, we trust that there is no one else
> +	 * requesting objects from it. Even if there are, the sanity checks in
> +	 * kmem_cache_destroy should catch this ill case.
> +	 *
> +	 * Still, we don't want anyone else freeing memcg_caches under our
> +	 * noses, which can happen if a new memcg comes to life. As usual,
> +	 * we'll take the set_limit_mutex to protect ourselves against this.
> +	 */
> +	mutex_lock(&set_limit_mutex);
> +	for (i = 0; i < memcg_limited_groups_array_size; i++) {
> +		c = s->memcg_params->memcg_caches[i];
> +		if (c)
> +			kmem_cache_destroy(c);
> +	}
> +	mutex_unlock(&set_limit_mutex);
> +}

It may cause a NULL dereference.
Look at the following scenario.

1. Some memcg slab caches still have live objects.
2. We start to destroy the memcg.
3. schedule_delayed_work(kmem_cache_destroy_work_func, @delay 60hz)
4. All the remaining objects are freed.
5. We start to destroy the root cache.
6. kmem_cache_destroy() sets s->memcg_params->memcg_caches[i] to NULL!!
7. The delayed work function starts.
8. cachep in kmem_cache_destroy_work_func() may be NULL.

Thanks.
Re: [PATCH v5 13/18] memcg/sl[au]b Track all the memcg children of a kmem_cache. [message #48679 is a reply to message #48675] Tue, 30 October 2012 11:31
Glauber Costa
On 10/29/2012 07:26 PM, JoonSoo Kim wrote:
> 2012/10/19 Glauber Costa <glommer@parallels.com>:
>> +void kmem_cache_destroy_memcg_children(struct kmem_cache *s)
>> +{
>> +	struct kmem_cache *c;
>> +	int i;
>> +
>> +	if (!s->memcg_params)
>> +		return;
>> +	if (!s->memcg_params->is_root_cache)
>> +		return;
>> +
>> +	/*
>> +	 * If the cache is being destroyed, we trust that there is no one else
>> +	 * requesting objects from it. Even if there are, the sanity checks in
>> +	 * kmem_cache_destroy should catch this ill case.
>> +	 *
>> +	 * Still, we don't want anyone else freeing memcg_caches under our
>> +	 * noses, which can happen if a new memcg comes to life. As usual,
>> +	 * we'll take the set_limit_mutex to protect ourselves against this.
>> +	 */
>> +	mutex_lock(&set_limit_mutex);
>> +	for (i = 0; i < memcg_limited_groups_array_size; i++) {
>> +		c = s->memcg_params->memcg_caches[i];
>> +		if (c)
>> +			kmem_cache_destroy(c);
>> +	}
>> +	mutex_unlock(&set_limit_mutex);
>> +}
>
> It may cause a NULL dereference.
> Look at the following scenario.
>
> 1. Some memcg slab caches still have live objects.
> 2. We start to destroy the memcg.
> 3. schedule_delayed_work(kmem_cache_destroy_work_func, @delay 60hz)
> 4. All the remaining objects are freed.
> 5. We start to destroy the root cache.
> 6. kmem_cache_destroy() sets s->memcg_params->memcg_caches[i] to NULL!!
> 7. The delayed work function starts.
> 8. cachep in kmem_cache_destroy_work_func() may be NULL.
>
> Thanks.
>
Thanks for spotting this. It is the same problem we have in
memcg_cache_destroy(), which I solved by not respawning the worker.

Here, I believe it should be possible to just cancel all remaining
pending work, since we are now in the process of deleting the caches
ourselves.
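
A minimal sketch of that direction, assuming each memcg cache carries its
pending destruction as a struct delayed_work in memcg_params (the
destroy_work field name is illustrative):

static void kmem_cache_destroy_memcg_child(struct kmem_cache *c)
{
	/*
	 * The memcg teardown path may have queued delayed destruction
	 * for this cache; cancel it (waiting if it is already running)
	 * so the worker can never see a slot we NULL out below.
	 */
	cancel_delayed_work_sync(&c->memcg_params->destroy_work);
	kmem_cache_destroy(c);
}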