OpenVZ Forum: Devel » [PATCH v3 00/13] kmem controller for memcg.

Home » Mailing lists » Devel » [PATCH v3 00/13] kmem controller for memcg.

Show: Today's Messages :: Show Polls :: Message Navigator
E-mail to friend

Re: [PATCH v3 12/13] execute the whole memcg freeing in rcu callback [message #48209 is a reply to message #48205]

Thu, 04 October 2012 14:20

Glauber Costa
Messages: 916
Registered: October 2011

Senior Member

On 10/04/2012 02:53 PM, Glauber Costa wrote:
> On 10/01/2012 05:27 PM, Michal Hocko wrote:
>> On Tue 18-09-12 18:04:09, Glauber Costa wrote:
>>> A lot of the initialization we do in mem_cgroup_create() is done with softirqs
>>> enabled. This include grabbing a css id, which holds &ss->id_lock->rlock, and
>>> the per-zone trees, which holds rtpz->lock->rlock. All of those signal to the
>>> lockdep mechanism that those locks can be used in SOFTIRQ-ON-W context. This
>>> means that the freeing of memcg structure must happen in a compatible context,
>>> otherwise we'll get a deadlock.
>>
>> Maybe I am missing something obvious but why cannot we simply disble
>> (soft)irqs in mem_cgroup_create rather than make the free path much more
>> complicated. It really feels strange to defer everything (e.g. soft
>> reclaim tree cleanup which should be a no-op at the time because there
>> shouldn't be any user pages in the group).
>>
>
> Ok.
>
> I was just able to come back to this today - I was mostly working on the
> slab feedback over the past few days. I will answer yours and Tejun's
> concerns at once:
>
> Here is the situation: the backtrace I get is this one:
>
> [ 124.956725] =================================
> [ 124.957217] [ INFO: inconsistent lock state ]
> [ 124.957217] 3.5.0+ #99 Not tainted
> [ 124.957217] ---------------------------------
> [ 124.957217] inconsistent {SOFTIRQ-ON-W} -> {IN-SOFTIRQ-W} usage.
> [ 124.957217] ksoftirqd/0/3 [HC0[0]:SC1[1]:HE1:SE0] takes:
> [ 124.957217] (&(&ss->id_lock)->rlock){+.?...}, at:
> [<ffffffff810aa7b2>] spin_lock+0x9/0xb
> [ 124.957217] {SOFTIRQ-ON-W} state was registered at:
> [ 124.957217] [<ffffffff810996ed>] __lock_acquire+0x31f/0xd68
> [ 124.957217] [<ffffffff8109a660>] lock_acquire+0x108/0x15c
> [ 124.957217] [<ffffffff81534ec4>] _raw_spin_lock+0x40/0x4f
> [ 124.957217] [<ffffffff810aa7b2>] spin_lock+0x9/0xb
> [ 124.957217] [<ffffffff810ad00e>] get_new_cssid+0x69/0xf3
> [ 124.957217] [<ffffffff810ad0da>] cgroup_init_idr+0x42/0x60
> [ 124.957217] [<ffffffff81b20e04>] cgroup_init+0x50/0x100
> [ 124.957217] [<ffffffff81b05b9b>] start_kernel+0x3b9/0x3ee
> [ 124.957217] [<ffffffff81b052d6>] x86_64_start_reservations+0xb1/0xb5
> [ 124.957217] [<ffffffff81b053d8>] x86_64_start_kernel+0xfe/0x10b
>
>
> So what we learn from it, is: we are acquiring a specific lock (the css
> id one) from softirq context. It was previously taken in a
> softirq-enabled context, that seems to be coming directly from
> get_new_cssid.
>
> Tejun correctly pointed out that we should never acquire that lock from
> a softirq context, in which he is right.
>
> But the situation changes slightly with kmem. Now, the following excerpt
> of a backtrace is possible:
>
> [ 48.602775] [<ffffffff81103095>] free_accounted_pages+0x47/0x4c
> [ 48.602775] [<ffffffff81047f90>] free_task+0x31/0x5c
> [ 48.602775] [<ffffffff8104807d>] __put_task_struct+0xc2/0xdb
> [ 48.602775] [<ffffffff8104dfc7>] put_task_struct+0x1e/0x22
> [ 48.602775] [<ffffffff8104e144>] delayed_put_task_struct+0x7a/0x98
> [ 48.602775] [<ffffffff810cf0e5>] __rcu_process_callbacks+0x269/0x3df
> [ 48.602775] [<ffffffff810cf28c>] rcu_process_callbacks+0x31/0x5b
> [ 48.602775] [<ffffffff8105266d>] __do_softirq+0x122/0x277
>
> So as you can see, free_accounted_pages (that will trigger a memcg_put()
> -> mem_cgroup_free()) can now be called from softirq context, which is,
> an rcu callback (and I just realized I wrote the exact opposite in the
> subj line: man, I really suck at that!!)
> As a matter of fact, we could not move to our rcu callback as well:
>
> we need to move it to a worker thread with the rest.
>
> We already have a worker thread: he reason we have it is not
> static_branches: The reason is vfree(), that will BUG_ON(in_interrupt())
> and could not be called from rcu callback as well. We moved static
> branches in there as well for a similar problem, but haven't introduced it.
>
> Could we move just part of it to the worker thread? Absolutely yes.
> Moving just free_css_id() is enough to make it work. But since it is not
> the first context related problem we had, I thought: "to hell with that,
> let's move everything and be safe".
>
> I am fine moving free_css_id() only if you would prefer.
>
> Can we disable softirqs when we initialize css_id? Maybe. My machine
> seems to boot fine and survive the simple workload that would trigger
> that bug if I use irqsave spinlocks instead of normal spinlocks. But
> this has to be done from cgroup core: We have no control over css
> creation in memcg.
>
> How would you guys like me to handle this ?

One more thing: As I mentioned in the Changelog,
mem_cgroup_remove_exceeded(), called from mem_cgroup_remove_from_trees()
will lead to the same usage pattern.

Report message to a moderator

[Message index]

		[PATCH v3 00/13] kmem controller for memcg. By: Glauber Costa on Tue, 18 September 2012 14:03
		[PATCH v3 03/13] memcg: change defines to an enum By: Glauber Costa on Tue, 18 September 2012 14:04
		Re: [PATCH v3 03/13] memcg: change defines to an enum By: Johannes Weiner on Mon, 01 October 2012 19:06
		Re: [PATCH v3 03/13] memcg: change defines to an enum By: Glauber Costa on Tue, 02 October 2012 09:10
		[PATCH v3 12/13] execute the whole memcg freeing in rcu callback By: Glauber Costa on Tue, 18 September 2012 14:04
		Re: [PATCH v3 12/13] execute the whole memcg freeing in rcu callback By: Tejun Heo on Fri, 21 September 2012 17:23
		Re: [PATCH v3 12/13] execute the whole memcg freeing in rcu callback By: Glauber Costa on Mon, 24 September 2012 08:48
		Re: [PATCH v3 12/13] execute the whole memcg freeing in rcu callback By: Michal Hocko on Mon, 01 October 2012 13:27
		Re: [PATCH v3 12/13] execute the whole memcg freeing in rcu callback By: Glauber Costa on Thu, 04 October 2012 10:53
		Re: [PATCH v3 12/13] execute the whole memcg freeing in rcu callback By: Glauber Costa on Thu, 04 October 2012 14:20
		Re: [PATCH v3 12/13] execute the whole memcg freeing in rcu callback By: Johannes Weiner on Fri, 05 October 2012 15:31
		Re: [PATCH v3 12/13] execute the whole memcg freeing in rcu callback By: Glauber Costa on Mon, 08 October 2012 09:45
		[PATCH v3 01/13] memcg: Make it possible to use the stock for more than one page. By: Glauber Costa on Tue, 18 September 2012 14:03
		Re: [PATCH v3 01/13] memcg: Make it possible to use the stock for more than one page. By: Johannes Weiner on Mon, 01 October 2012 18:48
		[PATCH v3 07/13] mm: Allocate kernel pages to the right memcg By: Glauber Costa on Tue, 18 September 2012 14:04
		Re: [PATCH v3 07/13] mm: Allocate kernel pages to the right memcg By: Mel Gorman on Thu, 27 September 2012 13:50
		Re: [PATCH v3 07/13] mm: Allocate kernel pages to the right memcg By: Glauber Costa on Fri, 28 September 2012 09:43
		Re: [PATCH v3 07/13] mm: Allocate kernel pages to the right memcg By: Mel Gorman on Fri, 28 September 2012 13:28
		Re: [PATCH v3 07/13] mm: Allocate kernel pages to the right memcg By: Michal Hocko on Thu, 27 September 2012 13:52
		[PATCH v3 10/13] memcg: use static branches when code not in use By: Glauber Costa on Tue, 18 September 2012 14:04
		Re: [PATCH v3 10/13] memcg: use static branches when code not in use By: Michal Hocko on Mon, 01 October 2012 12:25
		Re: [PATCH v3 10/13] memcg: use static branches when code not in use By: Glauber Costa on Mon, 01 October 2012 12:27
		[PATCH v3 06/13] memcg: kmem controller infrastructure By: Glauber Costa on Tue, 18 September 2012 14:04
		Re: [PATCH v3 06/13] memcg: kmem controller infrastructure By: Glauber Costa on Fri, 21 September 2012 08:41
		Re: [PATCH v3 06/13] memcg: kmem controller infrastructure By: JoonSoo Kim on Fri, 21 September 2012 09:14
		Re: [PATCH v3 06/13] memcg: kmem controller infrastructure By: JoonSoo Kim on Thu, 20 September 2012 16:05
		Re: [PATCH v3 06/13] memcg: kmem controller infrastructure By: Michal Hocko on Wed, 26 September 2012 15:51
		Re: [PATCH v3 06/13] memcg: kmem controller infrastructure By: Glauber Costa on Thu, 27 September 2012 11:31
		Re: [PATCH v3 06/13] memcg: kmem controller infrastructure By: Michal Hocko on Thu, 27 September 2012 13:44
		Re: [PATCH v3 06/13] memcg: kmem controller infrastructure By: Glauber Costa on Fri, 28 September 2012 11:34
		Re: [PATCH v3 06/13] memcg: kmem controller infrastructure By: Tejun Heo on Sun, 30 September 2012 08:25
		Re: [PATCH v3 06/13] memcg: kmem controller infrastructure By: Glauber Costa on Mon, 01 October 2012 08:28
		Re: [PATCH v3 06/13] memcg: kmem controller infrastructure By: Tejun Heo on Wed, 03 October 2012 22:11
		Re: [PATCH v3 06/13] memcg: kmem controller infrastructure By: Michal Hocko on Mon, 01 October 2012 09:44
		Re: [PATCH v3 06/13] memcg: kmem controller infrastructure By: Michal Hocko on Mon, 01 October 2012 09:48
		Re: [PATCH v3 06/13] memcg: kmem controller infrastructure By: Glauber Costa on Mon, 01 October 2012 10:09
		Re: [PATCH v3 06/13] memcg: kmem controller infrastructure By: Michal Hocko on Mon, 01 October 2012 11:51
		Re: [PATCH v3 06/13] memcg: kmem controller infrastructure By: Glauber Costa on Mon, 01 October 2012 11:51
		Re: [PATCH v3 06/13] memcg: kmem controller infrastructure By: Michal Hocko on Mon, 01 October 2012 11:58
		Re: [PATCH v3 06/13] memcg: kmem controller infrastructure By: Glauber Costa on Mon, 01 October 2012 12:04
		[PATCH v3 11/13] memcg: allow a memcg with kmem charges to be destructed. By: Glauber Costa on Tue, 18 September 2012 14:04
		Re: [PATCH v3 11/13] memcg: allow a memcg with kmem charges to be destructed. By: Michal Hocko on Mon, 01 October 2012 12:30
		[PATCH v3 09/13] memcg: kmem accounting lifecycle management By: Glauber Costa on Tue, 18 September 2012 14:04
		Re: [PATCH v3 09/13] memcg: kmem accounting lifecycle management By: Michal Hocko on Mon, 01 October 2012 12:15
		Re: [PATCH v3 09/13] memcg: kmem accounting lifecycle management By: Glauber Costa on Mon, 01 October 2012 12:29
		Re: [PATCH v3 09/13] memcg: kmem accounting lifecycle management By: Michal Hocko on Mon, 01 October 2012 12:36
		Re: [PATCH v3 09/13] memcg: kmem accounting lifecycle management By: Glauber Costa on Mon, 01 October 2012 12:43
		[PATCH v3 05/13] Add a __GFP_KMEMCG flag By: Glauber Costa on Tue, 18 September 2012 14:04
		Re: [PATCH v3 05/13] Add a __GFP_KMEMCG flag By: Rik van Riel on Tue, 18 September 2012 14:15
		Re: [PATCH v3 05/13] Add a __GFP_KMEMCG flag By: Christoph Lameter on Tue, 18 September 2012 15:06
		Re: [PATCH v3 05/13] Add a __GFP_KMEMCG flag By: Glauber Costa on Wed, 19 September 2012 07:39
		Re: [PATCH v3 05/13] Add a __GFP_KMEMCG flag By: Christoph Lameter on Wed, 19 September 2012 14:07
		Re: [PATCH v3 05/13] Add a __GFP_KMEMCG flag By: Mel Gorman on Thu, 27 September 2012 13:34
		Re: [PATCH v3 05/13] Add a __GFP_KMEMCG flag By: Glauber Costa on Thu, 27 September 2012 13:41
		Re: [PATCH v3 05/13] Add a __GFP_KMEMCG flag By: Johannes Weiner on Mon, 01 October 2012 19:09
		[PATCH v3 13/13] protect architectures where THREAD_SIZE >= PAGE_SIZE against fork bombs By: Glauber Costa on Tue, 18 September 2012 14:04
		Re: [PATCH v3 13/13] protect architectures where THREAD_SIZE >= PAGE_SIZE against fork bombs By: Michal Hocko on Mon, 01 October 2012 13:17

Previous Topic:	[RFC PATCH 0/5] net: socket bind to file descriptor introduced
Next Topic:	[RFC] Posix timers improvements, requied for CRIU project

Goto Forum:

-=] Back to Top [=-

[ Syndicate this forum (XML) ] [

] [

]

Current Time: Sat Oct 11 01:04:34 GMT 2025

Total time taken to generate the page: 0.10429 seconds