OpenVZ Forum


Re: [PATCH] task containersv11 add tasks file interface fix for cpusets [message #21649 is a reply to message #21604] Fri, 12 October 2007 15:13
David Rientjes
On Thu, 11 Oct 2007, Paul Jackson wrote:

> Hmmm ... I hadn't noticed that sched_hotcpu_mutex before.
> 
> I wonder what it is guarding?  As best as I can guess, it seems, at
> least in part, to be keeping the following two items consistent:
>  1) cpu_online_map

Yes, it protects against cpu hot-plug or hot-unplug; cpu_online_map is 
guaranteed to be unchanged while the mutex is being held.

>  2) the per-task cpus_allowed masks
> 

It doesn't need to protect the per-task cpus_allowed per se; that's 
already protected.  If a task's cpu affinity changes during a call to 
set_cpus_allowed(), the migration thread will notice the change when it 
tries to deactivate the task and activate it on the destination cpu.  The 
migration then becomes a no-op.

That's a consequence of the fact that we can't migrate current and need a 
kthread, specifically the source cpu's runqueue migration thread, to do it 
when it's scheduled.  Such a migration request includes a completion 
variable so that set_cpus_allowed() waits until the task has either been 
migrated or had its cpu affinity changed again.
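
To make the no-op case concrete, here is a minimal userspace sketch, not 
the scheduler code itself.  Every name in it (model_task, 
model_migration_req, model_handle_request) is hypothetical, the masks are 
plain 64-bit integers, and a bool stands in for the completion variable.  
It only shows the idea that a stale request is dropped while the waiter is 
still released.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical userspace model; these are not the kernel's types. */
struct model_task {
	uint64_t cpus_allowed;	/* bit n set => task may run on cpu n */
	int cpu;		/* cpu the task is currently queued on */
};

struct model_migration_req {
	struct model_task *task;
	int src_cpu;		/* cpu the request was queued on */
	int dest_cpu;		/* cpu chosen when the request was made */
	bool done;		/* stands in for the completion variable */
};

/*
 * What the (modelled) migration thread does with one request.
 * Returns true if the task moved, false if the request had gone stale
 * because the task's cpu or affinity changed in the meantime.
 */
static bool model_handle_request(struct model_migration_req *req)
{
	struct model_task *t = req->task;
	bool moved = false;

	assert(req->dest_cpu >= 0 && req->dest_cpu < 64);

	if (t->cpu == req->src_cpu &&
	    (t->cpus_allowed & (UINT64_C(1) << req->dest_cpu))) {
		t->cpu = req->dest_cpu;		/* the actual "migration" */
		moved = true;
	}

	req->done = true;	/* release the waiter either way */
	return moved;
}
```

A request whose destination is no longer in cpus_allowed, or whose task 
has already left src_cpu, falls through without touching the task, and 
the caller still sees done set.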

> That is, it seems to ensure that a task is allowed to run on some
> online CPU.
> 

Right, the destination cpu will not be hot-unplugged out from underneath 
the task during migration.

> If that's approximately true, then shouldn't I take sched_hotcpu_mutex
> around the entire chunk of code that handles updating a cpusets 'cpus',
> from the time it verifies that the requested CPUs are online, until the
> time that every affected task has its cpus_allowed updated?
> 

Not necessarily; you can iterate through a list of tasks and change their 
cpu affinity (represented by task->cpus_allowed) by migrating them away 
while task->cpuset->cpus_allowed remains unchanged.  The hotcpu notifier 
cpuset_handle_cpuhp() will update that when necessary for cpu hot-plug or 
hot-unplug events.

So it's entirely possible that a cpu will be downed during your iteration 
of tasks, but that's fine as long as it isn't downed during the migration 
itself.  The cpuset's cpus_allowed will be updated by the hotcpu notifier, 
and sched_hotcpu_mutex will protect against cpus being unplugged around 
the set_cpus_allowed() call, which checks for intersection between your 
new cpumask and cpu_online_map.
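
The per-task locking pattern described above can be sketched in userspace 
roughly as follows.  This is an illustration under stated assumptions, not 
kernel code: a pthread mutex stands in for sched_hotcpu_mutex, a 64-bit 
integer stands in for cpu_online_map, and all sketch_* names are 
hypothetical.

```c
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-ins for sched_hotcpu_mutex and cpu_online_map (assumptions). */
static pthread_mutex_t sketch_hotcpu_mutex = PTHREAD_MUTEX_INITIALIZER;
static uint64_t sketch_online_map = 0xf;	/* cpus 0-3 online */

struct sketch_task {
	uint64_t cpus_allowed;
};

/* Caller must hold sketch_hotcpu_mutex so the online map cannot change. */
static int sketch_set_cpus_allowed(struct sketch_task *t, uint64_t new_mask)
{
	if ((new_mask & sketch_online_map) == 0)
		return -EINVAL;		/* no online cpu in the new mask */
	t->cpus_allowed = new_mask;
	return 0;
}

/* Models a hot-unplug event; it takes the same mutex. */
static void sketch_cpu_down(int cpu)
{
	pthread_mutex_lock(&sketch_hotcpu_mutex);
	sketch_online_map &= ~(UINT64_C(1) << cpu);
	pthread_mutex_unlock(&sketch_hotcpu_mutex);
}

/*
 * Take and drop the mutex around each call rather than around the whole
 * loop.  Returns how many tasks were updated.
 */
static size_t sketch_update_tasks(struct sketch_task *tasks, size_t n,
				  uint64_t new_mask)
{
	size_t updated = 0;

	for (size_t i = 0; i < n; i++) {
		pthread_mutex_lock(&sketch_hotcpu_mutex);
		if (sketch_set_cpus_allowed(&tasks[i], new_mask) == 0)
			updated++;
		pthread_mutex_unlock(&sketch_hotcpu_mutex);
		/* A cpu may go down here, between tasks; that is fine. */
	}
	return updated;
}
```

The online map can change between iterations, but never between the 
intersection check and the update for any single task.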

> Furthermore, I should probably guard changes to and verifications
> against the top_cpuset's cpus_allowed with this mutex as well, as it is
> supposed to be a copy of cpu_online_map.
> 

The hotcpu notifier protects you there as well.  
common_cpu_mem_hotplug_unplug() explicitly sets them.

> And since all descendent cpusets have to have 'cpus' masks that are
> subsets of their parents, this means guarding other chunks of cpuset
> code that depend on the consistency of various per-cpuset cpus_allowed
> masks and cpu_online_map.
> 

Same as above, except now you're using 
guarantee_online_cpus_mems_in_subtree().

> My current intuition is that this expanded use of sched_hotcpu_mutex in
> the cpuset code involving various cpus_allowed masks would be a good
> thing.
> 
> In sum, perhaps sched_hotcpu_mutex is guarding the dispersed kernel
> state that depends on what CPUs are online.  This includes the per-task
> and per-cpuset cpus_allowed masks, all of which are supposed to be some
> non-empty subset of the online CPUs.
> 

It guards cpu_online_map from being changed while it's held.

> Taking and dropping the sched_hotcpu_mutex for each task, just around
> the call to set_cpus_allowed(), as you suggested above, doesn't seem to
> accomplish much that I can see, and certainly doesn't seem to guard the
> consistency of cpu_online_map with the tasks cpus_allowed masks.
> 

It's needed to serialize with other migrations such as sched_setaffinity() 
and you can use it since all migrations will inherently need this type of 
protection.  It makes the new cpumask consistent with cpu_online_map only 
insofar as the two must intersect; otherwise, set_cpus_allowed() will fail.  
The particular destination cpu is chosen as any online cpu in the new mask, 
which we know won't be downed because we're holding sched_hotcpu_mutex.
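
As a rough illustration of that destination choice, a userspace sketch 
might look like the following.  The function name pick_dest_cpu, the 
64-bit masks, and the -1 return for "no candidate" are all assumptions 
made for this example, not the kernel's interface.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper: return the lowest-numbered cpu that is both in
 * new_mask and online, or -1 if the two masks do not intersect.
 * The answer is only meaningful while the caller prevents the online
 * map from changing, which is the role of sched_hotcpu_mutex above.
 */
static int pick_dest_cpu(uint64_t new_mask, uint64_t online_map)
{
	uint64_t candidates = new_mask & online_map;

	for (int cpu = 0; cpu < 64; cpu++) {
		if (candidates & (UINT64_C(1) << cpu))
			return cpu;
	}
	return -1;	/* empty intersection: the caller should fail */
}
```

With cpus 0-3 online, a new mask of 0x6 yields cpu 1; if cpu 1 has gone 
offline before the mutex was taken, the same mask yields cpu 2 instead.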

		David