Merge branch 'linus' into release

Conflicts:
	arch/x86/kernel/cpu/cpufreq/longhaul.c

Signed-off-by: Len Brown <len.brown@intel.com>
author: Len Brown <len.brown@intel.com> 2009-04-05 02:14:15 -0400
committer: Len Brown <len.brown@intel.com> 2009-04-05 02:14:15 -0400
commit: 478c6a43fcbc6c11609f8cee7c7b57223907754f (patch)
tree: a7f7952099da60d33032aed6de9c0c56c9f8779e /Documentation/cgroups
parent: 8a3f257c704e02aee9869decd069a806b45be3f1 (diff)
parent: 6bb597507f9839b13498781e481f5458aea33620 (diff)
6 files changed, 73 insertions, 19 deletions
diff --git a/Documentation/cgroups/00-INDEX b/Documentation/cgroups/00-INDEX
new file mode 100644
index 00000000000..3f58fa3d6d0
--- /dev/null
+++ b/Documentation/cgroups/00-INDEX
@@ -0,0 +1,18 @@
+00-INDEX
+	- this file
+cgroups.txt
+	- Control Groups definition, implementation details, examples and API.
+cpuacct.txt
+	- CPU Accounting Controller; account CPU usage for groups of tasks.
+cpusets.txt
+	- documents the cpusets feature; assign CPUs and Mem to a set of tasks.
+devices.txt
+	- Device Whitelist Controller; description, interface and security.
+freezer-subsystem.txt
+	- checkpointing; rationale to not use signals, interface.
+memcg_test.txt
+	- Memory Resource Controller; implementation details.
+memory.txt
+	- Memory Resource Controller; design, accounting, interface, testing.
+resource_counter.txt
+	- Resource Counter API.
diff --git a/Documentation/cgroups/cgroups.txt b/Documentation/cgroups/cgroups.txt
index 93feb844448..6eb1a97e88c 100644
--- a/Documentation/cgroups/cgroups.txt
+++ b/Documentation/cgroups/cgroups.txt
@@ -56,7 +56,7 @@ hierarchy, and a set of subsystems; each subsystem has system-specific
 state attached to each cgroup in the hierarchy.  Each hierarchy has
 an instance of the cgroup virtual filesystem associated with it.
 
-At any one time there may be multiple active hierachies of task
+At any one time there may be multiple active hierarchies of task
 cgroups. Each hierarchy is a partition of all tasks in the system.
 
 User level code may create and destroy cgroups by name in an
@@ -124,10 +124,10 @@ following lines:
                                / \
                        Prof (15%) students (5%)
 
-Browsers like firefox/lynx go into the WWW network class, while (k)nfsd go
+Browsers like Firefox/Lynx go into the WWW network class, while (k)nfsd go
 into NFS network class.
 
-At the same time firefox/lynx will share an appropriate CPU/Memory class
+At the same time Firefox/Lynx will share an appropriate CPU/Memory class
 depending on who launched it (prof/student).
 
 With the ability to classify tasks differently for different resources
@@ -325,7 +325,7 @@ and then start a subshell 'sh' in that cgroup:
 Creating, modifying, using the cgroups can be done through the cgroup
 virtual filesystem.
 
-To mount a cgroup hierarchy will all available subsystems, type:
+To mount a cgroup hierarchy with all available subsystems, type:
 # mount -t cgroup xxx /dev/cgroup
 
 The "xxx" is not interpreted by the cgroup code, but will appear in
@@ -333,12 +333,23 @@ The "xxx" is not interpreted by the cgroup code, but will appear in
 
 To mount a cgroup hierarchy with just the cpuset and numtasks
 subsystems, type:
-# mount -t cgroup -o cpuset,numtasks hier1 /dev/cgroup
+# mount -t cgroup -o cpuset,memory hier1 /dev/cgroup
 
 To change the set of subsystems bound to a mounted hierarchy, just
 remount with different options:
+# mount -o remount,cpuset,ns hier1 /dev/cgroup
 
-# mount -o remount,cpuset,ns  /dev/cgroup
+Now memory is removed from the hierarchy and ns is added.
+
+Note this will add ns to the hierarchy but won't remove memory or
+cpuset, because the new options are appended to the old ones:
+# mount -o remount,ns /dev/cgroup
+
+To specify a hierarchy's release_agent:
+# mount -t cgroup -o cpuset,release_agent="/sbin/cpuset_release_agent" \
+  xxx /dev/cgroup
+
+Note that specifying 'release_agent' more than once will return failure.
 
 Note that changing the set of subsystems is currently only supported
 when the hierarchy consists of a single (root) cgroup. Supporting
@@ -349,6 +360,11 @@ Then under /dev/cgroup you can find a tree that corresponds to the
 tree of the cgroups in the system. For instance, /dev/cgroup
 is the cgroup that holds the whole system.
 
+If you want to change the value of release_agent:
+# echo "/sbin/new_release_agent" > /dev/cgroup/release_agent
+
+It can also be changed via remount.
+
 If you want to create a new cgroup under /dev/cgroup:
 # cd /dev/cgroup
 # mkdir my_cgroup
@@ -476,11 +492,13 @@ cgroup->parent is still valid. (Note - can also be called for a
 newly-created cgroup if an error occurs after this subsystem's
 create() method has been called for the new cgroup).
 
-void pre_destroy(struct cgroup_subsys *ss, struct cgroup *cgrp);
+int pre_destroy(struct cgroup_subsys *ss, struct cgroup *cgrp);
 
 Called before checking the reference count on each subsystem. This may
 be useful for subsystems which have some extra references even if
-there are not tasks in the cgroup.
+there are no tasks in the cgroup. If pre_destroy() returns an error code,
+rmdir() will fail with it. Because of this, pre_destroy() can be
+called multiple times against a cgroup.
 
 int can_attach(struct cgroup_subsys *ss, struct cgroup *cgrp,
 	       struct task_struct *task)
@@ -521,7 +539,7 @@ always handled well.
 void post_clone(struct cgroup_subsys *ss, struct cgroup *cgrp)
 (cgroup_mutex held by caller)
 
-Called at the end of cgroup_clone() to do any paramater
+Called at the end of cgroup_clone() to do any parameter
 initialization which might be required before a task could attach.  For
 example in cpusets, no task may attach before 'cpus' and 'mems' are set
 up.
diff --git a/Documentation/cgroups/cpusets.txt b/Documentation/cgroups/cpusets.txt
index 0611e9528c7..f9ca389dddf 100644
--- a/Documentation/cgroups/cpusets.txt
+++ b/Documentation/cgroups/cpusets.txt
@@ -131,7 +131,7 @@ Cpusets extends these two mechanisms as follows:
  - The hierarchy of cpusets can be mounted at /dev/cpuset, for
    browsing and manipulation from user space.
  - A cpuset may be marked exclusive, which ensures that no other
-   cpuset (except direct ancestors and descendents) may contain
+   cpuset (except direct ancestors and descendants) may contain
    any overlapping CPUs or Memory Nodes.
  - You can list all the tasks (by pid) attached to any cpuset.
 
@@ -226,7 +226,7 @@ nodes with memory--using the cpuset_track_online_nodes() hook.
 --------------------------------
 
 If a cpuset is cpu or mem exclusive, no other cpuset, other than
-a direct ancestor or descendent, may share any of the same CPUs or
+a direct ancestor or descendant, may share any of the same CPUs or
 Memory Nodes.
 
 A cpuset that is mem_exclusive *or* mem_hardwall is "hardwalled",
@@ -427,7 +427,7 @@ child cpusets have this flag enabled.
 When doing this, you don't usually want to leave any unpinned tasks in
 the top cpuset that might use non-trivial amounts of CPU, as such tasks
 may be artificially constrained to some subset of CPUs, depending on
-the particulars of this flag setting in descendent cpusets.  Even if
+the particulars of this flag setting in descendant cpusets.  Even if
 such a task could use spare CPU cycles in some other CPUs, the kernel
 scheduler might not consider the possibility of load balancing that
 task to that underused CPU.
@@ -531,9 +531,9 @@ be idle.
 
 Of course it takes some searching cost to find movable tasks and/or
 idle CPUs, the scheduler might not search all CPUs in the domain
-everytime.  In fact, in some architectures, the searching ranges on
+every time.  In fact, in some architectures, the searching ranges on
 events are limited in the same socket or node where the CPU locates,
-while the load balance on tick searchs all.
+while the load balance on tick searches all.
 
 For example, assume CPU Z is relatively far from CPU X.  Even if CPU Z
 is idle while CPU X and the siblings are busy, scheduler can't migrate
@@ -601,7 +601,7 @@ its new cpuset, then the task will continue to use whatever subset
 of MPOL_BIND nodes are still allowed in the new cpuset.  If the task
 was using MPOL_BIND and now none of its MPOL_BIND nodes are allowed
 in the new cpuset, then the task will be essentially treated as if it
-was MPOL_BIND bound to the new cpuset (even though its numa placement,
+was MPOL_BIND bound to the new cpuset (even though its NUMA placement,
 as queried by get_mempolicy(), doesn't change).  If a task is moved
 from one cpuset to another, then the kernel will adjust the tasks
 memory placement, as above, the next time that the kernel attempts
diff --git a/Documentation/cgroups/devices.txt b/Documentation/cgroups/devices.txt
index 7cc6e6a6067..57ca4c89fe5 100644
--- a/Documentation/cgroups/devices.txt
+++ b/Documentation/cgroups/devices.txt
@@ -42,7 +42,7 @@ suffice, but we can decide the best way to adequately restrict
 movement as people get some experience with this.  We may just want
 to require CAP_SYS_ADMIN, which at least is a separate bit from
 CAP_MKNOD.  We may want to just refuse moving to a cgroup which
-isn't a descendent of the current one.  Or we may want to use
+isn't a descendant of the current one.  Or we may want to use
 CAP_MAC_ADMIN, since we really are trying to lock down root.
 
 CAP_SYS_ADMIN is needed to modify the whitelist or move another
diff --git a/Documentation/cgroups/memcg_test.txt b/Documentation/cgroups/memcg_test.txt
index 523a9c16c40..72db89ed060 100644
--- a/Documentation/cgroups/memcg_test.txt
+++ b/Documentation/cgroups/memcg_test.txt
@@ -1,5 +1,5 @@
 Memory Resource Controller(Memcg)  Implementation Memo.
-Last Updated: 2009/1/19
+Last Updated: 2009/1/20
 Base Kernel Version: based on 2.6.29-rc2.
 
 Because VM is getting complex (one of reasons is memcg...), memcg's behavior
@@ -356,7 +356,25 @@ Under below explanation, we assume CONFIG_MEM_RES_CTRL_SWAP=y.
 	(Shell-B)
 	# move all tasks in /cgroup/test to /cgroup
 	# /sbin/swapoff -a
-	# rmdir /test/cgroup
+	# rmdir /cgroup/test
 	# kill malloc task.
 
 	Of course, tmpfs v.s. swapoff test should be tested, too.
+
+ 9.8 OOM-Killer
+	Out-of-memory caused by memcg's limit will kill tasks under
+	the memcg. When hierarchy is used, a task under hierarchy
+	will be killed by the kernel.
+	In this case, panic_on_oom shouldn't be invoked and tasks
+	in other groups shouldn't be killed.
+
+	It's not difficult to cause OOM under memcg as follows.
+	Case A) when you can swapoff
+	#swapoff -a
+	#echo 50M > /memory.limit_in_bytes
+	run 51M of malloc
+
+	Case B) when you use mem+swap limitation.
+	#echo 50M > memory.limit_in_bytes
+	#echo 50M > memory.memsw.limit_in_bytes
+	run 51M of malloc
diff --git a/Documentation/cgroups/memory.txt b/Documentation/cgroups/memory.txt
index e1501964df1..a98a7fe7aab 100644
--- a/Documentation/cgroups/memory.txt
+++ b/Documentation/cgroups/memory.txt
@@ -302,7 +302,7 @@ will be charged as a new owner of it.
 	unevictable		- # of pages cannot be reclaimed.(mlocked etc)
 
 	Below is depend on CONFIG_DEBUG_VM.
-	inactive_ratio		- VM inernal parameter. (see mm/page_alloc.c)
+	inactive_ratio		- VM internal parameter. (see mm/page_alloc.c)
 	recent_rotated_anon	- VM internal parameter. (see mm/vmscan.c)
 	recent_rotated_file	- VM internal parameter. (see mm/vmscan.c)
 	recent_scanned_anon 	- VM internal parameter. (see mm/vmscan.c)