summaryrefslogtreecommitdiff
AgeCommit message (Collapse)AuthorFilesLines
2014-04-01cpufreq: Fix serialization of frequency transitionsViresh Kumar2-4/+5
Commit 7c30ed ("cpufreq: make sure frequency transitions are serialized") interacts poorly with systems that have a single core freqency for all cores. On such systems we have a single policy for all cores with several CPUs. When we do a frequency transition the governor calls the pre and post change notifiers which causes cpufreq_notify_transition() per CPU. Since the policy is the same for all of them all CPUs after the first and the warnings added are generated by checking a per-policy flag the warnings will be triggered for all cores after the first. Fix this by allowing notifier to be called for n times. Where n is the number of cpus in policy->cpus. Change-Id: I5712dde7f992644f9c3ddc8313151f80bea0d877 Reported-and-tested-by: Mark Brown <broonie@linaro.org> Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2014-04-01cpufreq: make sure frequency transitions are serializedViresh Kumar2-0/+15
Whenever we are changing frequency of a cpu, we are calling PRECHANGE and POSTCHANGE notifiers. They must be serialized. i.e. PRECHANGE or POSTCHANGE shouldn't be called twice contiguously. This can happen due to bugs in users of __cpufreq_driver_target() or actual cpufreq drivers who are sending these notifiers. This patch adds some protection against this. Now, we keep track of the last transaction and see if something went wrong. Change-Id: I0f5465bd515c431ae2d3711d065f70aacec7e978 Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2014-04-01cpufreq: remove unnecessary cpufreq_cpu_{get|put}() callsViresh Kumar1-17/+2
struct cpufreq_policy is already passed as argument to some routines like: __cpufreq_driver_getavg() and so we don't really need to do cpufreq_cpu_get() before and cpufreq_cpu_put() in them to get a policy structure. Remove them. Change-Id: I6a9ff8ed483a4f4faacc2ea047d93354dccdb0b6 Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2014-04-01cpufreq: Fix governor start/stop race conditionXiaoguang Chen2-0/+25
Cpufreq governors' stop and start operations should be carried out in sequence. Otherwise, there will be unexpected behavior, like in the example below. Suppose there are 4 CPUs and policy->cpu=CPU0, CPU1/2/3 are linked to CPU0. The normal sequence is: 1) Current governor is userspace. An application tries to set the governor to ondemand. It will call __cpufreq_set_policy() in which it will stop the userspace governor and then start the ondemand governor. 2) Current governor is userspace. The online of CPU3 runs on CPU0. It will call cpufreq_add_policy_cpu() in which it will first stop the userspace governor, and then start it again. If the sequence of the above two cases interleaves, it becomes: 1) Application stops userspace governor 2) Hotplug stops userspace governor which is a problem, because the governor shouldn't be stopped twice in a row. What happens next is: 3) Application starts ondemand governor 4) Hotplug starts a governor In step 4, the hotplug is supposed to start the userspace governor, but now the governor has been changed by the application to ondemand, so the ondemand governor is started once again, which is incorrect. The solution is to prevent policy governors from being stopped multiple times in a row. A governor should only be stopped once for one policy. After it has been stopped, no more governor stop operations should be executed. Also add a mutex to serialize governor operations. Change-Id: Ie380dc7c551f2721b81ceb8e4849efa09345ce4b [rjw: Changelog. And you owe me a beverage of my choice.] Signed-off-by: Xiaoguang Chen <chenxg@marvell.com> Acked-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2014-03-26ASoC: max98090: Add of_match_tableChenZhen1-0/+7
It's necessary to add of_match_table to match driver. Change-Id: I7c09aef9a3f54180041398009b9141142de54ea4 Signed-off-by: ChenZhen <zhen1.chen@samsung.com>
2014-03-24ARM: dts: odroidx2: add i2c1 node for max98090ChenZhen1-0/+12
Add i2c1 node for the codec driver max98090. Change-Id: Ib6afaf7574827540281959a1f8338d50e221df39 Signed-off-by: ChenZhen <zhen1.chen@samsung.com>
2014-03-25packaging: specify ExclusiveArch to arm/aarch64submit/tizen_boot/20140325.114317Chanho Park1-0/+1
This patch specifies "ExclusiveArch" for building only arm and aarch64. Change-Id: I33a484b478d7848257a4ea8b4375b0ea1994c47e Signed-off-by: Chanho Park <chanho61.park@samsung.com>
2014-03-24packaging: change modules directory to /boot/lib/modulessubmit/tizen_boot/20140324.052346Chanho Park1-10/+10
This patch changes the default modules directory from /lib/modules to /boot/lib/modules. The mobile kernel didn't use /lib/modules for modules because it should be matched with kernel version and can be recoveried with the /boot directory. The old kernel used modules.img which includes kernel modules. And we also loop-mounted it to the /lib/modules directory. Instead of it, we'll use /boot/lib/ modules directory because it will have same functionality if the /boot directory will be read-only. And we will add the recovery partition of the /boot. Change-Id: Ie0f0af47f0f6d3fe25c780fb8685df745b587dd7 Signed-off-by: Chanho Park <chanho61.park@samsung.com>
2014-03-24packaging: update kernel version to 3.10.33Chanho Park1-1/+1
Recently, we updated the kernel version from 3.10.19 to 3.10.33. This patch also updates the kernel version in the spec file. Change-Id: Ifc5239ea428e2178aa70ddd3b94364fdbb7ebe79 Signed-off-by: Chanho Park <chanho61.park@samsung.com>
2014-03-21vmpressure: make sure there are no events queued after memcg is offlinedMichal Hocko3-0/+18
vmpressure is called synchronously from reclaim where the target_memcg is guaranteed to be alive but the eventfd is signaled from the work queue context. This means that memcg (along with vmpressure structure which is embedded into it) might go away while the work item is pending which would result in use-after-release bug. We have two possible ways how to fix this. Either vmpressure pins memcg before it schedules vmpr->work and unpin it in vmpressure_work_fn or explicitely flush the work item from the css_offline context (as suggested by Tejun). This patch implements the later one and it introduces vmpressure_cleanup which flushes the vmpressure work queue item item. It hooks into mem_cgroup_css_offline after the memcg itself is cleaned up. [akpm@linux-foundation.org: coding-style fixes] Signed-off-by: Michal Hocko <mhocko@suse.cz> Reported-by: Tejun Heo <tj@kernel.org> Cc: Anton Vorontsov <anton.vorontsov@linaro.org> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com> Cc: Li Zefan <lizefan@huawei.com> Acked-by: Tejun Heo <tj@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Change-Id: I1deefca16b6e243f86bd78b84c561db02e7a20e8
2014-03-21vmpressure: do not check for pending work to prevent from new workMichal Hocko1-1/+1
because it is racy and it doesn't give us much anyway as schedule_work handles this case already. Change-Id: I9946652da98eef2ed0312a5470e69db13fab0e4c Signed-off-by: Michal Hocko <mhocko@suse.cz> Reported-by: Tejun Heo <tj@kernel.org> Cc: Anton Vorontsov <anton.vorontsov@linaro.org> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com> Cc: Li Zefan <lizefan@huawei.com> Acked-by: Tejun Heo <tj@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-03-21vmpressure: change vmpressure::sr_lock to spinlockMichal Hocko2-6/+6
There is nothing that can sleep inside critical sections protected by this lock and those sections are really small so there doesn't make much sense to use mutex for them. Change the log to a spinlock Change-Id: I54c8361a88ec810676cf631f3754c5b860d54b01 Signed-off-by: Michal Hocko <mhocko@suse.cz> Reported-by: Tejun Heo <tj@kernel.org> Cc: Anton Vorontsov <anton.vorontsov@linaro.org> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com> Cc: Li Zefan <lizefan@huawei.com> Reviewed-by: Tejun Heo <tj@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-03-21tizen: enable zswap/zsmalloc/zbud to compress swap memoryChanho Park1-5/+9
Change-Id: Iccf427f0acecad261b7cd8baa7114ecf9a421914 Signed-off-by: Chanho Park <chanho61.park@samsung.com>
2014-03-20modem: sipc4: Chagne the manner of recieving data for FMT,RFS type deviceJonghwa Lee1-1/+25
When packet arrives, link device call iodev's helper function to recieve packets. The way of recieving data of IPC_FMT and IPC_RFS type iodevs differs from IPC_RAW and IPC_MULTI_RAW. This patch adds specified method of recieving data for FMT, RFS typed. This modification references TIZEN 2.2 kernel. Change-Id: I01efa7678bbabfbd1011ceba42571fc221313c4d Signed-off-by: Jonghwa Lee <jonghwa3.lee@samsung.com>
2014-03-20drm/exynos: remove DRIVER_HAVE_IRQ featureJoonyoung Shim3-22/+1
Exynos drm driver cannot support DRIVER_HAVE_IRQ feature because it uses driver specific one instead of routine of drm framework to install/uninstall irq handler. Change-Id: I5796d7113cbc4283cbb41591384aaa69011818d4 Signed-off-by: Joonyoung Shim <jy0922.shim@samsung.com>
2014-03-20rbtree: fix rbtree_postorder_for_each_entry_safe() iteratorJan Kara1-7/+9
The iterator rbtree_postorder_for_each_entry_safe() relies on pointer underflow behavior when testing for loop termination. In particular it expects that &rb_entry(NULL, type, field)->field is NULL. But the result of this expression is not defined by a C standard and some gcc versions (e.g. 4.3.4) assume the above expression can never be equal to NULL. The net result is an oops because the iteration is not properly terminated. Fix the problem by modifying the iterator to avoid pointer underflows. Change-Id: I06d5983b5335412be6cb6ebd95db3c682e26ed38 Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: Cody P Schafer <cody@linux.vnet.ibm.com> Cc: Michel Lespinasse <walken@google.com> Cc: "David S. Miller" <davem@davemloft.net> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Artem Bityutskiy <dedekind1@gmail.com> Cc: David Woodhouse <dwmw2@infradead.org> Cc: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu> Cc: Pablo Neira Ayuso <pablo@netfilter.org> Cc: Patrick McHardy <kaber@trash.net> Cc: Paul Mundt <lethal@linux-sh.org> Cc: Theodore Ts'o <tytso@mit.edu> Cc: <stable@vger.kernel.org> [3.12.x] Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-03-20rbtree: add rbtree_postorder_for_each_entry_safe() helperCody P Schafer1-0/+18
Because deletion (of the entire tree) is a relatively common use of the rbtree_postorder iteration, and because doing it safely means fiddling with temporary storage, provide a helper to simplify postorder rbtree iteration. Change-Id: I8442bc3efc79dca08bfbc6ebb63607cf4e83bcf6 Signed-off-by: Cody P Schafer <cody@linux.vnet.ibm.com> Reviewed-by: Seth Jennings <sjenning@linux.vnet.ibm.com> Cc: David Woodhouse <David.Woodhouse@intel.com> Cc: Rik van Riel <riel@redhat.com> Cc: Michel Lespinasse <walken@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-03-20rbtree: add postorder iteration functionsCody P Schafer2-0/+44
Postorder iteration yields all of a node's children prior to yielding the node itself, and this particular implementation also avoids examining the leaf links in a node after that node has been yielded. In what I expect will be its most common usage, postorder iteration allows the deletion of every node in an rbtree without modifying the rbtree nodes (no _requirement_ that they be nulled) while avoiding referencing child nodes after they have been "deleted" (most commonly, freed). I have only updated zswap to use this functionality at this point, but numerous bits of code (most notably in the filesystem drivers) use a hand rolled postorder iteration that NULLs child links as it traverses the tree. Each of those instances could be replaced with this common implementation. 1 & 2 add rbtree postorder iteration functions. 3 adds testing of the iteration to the rbtree runtime tests 4 allows building the rbtree runtime tests as builtins 5 updates zswap. This patch: Add postorder iteration functions for rbtree. These are useful for safely freeing an entire rbtree without modifying the tree at all. Change-Id: Ibc97f0e13288030501b5e84defc6603eeb1adca6 Signed-off-by: Cody P Schafer <cody@linux.vnet.ibm.com> Reviewed-by: Seth Jennings <sjenning@linux.vnet.ibm.com> Cc: David Woodhouse <David.Woodhouse@intel.com> Cc: Rik van Riel <riel@redhat.com> Cc: Michel Lespinasse <walken@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-03-20mm/zswap.c: change params from hidden to roDan Streetman1-2/+2
The "compressor" and "enabled" params are currently hidden, this changes them to read-only, so userspace can tell if zswap is enabled or not and see what compressor is in use. Change-Id: I42d8ac2544ccbf981de26d98b772417e183360f6 Signed-off-by: Dan Streetman <ddstreet@ieee.org> Cc: Vladimir Murzin <murzin.v@gmail.com> Cc: Bob Liu <bob.liu@oracle.com> Cc: Minchan Kim <minchan@kernel.org> Cc: Weijie Yang <weijie.yang@samsung.com> Acked-by: Seth Jennings <sjennings@variantweb.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-03-20mm/zswap: refactor the get/put routinesWeijie Yang1-94/+88
The refcount routine was not fit the kernel get/put semantic exactly, There were too many judgement statements on refcount and it could be minus. This patch does the following: - move refcount judgement to zswap_entry_put() to hide resource free function. - add a new function zswap_entry_find_get(), so that callers can use easily in the following pattern: zswap_entry_find_get .../* do something */ zswap_entry_put - to eliminate compile error, move some functions declaration This patch is based on Minchan Kim <minchan@kernel.org> 's idea and suggestion. Change-Id: I8510ffe4f49a1a5f00b53be89b2ee33854464db8 Signed-off-by: Weijie Yang <weijie.yang@samsung.com> Cc: Seth Jennings <sjennings@variantweb.net> Acked-by: Minchan Kim <minchan@kernel.org> Cc: Bob Liu <bob.liu@oracle.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-03-20mm/zswap: bugfix: memory leak when invalidate and reclaim occur concurrentlyWeijie Yang1-8/+14
Consider the following scenario: thread 0: reclaim entry x (get refcount, but not call zswap_get_swap_cache_page) thread 1: call zswap_frontswap_invalidate_page to invalidate entry x. finished, entry x and its zbud is not freed as its refcount != 0 now, the swap_map[x] = 0 thread 0: now call zswap_get_swap_cache_page swapcache_prepare return -ENOENT because entry x is not used any more zswap_get_swap_cache_page return ZSWAP_SWAPCACHE_NOMEM zswap_writeback_entry do nothing except put refcount Now, the memory of zswap_entry x and its zpage leak. Modify: - check the refcount in fail path, free memory if it is not referenced. - use ZSWAP_SWAPCACHE_FAIL instead of ZSWAP_SWAPCACHE_NOMEM as the fail path can be not only caused by nomem but also by invalidate. Change-Id: I3d76f21a11f2d9ff0bec412c90b3895efce6478d Signed-off-by: Weijie Yang <weijie.yang@samsung.com> Reviewed-by: Bob Liu <bob.liu@oracle.com> Reviewed-by: Minchan Kim <minchan@kernel.org> Acked-by: Seth Jennings <sjenning@linux.vnet.ibm.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-03-20mm/zswap: avoid unnecessary page scanningWeijie Yang1-0/+3
Add SetPageReclaim() before __swap_writepage() so that page can be moved to the tail of the inactive list, which can avoid unnecessary page scanning as this page was reclaimed by swap subsystem before. Change-Id: If1ed52e3161c332d9f1f6fdd8851e97b5d3b4271 Signed-off-by: Weijie Yang <weijie.yang@samsung.com> Reviewed-by: Bob Liu <bob.liu@oracle.com> Reviewed-by: Minchan Kim <minchan@kernel.org> Acked-by: Seth Jennings <sjenning@linux.vnet.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-03-20mm/zswap: bugfix: memory leak when re-swaponWeijie Yang1-0/+4
zswap_tree is not freed when swapoff, and it got re-kmalloced in swapon, so a memory leak occurs. Free the memory of zswap_tree in zswap_frontswap_invalidate_area(). Signed-off-by: Weijie Yang <weijie.yang@samsung.com> Reviewed-by: Bob Liu <bob.liu@oracle.com> Cc: Minchan Kim <minchan@kernel.org> Reviewed-by: Minchan Kim <minchan@kernel.org> Cc: <stable@vger.kernel.org> From: Weijie Yang <weijie.yang@samsung.com> Subject: mm/zswap: bugfix: memory leak when invalidate and reclaim occur concurrently Consider the following scenario: thread 0: reclaim entry x (get refcount, but not call zswap_get_swap_cache_page) thread 1: call zswap_frontswap_invalidate_page to invalidate entry x. finished, entry x and its zbud is not freed as its refcount != 0 now, the swap_map[x] = 0 thread 0: now call zswap_get_swap_cache_page swapcache_prepare return -ENOENT because entry x is not used any more zswap_get_swap_cache_page return ZSWAP_SWAPCACHE_NOMEM zswap_writeback_entry do nothing except put refcount Now, the memory of zswap_entry x and its zpage leak. Modify: - check the refcount in fail path, free memory if it is not referenced. - use ZSWAP_SWAPCACHE_FAIL instead of ZSWAP_SWAPCACHE_NOMEM as the fail path can be not only caused by nomem but also by invalidate. [akpm@linux-foundation.org: coding-style fixes] Signed-off-by: Weijie Yang <weijie.yang@samsung.com> Reviewed-by: Bob Liu <bob.liu@oracle.com> Cc: Minchan Kim <minchan@kernel.org> Cc: <stable@vger.kernel.org> Acked-by: Seth Jennings <sjenning@linux.vnet.ibm.com> Change-Id: I4a875e48714d73bf2c1f75b60d90776365c047de Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-03-20mm/zswap: use postorder iteration when destroying rbtreeCody P Schafer1-14/+2
Change-Id: I83b93b7eaadb7c66981f1119eda1119c978d1b9c Signed-off-by: Cody P Schafer <cody@linux.vnet.ibm.com> Reviewed-by: Seth Jennings <sjenning@linux.vnet.ibm.com> Cc: David Woodhouse <David.Woodhouse@intel.com> Cc: Rik van Riel <riel@redhat.com> Cc: Michel Lespinasse <walken@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-03-20mm/zbud: fix some trivial typos in commentsJianguo Wu1-2/+2
Change-Id: I1acb8c1f4ff9ab8dbd698380a731daef51d028fc Signed-off-by: Jianguo Wu <wujianguo@huawei.com> Cc: Seth Jennings <sjenning@linux.vnet.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-03-20mm/zswap.c: get swapper address_space by using macroSunghan Suh1-1/+1
There is a proper macro to get the corresponding swapper address space from a swap entry. Instead of directly accessing "swapper_spaces" array, use the "swap_address_space" macro. Change-Id: I145f9a3fad914ff83853cd80c60af61f40eab1cf Signed-off-by: Sunghan Suh <sunghan.suh@samsung.com> Reviewed-by: Bob Liu <bob.liu@oracle.com> Reviewed-by: Wanpeng Li <liwanp@linux.vnet.ibm.com> Acked-by: Seth Jennings <sjenning@linux.vnet.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-03-20mm: zbud: fix condition check on allocation sizeHeesub Shin1-1/+1
zbud_alloc() incorrectly verifies the size of allocation limit. It should deny the allocation request greater than (PAGE_SIZE - ZHDR_SIZE_ALIGNED - CHUNK_SIZE), not (PAGE_SIZE - ZHDR_SIZE_ALIGNED) which has no remaining spaces for its buddy. There is no point in spending the entire zbud page storing only a single page, since we don't have any benefits. Change-Id: Ief305088b6983c01426300a0638520f51b17ad2a Signed-off-by: Heesub Shin <heesub.shin@samsung.com> Acked-by: Seth Jennings <sjenning@linux.vnet.ibm.com> Cc: Bob Liu <bob.liu@oracle.com> Cc: Dongjun Shin <d.j.shin@samsung.com> Cc: Sunae Seo <sunae.seo@samsung.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-03-20zram: remove zram->lock in read path and change it with mutexMinchan Kim2-12/+9
Finally, we separated zram->lock dependency from 32bit stat/ table handling so there is no reason to use rw_semaphore between read and write path so this patch removes the lock from read path totally and changes rw_semaphore with mutex. So, we could do old: read-read: OK read-write: NO write-write: NO Now: read-read: OK read-write: OK write-write: NO The below data proves mixed workload performs well 11 times and there is also enhance on write-write path because current rw-semaphore doesn't support SPIN_ON_OWNER. It's side effect but anyway good thing for us. Write-related tests perform better (from 61% to 1058%) but read path has good/bad(from -2.22% to 1.45%) but they are all marginal within stddev. CPU 12 iozone -t -T -l 12 -u 12 -r 16K -s 60M -I +Z -V 0 ==Initial write ==Initial write records: 10 records: 10 avg: 516189.16 avg: 839907.96 std: 22486.53 (4.36%) std: 47902.17 (5.70%) max: 546970.60 max: 909910.35 min: 481131.54 min: 751148.38 ==Rewrite ==Rewrite records: 10 records: 10 avg: 509527.98 avg: 1050156.37 std: 45799.94 (8.99%) std: 40695.44 (3.88%) max: 611574.27 max: 1111929.26 min: 443679.95 min: 980409.62 ==Read ==Read records: 10 records: 10 avg: 4408624.17 avg: 4472546.76 std: 281152.61 (6.38%) std: 163662.78 (3.66%) max: 4867888.66 max: 4727351.03 min: 4058347.69 min: 4126520.88 ==Re-read ==Re-read records: 10 records: 10 avg: 4462147.53 avg: 4363257.75 std: 283546.11 (6.35%) std: 247292.63 (5.67%) max: 4912894.44 max: 4677241.75 min: 4131386.50 min: 4035235.84 ==Reverse Read ==Reverse Read records: 10 records: 10 avg: 4565865.97 avg: 4485818.08 std: 313395.63 (6.86%) std: 248470.10 (5.54%) max: 5232749.16 max: 4789749.94 min: 4185809.62 min: 3963081.34 ==Stride read ==Stride read records: 10 records: 10 avg: 4515981.80 avg: 4418806.01 std: 211192.32 (4.68%) std: 212837.97 (4.82%) max: 4889287.28 max: 4686967.22 min: 4210362.00 min: 4083041.84 ==Random read ==Random read records: 10 records: 10 avg: 4410525.23 avg: 4387093.18 std: 236693.22 (5.37%) std: 235285.23 (5.36%) max: 4713698.47 max: 4669760.62 min: 4057163.62 min: 3952002.16 ==Mixed workload ==Mixed workload records: 10 records: 10 avg: 243234.25 avg: 2818677.27 std: 28505.07 (11.72%) std: 195569.70 (6.94%) max: 288905.23 max: 3126478.11 min: 212473.16 min: 2484150.69 ==Random write ==Random write records: 10 records: 10 avg: 555887.07 avg: 1053057.79 std: 70841.98 (12.74%) std: 35195.36 (3.34%) max: 683188.28 max: 1096125.73 min: 437299.57 min: 992481.93 ==Pwrite ==Pwrite records: 10 records: 10 avg: 501745.93 avg: 810363.09 std: 16373.54 (3.26%) std: 19245.01 (2.37%) max: 518724.52 max: 833359.70 min: 464208.73 min: 765501.87 ==Pread ==Pread records: 10 records: 10 avg: 4539894.60 avg: 4457680.58 std: 197094.66 (4.34%) std: 188965.60 (4.24%) max: 4877170.38 max: 4689905.53 min: 4226326.03 min: 4095739.72 Change-Id: I7d2299149ce6982d76caaaadb936b7385cbee515 Signed-off-by: Minchan Kim <minchan@kernel.org> Cc: Nitin Gupta <ngupta@vflare.org> Tested-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com> Cc: Jerome Marchand <jmarchan@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-03-20zram: remove workqueue for freeing removed pending slotMinchan Kim2-58/+6
Commit a0c516cbfc74 ("zram: don't grab mutex in zram_slot_free_noity") introduced free request pending code to avoid scheduling by mutex under spinlock and it was a mess which made code lenghty and increased overhead. Now, we don't need zram->lock any more to free slot so this patch reverts it and then, tb_lock should protect it. Change-Id: I3429e568bab78c197da3fc5cbd5afb9355bf7d21 Signed-off-by: Minchan Kim <minchan@kernel.org> Cc: Nitin Gupta <ngupta@vflare.org> Tested-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com> Cc: Jerome Marchand <jmarchan@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-03-20zram: introduce zram->tb_lockMinchan Kim2-6/+23
Currently, the zram table is protected by zram->lock but it's rather coarse-grained lock and it makes hard for scalibility. Let's use own rwlock instead of depending on zram->lock. This patch adds new locking so obviously, it would make slow but this patch is just prepartion for removing coarse-grained rw_semaphore(ie, zram->lock) which is hurdle about zram scalability. Final patch in this patchset series will remove the lock from read-path and change rw_semaphore with mutex in write path. With bonus, we could drop pending slot free mess in next patch. Change-Id: If5456f871bc6b0d6ee1f8218fde3f5a13d261c8b Signed-off-by: Minchan Kim <minchan@kernel.org> Cc: Nitin Gupta <ngupta@vflare.org> Tested-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com> Cc: Jerome Marchand <jmarchan@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-03-20zram: use atomic operation for statMinchan Kim2-20/+16
Some of fields in zram->stats are protected by zram->lock which is rather coarse-grained so let's use atomic operation without explict locking. This patch is ready for removing dependency of zram->lock in read path which is very coarse-grained rw_semaphore. Of course, this patch adds new atomic operation so it might make slow but my 12CPU test couldn't spot any regression. All gain/lose is marginal within stddev. iozone -t -T -l 12 -u 12 -r 16K -s 60M -I +Z -V 0 ==Initial write ==Initial write records: 50 records: 50 avg: 412875.17 avg: 415638.23 std: 38543.12 (9.34%) std: 36601.11 (8.81%) max: 521262.03 max: 502976.72 min: 343263.13 min: 351389.12 ==Rewrite ==Rewrite records: 50 records: 50 avg: 416640.34 avg: 397914.33 std: 60798.92 (14.59%) std: 46150.42 (11.60%) max: 543057.07 max: 522669.17 min: 304071.67 min: 316588.77 ==Read ==Read records: 50 records: 50 avg: 4147338.63 avg: 4070736.51 std: 179333.25 (4.32%) std: 223499.89 (5.49%) max: 4459295.28 max: 4539514.44 min: 3753057.53 min: 3444686.31 ==Re-read ==Re-read records: 50 records: 50 avg: 4096706.71 avg: 4117218.57 std: 229735.04 (5.61%) std: 171676.25 (4.17%) max: 4430012.09 max: 4459263.94 min: 2987217.80 min: 3666904.28 ==Reverse Read ==Reverse Read records: 50 records: 50 avg: 4062763.83 avg: 4078508.32 std: 186208.46 (4.58%) std: 172684.34 (4.23%) max: 4401358.78 max: 4424757.22 min: 3381625.00 min: 3679359.94 ==Stride read ==Stride read records: 50 records: 50 avg: 4094933.49 avg: 4082170.22 std: 185710.52 (4.54%) std: 196346.68 (4.81%) max: 4478241.25 max: 4460060.97 min: 3732593.23 min: 3584125.78 ==Random read ==Random read records: 50 records: 50 avg: 4031070.04 avg: 4074847.49 std: 192065.51 (4.76%) std: 206911.33 (5.08%) max: 4356931.16 max: 4399442.56 min: 3481619.62 min: 3548372.44 ==Mixed workload ==Mixed workload records: 50 records: 50 avg: 149925.73 avg: 149675.54 std: 7701.26 (5.14%) std: 6902.09 (4.61%) max: 191301.56 max: 175162.05 min: 133566.28 min: 137762.87 ==Random write ==Random write records: 50 records: 50 avg: 404050.11 avg: 393021.47 std: 58887.57 (14.57%) std: 42813.70 (10.89%) max: 601798.09 max: 524533.43 min: 325176.99 min: 313255.34 ==Pwrite ==Pwrite records: 50 records: 50 avg: 411217.70 avg: 411237.96 std: 43114.99 (10.48%) std: 33136.29 (8.06%) max: 530766.79 max: 471899.76 min: 320786.84 min: 317906.94 ==Pread ==Pread records: 50 records: 50 avg: 4154908.65 avg: 4087121.92 std: 151272.08 (3.64%) std: 219505.04 (5.37%) max: 4459478.12 max: 4435857.38 min: 3730512.41 min: 3101101.67 Change-Id: Ib0d538597fbc4a2037b0464f8d62fb73fa0b0c24 Signed-off-by: Minchan Kim <minchan@kernel.org> Cc: Nitin Gupta <ngupta@vflare.org> Tested-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com> Cc: Jerome Marchand <jmarchan@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-03-20zram: remove unnecessary freeMinchan Kim1-8/+0
Commit a0c516cbfc74 ("zram: don't grab mutex in zram_slot_free_noity") introduced pending zram slot free in zram's write path in case of missing slot free by memory allocation failure in zram_slot_free_notify but it is not necessary because we have already freed the slot right before overwriting. Change-Id: I5048bce2ca8c377d9539f0397a04bddc5f5a5e92 Signed-off-by: Minchan Kim <minchan@kernel.org> Cc: Nitin Gupta <ngupta@vflare.org> Cc: Jerome Marchand <jmarchan@redhat.com> Tested-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-03-20zram: delay pending free request in read pathMinchan Kim1-1/+0
Sergey reported we don't need to handle pending free request every I/O so that this patch removes it in read path while we remain it in write path. Let's consider below example. Swap subsystem ask to zram "A" block free by swap_slot_free_notify but zram had been pended it without real freeing. Swap subsystem allocates "A" block for new data but request pended for a long time just handled and zram blindly free new data on the "A" block. :( That's why we couldn't remove handle pending free request right before zram-write. Change-Id: Ib4409bfad7b1ae263e2708c74875c322da72c7b3 Signed-off-by: Minchan Kim <minchan@kernel.org> Reported-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com> Tested-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com> Cc: Nitin Gupta <ngupta@vflare.org> Cc: Jerome Marchand <jmarchan@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-03-20zram: fix race between reset and flushing pending workMinchan Kim1-2/+2
Dan and Sergey reported that there is a racy between reset and flushing of pending work so that it could make oops by freeing zram->meta in reset while zram_slot_free can access zram->meta if new request is adding during the race window. This patch moves flush after taking init_lock so it prevents new request so that it closes the race. Change-Id: Ibc09001d1ad4a4ef852d661384259b53f0f9c19b Signed-off-by: Minchan Kim <minchan@kernel.org> Reported-by: Dan Carpenter <dan.carpenter@oracle.com> Cc: Nitin Gupta <ngupta@vflare.org> Cc: Jerome Marchand <jmarchan@redhat.com> Tested-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-03-20zsmalloc: add maintainersMinchan Kim1-0/+14
tAdd adds maintainer information for zsmalloc into the MAINTAINERS file. Signed-off-by: Minchan Kim <minchan@kernel.org> Cc: Nitin Gupta <ngupta@vflare.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Conflicts: MAINTAINERS Change-Id: I315effbbfbd5aabb96dcadd1d35d9592ee24182d
2014-03-20zram: add zram maintainersMinchan Kim1-0/+8
Add maintainer information for zram into the MAINTAINERS file. Change-Id: I8a6b11120f55b76aeccf18ce004293721e48081a Signed-off-by: Minchan Kim <minchan@kernel.org> Cc: Nitin Gupta <ngupta@vflare.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-03-20zsmalloc: add copyrightMinchan Kim2-0/+2
Add my copyright to the zsmalloc source code which I maintain. Change-Id: Ic3dd8dd11297ef902f4cb913e40c52249282d947 Signed-off-by: Minchan Kim <minchan@kernel.org> Cc: Nitin Gupta <ngupta@vflare.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-03-20zram: add copyrightMinchan Kim2-0/+2
Add my copyright to the zram source code which I maintain. Change-Id: I8816064aa958c9304c53fae0972e011060cc2bcc Signed-off-by: Minchan Kim <minchan@kernel.org> Cc: Nitin Gupta <ngupta@vflare.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-03-20zram: remove old private project commentMinchan Kim4-9/+0
Remove the old private compcache project address so upcoming patches should be sent to LKML because we Linux kernel community will take care. Change-Id: Ia5bf208791c8fa6e96161fd9fb842d6829f14698 Signed-off-by: Minchan Kim <minchan@kernel.org> Cc: Nitin Gupta <ngupta@vflare.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-03-20zram: promote zram from stagingMinchan Kim9-3/+3
Zram has lived in staging for a LONG LONG time and have been fixed/improved by many contributors so code is clean and stable now. Of course, there are lots of product using zram in real practice. The major TV companys have used zram as swap since two years ago and recently our production team released android smart phone with zram which is used as swap, too and recently Android Kitkat start to use zram for small memory smart phone. And there was a report Google released their ChromeOS with zram, too and cyanogenmod have been used zram long time ago. And I heard some disto have used zram block device for tmpfs. In addition, I saw many report from many other peoples. For example, Lubuntu start to use it. The benefit of zram is very clear. With my experience, one of the benefit was to remove jitter of video application with backgroud memory pressure. It would be effect of efficient memory usage by compression but more issue is whether swap is there or not in the system. Recent mobile platforms have used JAVA so there are many anonymous pages. But embedded system normally are reluctant to use eMMC or SDCard as swap because there is wear-leveling and latency issues so if we do not use swap, it means we can't reclaim anoymous pages and at last, we could encounter OOM kill. :( Although we have real storage as swap, it was a problem, too. Because it sometime ends up making system very unresponsible caused by slow swap storage performance. Quote from Luigi on Google "Since Chrome OS was mentioned: the main reason why we don't use swap to a disk (rotating or SSD) is because it doesn't degrade gracefully and leads to a bad interactive experience. Generally we prefer to manage RAM at a higher level, by transparently killing and restarting processes. But we noticed that zram is fast enough to be competitive with the latter, and it lets us make more efficient use of the available RAM. " and he announced. http://www.spinics.net/lists/linux-mm/msg57717.html Other uses case is to use zram for block device. Zram is block device so anyone can format the block device and mount on it so some guys on the internet start zram as /var/tmp. http://forums.gentoo.org/viewtopic-t-838198-start-0.html Let's promote zram and enhance/maintain it instead of removing. Signed-off-by: Minchan Kim <minchan@kernel.org> Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Acked-by: Nitin Gupta <ngupta@vflare.org> Acked-by: Pekka Enberg <penberg@kernel.org> Cc: Bob Liu <bob.liu@oracle.com> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Hugh Dickins <hughd@google.com> Cc: Jens Axboe <axboe@kernel.dk> Cc: Luigi Semenzato <semenzato@google.com> Cc: Mel Gorman <mgorman@suse.de> Cc: Rik van Riel <riel@redhat.com> Cc: Seth Jennings <sjenning@linux.vnet.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Conflicts: drivers/block/Makefile Change-Id: I368f76a5368fffaacbf349cfd78f79cba5da0a0d
2014-03-20zsmalloc: move it under mmMinchan Kim9-34/+28
This patch moves zsmalloc under mm directory. Before that, description will explain why we have needed custom allocator. Zsmalloc is a new slab-based memory allocator for storing compressed pages. It is designed for low fragmentation and high allocation success rate on large object, but <= PAGE_SIZE allocations. zsmalloc differs from the kernel slab allocator in two primary ways to achieve these design goals. zsmalloc never requires high order page allocations to back slabs, or "size classes" in zsmalloc terms. Instead it allows multiple single-order pages to be stitched together into a "zspage" which backs the slab. This allows for higher allocation success rate under memory pressure. Also, zsmalloc allows objects to span page boundaries within the zspage. This allows for lower fragmentation than could be had with the kernel slab allocator for objects between PAGE_SIZE/2 and PAGE_SIZE. With the kernel slab allocator, if a page compresses to 60% of it original size, the memory savings gained through compression is lost in fragmentation because another object of the same size can't be stored in the leftover space. This ability to span pages results in zsmalloc allocations not being directly addressable by the user. The user is given an non-dereferencable handle in response to an allocation request. That handle must be mapped, using zs_map_object(), which returns a pointer to the mapped region that can be used. The mapping is necessary since the object data may reside in two different noncontigious pages. The zsmalloc fulfills the allocation needs for zram perfectly [sjenning@linux.vnet.ibm.com: borrow Seth's quote] Signed-off-by: Minchan Kim <minchan@kernel.org> Acked-by: Nitin Gupta <ngupta@vflare.org> Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Cc: Bob Liu <bob.liu@oracle.com> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Hugh Dickins <hughd@google.com> Cc: Jens Axboe <axboe@kernel.dk> Cc: Luigi Semenzato <semenzato@google.com> Cc: Mel Gorman <mgorman@suse.de> Cc: Pekka Enberg <penberg@kernel.org> Cc: Rik van Riel <riel@redhat.com> Cc: Seth Jennings <sjenning@linux.vnet.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Conflicts: mm/Kconfig Change-Id: I57dad090a3c48db4a67c88e6fa20a4bdbb82d984
2014-03-20zsmalloc: add more commentNitin Cupta2-11/+64
This patch adds lots of comments and it will help others to review and enhance. Change-Id: I743596bf18e8acf1082c21437c2caef5f15aad71 Signed-off-by: Seth Jennings <sjenning@linux.vnet.ibm.com> Signed-off-by: Nitin Gupta <ngupta@vflare.org> Signed-off-by: Minchan Kim <minchan@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2014-03-20zsmalloc: add Kconfig for enabling page table methodMinchan Kim2-15/+17
Zsmalloc has two methods 1) copy-based and 2) pte based to access objects that span two pages. You can see history why we supported two approach from [1]. But it was bad choice that adding hard coding to select arch which want to use pte based method because there are lots of SoC in an architecure and they can have different cache size, CPU speed and so on so it would be better to expose it to user as selectable Kconfig option like Andrew Morton suggested. [1] https://lkml.org/lkml/2012/7/11/58 Change-Id: Ieedde9cfac0a7d9bbcb3d5d5b36318efd41132eb Acked-by: Nitin Gupta <ngupta@vflare.org> Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Signed-off-by: Minchan Kim <minchan@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2014-03-20Staging: zram: Fix memory leak by refcount mismatchRashika Kheria1-5/+14
As suggested by Minchan Kim and Jerome Marchand "The code in reset_store get the block device (bdget_disk()) but it does not put it (bdput()) when it's done using it. The usage count is therefore incremented but never decremented." This patch also puts bdput() for all error cases. Change-Id: I92198df5ff42242ef3627e5d3db4acece7940d61 Acked-by: Minchan Kim <minchan@kernel.org> Acked-by: Jerome Marchand <jmarchan@redhat.com> Cc: stable@vger.kernel.org Signed-off-by: Rashika Kheria <rashika.kheria@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2014-03-20Staging: zram: Fix access of NULL pointerRashika Kheria1-2/+4
This patch fixes the bug in reset_store caused by accessing NULL pointer. The bdev gets its value from bdget_disk() which could fail when memory pressure is severe and hence can return NULL because allocation of inode in bdget could fail. Hence, this patch introduces a check for bdev to prevent reference to a NULL pointer in the later part of the code. It also removes unnecessary check of bdev for fsync_bdev(). Change-Id: I47cdcc08076df7958a19406b8502627802c7bd07 Cc: stable <stable@vger.kernel.org> Acked-by: Jerome Marchand <jmarchan@redhat.com> Signed-off-by: Rashika Kheria <rashika.kheria@gmail.com> Acked-by: Minchan Kim <minchan@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2014-03-20Staging: zram: Fix variable dereferenced before checkRashika Kheria1-6/+3
This patch fixes the following Smatch warning in zram_drv.c- drivers/staging/zram/zram_drv.c:899 destroy_device() warn: variable dereferenced before check 'zram->disk' (see line 896) Change-Id: If920cb9e1328289c561de89e074523071cd772b5 Acked-by: Minchan Kim <minchan@kernel.org> Acked-by: Jerome Marchand <jmarchan@redhat.com> Signed-off-by: Rashika Kheria <rashika.kheria@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2014-03-20zsmalloc: Fix "map_vm_area" undefined reference errors.Majunath Goudar1-0/+1
This patch adds a MMU dependency to configure the ZSMALLOC in drivers/staging/zsmalloc/Kconfig. Without this patch, build system can lead to build failure. This was observed during randconfig testing, in which ZSMALLOC was enabled w/o MMU being enabled. Following was the error: LD vmlinux drivers/built-in.o: In function `__zs_map_object': drivers/staging/zsmalloc/zsmalloc-main.c:650: undefined reference to `map_vm_area' make: *** [vmlinux] Error 1 Change-Id: Ia78fb6b91949e85f3b9fee18eaffa18205aaccb9 Signed-off-by: Manjunath Goudar <csmanjuvijay@gmail.com> Cc: Nitin Gupta <ngupta@vflare.org> Cc: Seth Jennings <sjenning@linux.vnet.ibm.com> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Minchan Kim <minchan@kernel.org> Cc: Joerg Roedel <joro@8bytes.org> Cc: devel@driverdev.osuosl.org Cc: linux-kernel@vger.kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2014-03-20Revert "staging: zram: Add auto loading of module if user opens /dev/zram."Greg Kroah-Hartman1-1/+0
This reverts commit c70bda992c12e593e411c02a52e4bd6985407539. It's incorrect, Kay writes: Please just remove it. "devname" is meant to be used for single-instance devices with a static dev_t, never for things like zramX. It will not do anything useful here, it does nothing really without a statically assigned dev_t, and it should not be used for devices of this kind anyway. Change-Id: Ia1503b3cff95e5dd31c934f420025ea252f09129 Reported-by: Tom Gundersen <teg@jklm.no> Reported-by: Kay Sievers <kay@vrfy.org> Cc: Minchan Kim <minchan@kernel.org> Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2014-03-20zram: don't grab mutex in zram_slot_free_noityMinchan Kim2-3/+67
[1] introduced down_write in zram_slot_free_notify to prevent race between zram_slot_free_notify and zram_bvec_[read|write]. The race could happen if somebody who has right permission to open swap device is reading swap device while it is used by swap in parallel. However, zram_slot_free_notify is called with holding spin_lock of swap layer so we shouldn't avoid holing mutex. Otherwise, lockdep warns it. This patch adds new list to handle free slot and workqueue so zram_slot_free_notify just registers slot index to be freed and registers the request to workqueue. If workqueue is expired, it holds mutex_lock so there is no problem any more. If any I/O is issued, zram handles pending slot-free request caused by zram_slot_free_notify right before handling issued request because workqueue wouldn't be expired yet so zram I/O request handling function can miss it. Lastly, when zram is reset, flush_work could handle all of pending free request so we shouldn't have memory leak. NOTE: If zram_slot_free_notify's kmalloc with GFP_ATOMIC would be failed, the slot will be freed when next write I/O write the slot. [1] [57ab0485, zram: use zram->lock to protect zram_free_page() in swap free notify path] * from v2 * refactoring * from v1 * totally redesign Change-Id: I7f478fa9b00215eb0a3e16433e607a2f73151f27 Cc: Nitin Gupta <ngupta@vflare.org> Cc: Jiang Liu <jiang.liu@huawei.com> Cc: stable@vger.kernel.org Signed-off-by: Minchan Kim <minchan@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2014-03-20zram: fix invalid memory accessMinchan Kim1-6/+9
[1] tried to fix invalid memory access on zram->disk but it didn't fix properly because get_disk failed during module exit path. Actually, we don't need to reset zram->disk's capacity to zero in module exit path so that this patch introduces new argument "reset_capacity" on zram_reset_divice and it only reset it when reset_store is called. [1] 6030ea9b, zram: avoid invalid memory access in zram_exit() Change-Id: Ibc554559bbd533bd986c6001ffb9cfb22f8c9f49 Cc: Nitin Gupta <ngupta@vflare.org> Cc: Jiang Liu <jiang.liu@huawei.com> Cc: stable@vger.kernel.org Signed-off-by: Minchan Kim <minchan@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>