summaryrefslogtreecommitdiff
path: root/init
AgeCommit message (Collapse)AuthorFilesLines
2012-01-17Merge branch 'for-linus' of ↵Linus Torvalds1-1/+15
git://git.kernel.org/pub/scm/linux/kernel/git/viro/audit * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/audit: (29 commits) audit: no leading space in audit_log_d_path prefix audit: treat s_id as an untrusted string audit: fix signedness bug in audit_log_execve_info() audit: comparison on interprocess fields audit: implement all object interfield comparisons audit: allow interfield comparison between gid and ogid audit: complex interfield comparison helper audit: allow interfield comparison in audit rules Kernel: Audit Support For The ARM Platform audit: do not call audit_getname on error audit: only allow tasks to set their loginuid if it is -1 audit: remove task argument to audit_set_loginuid audit: allow audit matching on inode gid audit: allow matching on obj_uid audit: remove audit_finish_fork as it can't be called audit: reject entry,always rules audit: inline audit_free to simplify the look of generic code audit: drop audit_set_macxattr as it doesn't do anything audit: inline checks for not needing to collect aux records audit: drop some potentially inadvisable likely notations ... Use evil merge to fix up grammar mistakes in Kconfig file. Bad speling and horrible grammar (and copious swearing) is to be expected, but let's keep it to commit messages and comments, rather than expose it to users in config help texts or printouts.
2012-01-17Kernel: Audit Support For The ARM PlatformNathaniel Husted1-1/+1
This patch provides functionality to audit system call events on the ARM platform. The implementation was based off the structure of the MIPS platform and information in this (http://lists.fedoraproject.org/pipermail/arm/2009-October/000382.html) mailing list thread. The required audit_syscall_exit and audit_syscall_entry checks were added to ptrace using the standard registers for system call values (r0 through r3). A thread information flag was added for auditing (TIF_SYSCALL_AUDIT) and a meta-flag was added (_TIF_SYSCALL_WORK) to simplify modifications to the syscall entry/exit. Now, if either the TRACE flag is set or the AUDIT flag is set, the syscall_trace function will be executed. The prober changes were made to Kconfig to allow CONFIG_AUDITSYSCALL to be enabled. Due to platform availability limitations, this patch was only tested on the Android platform running the modified "android-goldfish-2.6.29" kernel. A test compile was performed using Code Sourcery's cross-compilation toolset and the current linux-3.0 stable kernel. The changes compile without error. I'm hoping, due to the simple modifications, the patch is "obviously correct". Signed-off-by: Nathaniel Husted <nhusted@gmail.com> Signed-off-by: Eric Paris <eparis@redhat.com>
2012-01-17audit: only allow tasks to set their loginuid if it is -1Eric Paris1-0/+14
At the moment we allow tasks to set their loginuid if they have CAP_AUDIT_CONTROL. In reality we want tasks to set the loginuid when they log in and it be impossible to ever reset. We had to make it mutable even after it was once set (with the CAP) because on update and admin might have to restart sshd. Now sshd would get his loginuid and the next user which logged in using ssh would not be able to set his loginuid. Systemd has changed how userspace works and allowed us to make the kernel work the way it should. With systemd users (even admins) are not supposed to restart services directly. The system will restart the service for them. Thus since systemd is going to loginuid==-1, sshd would get -1, and sshd would be allowed to set a new loginuid without special permissions. If an admin in this system were to manually start an sshd he is inserting himself into the system chain of trust and thus, logically, it's his loginuid that should be used! Since we have old systems I make this a Kconfig option. Signed-off-by: Eric Paris <eparis@redhat.com>
2012-01-14Merge tag 'for-linus' of git://github.com/rustyrussell/linuxLinus Torvalds1-1/+1
Autogenerated GPG tag for Rusty D1ADB8F1: 15EE 8D6C AB0E 7F0C F999 BFCB D920 0E6C D1AD B8F1 * tag 'for-linus' of git://github.com/rustyrussell/linux: module_param: check that bool parameters really are bool. intelfbdrv.c: bailearly is an int module_param paride/pcd: fix bool verbose module parameter. module_param: make bool parameters really bool (drivers & misc) module_param: make bool parameters really bool (arch) module_param: make bool parameters really bool (core code) kernel/async: remove redundant declaration. printk: fix unnecessary module_param_name. lirc_parallel: fix module parameter description. module_param: avoid bool abuse, add bint for special cases. module_param: check type correctness for module_param_array modpost: use linker section to generate table. modpost: use a table rather than a giant if/else statement. modules: sysfs - export: taint, coresize, initsize kernel/params: replace DEBUGP with pr_debug module: replace DEBUGP with pr_debug module: struct module_ref should contains long fields module: Fix performance regression on modules with large symbol tables module: Add comments describing how the "strmap" logic works Fix up conflicts in scripts/mod/file2alias.c due to the new linker- generated table approach to adding __mod_*_device_table entries. The ARM sa11x0 mcp bus needed to be converted to that too.
2012-01-12c/r: introduce CHECKPOINT_RESTORE symbolCyrill Gorcunov1-0/+11
For checkpoint/restore we need auxilary features being compiled into the kernel, such as additional prctl codes, /proc/<pid>/map_files and etc... but same time these features are not mandatory for a regular kernel so CHECKPOINT_RESTORE config symbol should bring a way to disable them all at once if one wish to get rid of additional functionality. Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org> Cc: Tejun Heo <tj@kernel.org> Cc: Andrew Vagin <avagin@openvz.org> Cc: Serge Hallyn <serge.hallyn@canonical.com> Cc: Vasiliy Kulikov <segoon@openwall.com> Reviewed-by: Kees Cook <keescook@chromium.org> Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Cc: Alexey Dobriyan <adobriyan@gmail.com> Cc: "Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-01-13module_param: make bool parameters really bool (core code)Rusty Russell1-1/+1
module_param(bool) used to counter-intuitively take an int. In fddd5201 (mid-2009) we allowed bool or int/unsigned int using a messy trick. It's time to remove the int/unsigned int option. For this version it'll simply give a warning, but it'll break next kernel version. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2012-01-11Merge branch 'sched-urgent-for-linus' of ↵Linus Torvalds1-1/+0
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip * 'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: sched: Fix lockup by limiting load-balance retries on lock-break sched: Fix CONFIG_CGROUP_SCHED dependency sched: Remove empty #ifdefs
2012-01-11Merge branch 'x86-platform-for-linus' of ↵Linus Torvalds1-0/+15
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip * 'x86-platform-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/intel config: Fix the APB_TIMER selection x86/mrst: Add additional debug prints for pb_keys x86/intel config: Revamp configuration to allow for Moorestown and Medfield x86/intel/scu/ipc: Match the changes in the x86 configuration x86/apb: Fix configuration constraints x86: Fix INTEL_MID silly x86/Kconfig: Cyclone-timer depends on x86-summit x86: Reduce clock calibration time during slave cpu startup x86/config: Revamp configuration for MID devices x86/sfi: Kill the IRQ as id hack
2012-01-11Merge branch 'x86-mm-for-linus' of ↵Linus Torvalds1-5/+0
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip * 'x86-mm-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/numa: Add constraints check for nid parameters mm, x86: Remove debug_pagealloc_enabled x86/mm: Initialize high mem before free_all_bootmem() arch/x86/kernel/e820.c: quiet sparse noise about plain integer as NULL pointer arch/x86/kernel/e820.c: Eliminate bubble sort from sanitize_e820_map() x86: Fix mmap random address range x86, mm: Unify zone_sizes_init() x86, mm: Prepare zone_sizes_init() for unification x86, mm: Use max_low_pfn for ZONE_NORMAL on 64-bit x86, mm: Wrap ZONE_DMA32 with CONFIG_ZONE_DMA32 x86, mm: Use max_pfn instead of highend_pfn x86, mm: Move zone init from paging_init() on 64-bit x86, mm: Use MAX_DMA_PFN for ZONE_DMA on 32-bit
2012-01-10Merge branch 'nfs-for-3.3' of git://git.linux-nfs.org/projects/trondmy/linux-nfsLinus Torvalds1-4/+31
* 'nfs-for-3.3' of git://git.linux-nfs.org/projects/trondmy/linux-nfs: NFSv4: Change the default setting of the nfs4_disable_idmapping parameter NFSv4: Save the owner/group name string when doing open NFS: Remove pNFS bloat from the generic write path pnfs-obj: Must return layout on IO error pnfs-obj: pNFS errors are communicated on iodata->pnfs_error NFS: Cache state owners after files are closed NFS: Clean up nfs4_find_state_owners_locked() NFSv4: include bitmap in nfsv4 get acl data nfs: fix a minor do_div portability issue NFSv4.1: cleanup comment and debug printk NFSv4.1: change nfs4_free_slot parameters for dynamic slots NFSv4.1: cleanup init and reset of session slot tables NFSv4.1: fix backchannel slotid off-by-one bug nfs: fix regression in handling of context= option in NFSv4 NFS - fix recent breakage to NFS error handling. NFS: Retry mounting NFSROOT SUNRPC: Clean up the RPCSEC_GSS service ticket requests
2012-01-10sched: Fix CONFIG_CGROUP_SCHED dependencyFabio Estevam1-1/+0
The dependency bug was pointed out by this build warning: warning: (SCHED_AUTOGROUP) selects CGROUP_SCHED which has unmet direct dependencies (CGROUPS && EXPERIMENTAL) Signed-off-by: Fabio Estevam <fabio.estevam@freescale.com> Cc: a.p.zijlstra@chello.nl Cc: Fabio Estevam <festevam@gmail.com> Link: http://lkml.kernel.org/r/1326192383-5113-1-git-send-email-festevam@gmail.com Signed-off-by: Ingo Molnar <mingo@elte.hu>
2012-01-08Merge branch 'for-linus2' of ↵Linus Torvalds2-8/+10
git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs * 'for-linus2' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: (165 commits) reiserfs: Properly display mount options in /proc/mounts vfs: prevent remount read-only if pending removes vfs: count unlinked inodes vfs: protect remounting superblock read-only vfs: keep list of mounts for each superblock vfs: switch ->show_options() to struct dentry * vfs: switch ->show_path() to struct dentry * vfs: switch ->show_devname() to struct dentry * vfs: switch ->show_stats to struct dentry * switch security_path_chmod() to struct path * vfs: prefer ->dentry->d_sb to ->mnt->mnt_sb vfs: trim includes a bit switch mnt_namespace ->root to struct mount vfs: take /proc/*/mounts and friends to fs/proc_namespace.c vfs: opencode mntget() mnt_set_mountpoint() vfs: spread struct mount - remaining argument of next_mnt() vfs: move fsnotify junk to struct mount vfs: move mnt_devname vfs: move mnt_list to struct mount vfs: switch pnode.h macros to struct mount * ...
2012-01-06vfs: prefer ->dentry->d_sb to ->mnt->mnt_sbAl Viro1-4/+6
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2012-01-06Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-nextLinus Torvalds1-0/+11
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next: (1958 commits) net: pack skb_shared_info more efficiently net_sched: red: split red_parms into parms and vars net_sched: sfq: extend limits cnic: Improve error recovery on bnx2x devices cnic: Re-init dev->stats_addr after chip reset net_sched: Bug in netem reordering bna: fix sparse warnings/errors bna: make ethtool_ops and strings const xgmac: cleanups net: make ethtool_ops const vmxnet3" make ethtool ops const xen-netback: make ops structs const virtio_net: Pass gfp flags when allocating rx buffers. ixgbe: FCoE: Add support for ndo_get_fcoe_hbainfo() call netdev: FCoE: Add new ndo_get_fcoe_hbainfo() call igb: reset PHY after recovering from PHY power down igb: add basic runtime PM support igb: Add support for byte queue limits. e1000: cleanup CE4100 MDIO registers access e1000: unmap ce4100_gbe_mdio_base_virt in e1000_remove ...
2012-01-06Merge branch 'core-rcu-for-linus' of ↵Linus Torvalds1-5/+5
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip * 'core-rcu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (64 commits) cpu: Export cpu_up() rcu: Apply ACCESS_ONCE() to rcu_boost() return value Revert "rcu: Permit rt_mutex_unlock() with irqs disabled" docs: Additional LWN links to RCU API rcu: Augment rcu_batch_end tracing for idle and callback state rcu: Add rcutorture tests for srcu_read_lock_raw() rcu: Make rcutorture test for hotpluggability before offlining CPUs driver-core/cpu: Expose hotpluggability to the rest of the kernel rcu: Remove redundant rcu_cpu_stall_suppress declaration rcu: Adaptive dyntick-idle preparation rcu: Keep invoking callbacks if CPU otherwise idle rcu: Irq nesting is always 0 on rcu_enter_idle_common rcu: Don't check irq nesting from rcu idle entry/exit rcu: Permit dyntick-idle with callbacks pending rcu: Document same-context read-side constraints rcu: Identify dyntick-idle CPUs on first force_quiescent_state() pass rcu: Remove dynticks false positives and RCU failures rcu: Reduce latency of rcu_prepare_for_idle() rcu: Eliminate RCU_FAST_NO_HZ grace-period hang rcu: Avoid needlessly IPIing CPUs at GP end ...
2012-01-05NFS: Retry mounting NFSROOTChuck Lever1-4/+31
Lukas Razik <linux@razik.name> reports that on his SPARC system, booting with an NFS root file system stopped working after commit 56463e50 "NFS: Use super.c for NFSROOT mount option parsing." We found that the network switch to which Lukas' client was attached was delaying access to the LAN after the client's NIC driver reported that its link was up. The delay was longer than the timeouts used in the NFS client during mounting. NFSROOT worked for Lukas before commit 56463e50 because in those kernels, the client's first operation was an rpcbind request to determine which port the NFS server was listening on. When that request failed after a long timeout, the client simply selected the default NFS port (2049). By that time the switch was allowing access to the LAN, and the mount succeeded. Neither of these client behaviors is desirable, so reverting 56463e50 is really not a choice. Instead, introduce a mechanism that retries the NFSROOT mount request several times. This is the same tactic that normal user space NFS mounts employ to overcome server and network delays. Signed-off-by: Lukas Razik <linux@razik.name> [ cel: match kernel coding style, add proper patch description ] [ cel: add exponential back-off ] Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Tested-by: Lukas Razik <linux@razik.name> Cc: stable@kernel.org # > 2.6.38 Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2012-01-03init/initramfs.c: should use umode_tAl Viro1-4/+4
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2011-12-12Basic kernel memory functionality for the Memory ControllerGlauber Costa1-0/+11
This patch lays down the foundation for the kernel memory component of the Memory Controller. As of today, I am only laying down the following files: * memory.independent_kmem_limit * memory.kmem.limit_in_bytes (currently ignored) * memory.kmem.usage_in_bytes (always zero) Signed-off-by: Glauber Costa <glommer@parallels.com> CC: Kirill A. Shutemov <kirill@shutemov.name> CC: Paul Menage <paul@paulmenage.org> CC: Greg Thelen <gthelen@google.com> CC: Johannes Weiner <jweiner@redhat.com> CC: Michal Hocko <mhocko@suse.cz> Signed-off-by: David S. Miller <davem@davemloft.net>
2011-12-11rcu: Permit RCU_FAST_NO_HZ to be used by TREE_PREEMPT_RCUPaul E. McKenney1-5/+5
The new implementation of RCU_FAST_NO_HZ is compatible with preemptible RCU, so this commit removes the Kconfig restriction that previously prohibited this. Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Reviewed-by: Josh Triplett <josh@joshtriplett.org>
2011-12-06mm, x86: Remove debug_pagealloc_enabledStanislaw Gruszka1-5/+0
When (no)bootmem finish operation, it pass pages to buddy allocator. Since debug_pagealloc_enabled is not set, we will do not protect pages, what is not what we want with CONFIG_DEBUG_PAGEALLOC=y. To fix remove debug_pagealloc_enabled. That variable was introduced by commit 12d6f21e "x86: do not PSE on CONFIG_DEBUG_PAGEALLOC=y" to get more CPA (change page attribude) code testing. But currently we have CONFIG_CPA_DEBUG, which test CPA. Signed-off-by: Stanislaw Gruszka <sgruszka@redhat.com> Acked-by: Mel Gorman <mgorman@suse.de> Cc: linux-mm@kvack.org Link: http://lkml.kernel.org/r/1322582711-14571-1-git-send-email-sgruszka@redhat.com Signed-off-by: Ingo Molnar <mingo@elte.hu>
2011-12-06init/main.c: Execute lockdep_init() as early as possibleMing Lei1-2/+1
This patch fixes a lockdep warning on ARM platforms: [ 0.000000] WARNING: lockdep init error! Arch code didn't call lockdep_init() early enough? [ 0.000000] Call stack leading to lockdep invocation was: [ 0.000000] [<c00164bc>] save_stack_trace_tsk+0x0/0x90 [ 0.000000] [<ffffffff>] 0xffffffff The warning is caused by printk inside smp_setup_processor_id(). It is safe to do this because lockdep_init() doesn't depend on smp_setup_processor_id(), so improve things that printk can be called as early as possible without lockdep complaint. Signed-off-by: Ming Lei <tom.leiming@gmail.com> Reviewed-by: Yong Zhang <yong.zhang0@gmail.com> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Andrew Morton <akpm@linux-foundation.org> Link: http://lkml.kernel.org/r/1321508072-23853-1-git-send-email-tom.leiming@gmail.com Signed-off-by: Ingo Molnar <mingo@elte.hu>
2011-12-05x86: Reduce clock calibration time during slave cpu startupJack Steiner1-0/+15
Reduce the startup time for slave cpus. Adds hooks for an arch-specific function for clock calibration. These hooks are used on x86. If a newly started cpu has the same phys_proc_id as a core already active, uses the TSC for the delay loop and has a CONSTANT_TSC, use the already-calculated value of loops_per_jiffy. This patch reduces the time required to start slave cpus on a 4096 cpu system from: 465 sec OLD 62 sec NEW This reduces boot time on a 4096p system by almost 7 minutes. Nice... Signed-off-by: Jack Steiner <steiner@sgi.com> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: John Stultz <john.stultz@linaro.org> [fix CONFIG_SMP=n build] Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2011-11-06Merge branch 'upstream/jump-label-noearly' of ↵Linus Torvalds1-0/+3
git://git.kernel.org/pub/scm/linux/kernel/git/jeremy/xen * 'upstream/jump-label-noearly' of git://git.kernel.org/pub/scm/linux/kernel/git/jeremy/xen: jump-label: initialize jump-label subsystem much earlier x86/jump_label: add arch_jump_label_transform_static() s390/jump-label: add arch_jump_label_transform_static() jump_label: add arch_jump_label_transform_static() to optimise non-live code updates sparc/jump_label: drop arch_jump_label_text_poke_early() x86/jump_label: drop arch_jump_label_text_poke_early() jump_label: if a key has already been initialized, don't nop it out stop_machine: make stop_machine safe and efficient to call early jump_label: use proper atomic_t initializer Conflicts: - arch/x86/kernel/jump_label.c Added __init_or_module to arch_jump_label_text_poke_early vs removal of that function entirely - kernel/stop_machine.c same patch ("stop_machine: make stop_machine safe and efficient to call early") merged twice, with whitespace fix in one version
2011-11-02sysctl: make CONFIG_SYSCTL_SYSCALL default to nWANG Cong1-2/+2
When I tried to send a patch to remove it, Andi told me we still need to keep compabitlies for old libc, so we can't remove this completely. Then just make it default to n and remove the doc from feature-removal-schedule.txt. Signed-off-by: WANG Cong <amwang@redhat.com> Cc: Eric Biederman <ebiederm@xmission.com> Cc: Andi Kleen <andi@firstfloor.org> Cc: Alexey Dobriyan <adobriyan@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-11-02init: add root=PARTUUID=UUID/PARTNROFF=%d supportWill Drewry1-5/+43
Expand root=PARTUUID=UUID syntax to support selecting a root partition by integer offset from a known, unique partition. This approach provides similar properties to specifying a device and partition number, but using the UUID as the unique path prior to evaluating the offset. For example, root=PARTUUID=99DE9194-FC15-4223-9192-FC243948F88B/PARTNROFF=1 selects the partition with UUID 99DE.. then select the next partition. This change is motivated by a particular usecase in Chromium OS where the bootloader can easily determine what partition it is on (by UUID) but doesn't perform general partition table walking. That said, support for this model provides a direct mechanism for the user to modify the root partition to boot without specifically needing to extract each UUID or update the bootloader explicitly when the root partition UUID is changed (if it is recreated to be larger, for instance). Pinning to a /boot-style partition UUID allows the arbitrary root partition reconfiguration/modifications with slightly less ambiguity than just [dev][partition] and less stringency than the specific root partition UUID. [sfr@canb.auug.org.au: fix init sections warning] Signed-off-by: Will Drewry <wad@chromium.org> Cc: Kay Sievers <kay.sievers@vrfy.org> Cc: Randy Dunlap <rdunlap@xenotime.net> Cc: Namhyung Kim <namhyung@gmail.com> Cc: Trond Myklebust <Trond.Myklebust@netapp.com> Cc: Jens Axboe <axboe@kernel.dk> Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-11-02init/do_mounts_rd.c: fix ramdisk identification for padded cramfsNeil Armstrong1-0/+14
When a cramfs ramdisk padded with 512 bytes is given to the kernel, the current identify_ramdisk_image function fails to identify it. Tested with a padded cramfs image on an ARM based board. Signed-off-by: Neil Armstrong <narmstrong@neotion.com> Cc: Namhyung Kim <namhyung@gmail.com> Cc: Davidlohr Bueso <dave@gnu.org> Cc: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-10-26Merge branch 'sched-core-for-linus' of ↵Linus Torvalds1-0/+12
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip * 'sched-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (46 commits) llist: Add back llist_add_batch() and llist_del_first() prototypes sched: Don't use tasklist_lock for debug prints sched: Warn on rt throttling sched: Unify the ->cpus_allowed mask copy sched: Wrap scheduler p->cpus_allowed access sched: Request for idle balance during nohz idle load balance sched: Use resched IPI to kick off the nohz idle balance sched: Fix idle_cpu() llist: Remove cpu_relax() usage in cmpxchg loops sched: Convert to struct llist llist: Add llist_next() irq_work: Use llist in the struct irq_work logic llist: Return whether list is empty before adding in llist_add() llist: Move cpu_relax() to after the cmpxchg() llist: Remove the platform-dependent NMI checks llist: Make some llist functions inline sched, tracing: Show PREEMPT_ACTIVE state in trace_sched_switch sched: Remove redundant test in check_preempt_tick() sched: Add documentation for bandwidth control sched: Return unused runtime on group dequeue ...
2011-10-26Merge branch 'core-rcu-for-linus' of ↵Linus Torvalds1-3/+3
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip * 'core-rcu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (45 commits) rcu: Move propagation of ->completed from rcu_start_gp() to rcu_report_qs_rsp() rcu: Remove rcu_needs_cpu_flush() to avoid false quiescent states rcu: Wire up RCU_BOOST_PRIO for rcutree rcu: Make rcu_torture_boost() exit loops at end of test rcu: Make rcu_torture_fqs() exit loops at end of test rcu: Permit rt_mutex_unlock() with irqs disabled rcu: Avoid having just-onlined CPU resched itself when RCU is idle rcu: Suppress NMI backtraces when stall ends before dump rcu: Prohibit grace periods during early boot rcu: Simplify unboosting checks rcu: Prevent early boot set_need_resched() from __rcu_pending() rcu: Dump local stack if cannot dump all CPUs' stacks rcu: Move __rcu_read_unlock()'s barrier() within if-statement rcu: Improve rcu_assign_pointer() and RCU_INIT_POINTER() documentation rcu: Make rcu_assign_pointer() unconditionally insert a memory barrier rcu: Make rcu_implicit_dynticks_qs() locals be correct size rcu: Eliminate in_irq() checks in rcu_enter_nohz() nohz: Remove nohz_cpu_mask rcu: Document interpretation of RCU-lockdep splats rcu: Allow rcutorture's stat_interval parameter to be changed at runtime ...
2011-10-26params: make dashes and underscores in parameter names truly equalMichal Schmidt1-2/+2
The user may use "foo-bar" for a kernel parameter defined as "foo_bar". Make sure it works the other way around too. Apply the equality of dashes and underscores on early_params and __setup params as well. The example given in Documentation/kernel-parameters.txt indicates that this is the intended behaviour. With the patch the kernel accepts "log-buf-len=1M" as expected. https://bugzilla.redhat.com/show_bug.cgi?id=744545 Signed-off-by: Michal Schmidt <mschmidt@redhat.com> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> (neatened implementations)
2011-10-25jump-label: initialize jump-label subsystem much earlierJeremy Fitzhardinge1-0/+3
Initialize jump_labels much, much earlier, so they're available for use during system setup. Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com> Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
2011-10-04Merge branch 'linus' into sched/coreIngo Molnar1-5/+14
Merge reason: pick up the latest fixes. Signed-off-by: Ingo Molnar <mingo@elte.hu>
2011-10-01Merge branch 'rcu/next' of git://github.com/paulmckrcu/linux into core/rcuIngo Molnar1-3/+3
2011-09-29bootup: move 'usermodehelper_enable()' a little earlierwangyanqing1-1/+1
Commit d5767c53535a ("bootup: move 'usermodehelper_enable()' to the end of do_basic_setup()") moved 'usermodehelper_enable()' to end of do_basic_setup() to after the initcalls. But then I get failed to let uvesafb work on my computer, and lose the splash boot. So maybe we could start usermodehelper_enable a little early to make some task work that need eary init with the help of user mode. [ I would *really* prefer that initcalls not call into user space - even the real 'init' hasn't been execve'd yet, after all! But for uvesafb it really does look like we don't have much choice. I considered doing this when we mount the root filesystem, but depending on config options that is in multiple places. We could do the usermode helper enable as a rootfs_initcall().. So I'm just using wang yanqing's trivial patch. It's not wonderful, but it's simple and should work. We should revisit this some day, though. - Linus ] Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-09-28rcu: Drive configuration directly from SMP and PREEMPTPaul E. McKenney1-3/+3
This commit eliminates the possibility of running TREE_PREEMPT_RCU when SMP=n and of running TINY_RCU when PREEMPT=y. People who really want these combinations can hand-edit init/Kconfig, but eliminating them as choices for production systems reduces the amount of testing required. It will also allow cutting out a few #ifdefs. Note that running TREE_RCU and TINY_RCU on single-CPU systems using SMP-built kernels is still supported. Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2011-09-28bootup: move 'usermodehelper_enable()' to the end of do_basic_setup()Linus Torvalds1-3/+1
Doing it just before starting to call into cpu_idle() made a sick kind of sense only because the original bug we fixed (see commit 288d5abec831: "Boot up with usermodehelper disabled") was about problems with some scheduler data structures not being initialized, and they had better be initialized at that point. But it really didn't make any other conceptual sense, and doing it after the initial "schedule()" call for the idle thread actually opened up a race: what if the main initialization thread did everything without needing to sleep, and got all the way into user land too? Without actually having scheduled back to the idle thread? Now, in normal circumstances that doesn't ever happen, but it looks like Richard Cochran triggered exactly that on his ARM IXP4xx machines: "I have some ARM IXP4xx based machines that use the two on chip MAC ports (aka NPEs). The NPE needs a firmware in order to function. Ever since the following commit [that 288d5abec831 one], it is no longer possible to bring up the interfaces during the init scripts." with a call trace showing an ioctl coming from user space. Richard says: "The init is busybox, and the startup script does mount, syslogd, and then ifup, so that all can go by quickly." The fix is to move the usermodehelper_enable() into the main 'init' thread, and just put it after we've done all our initcalls. By then, everything really should be up, but we've obviously not actually started the user-mode portion of init yet. Reported-and-tested-by: Richard Cochran <richardcochran@gmail.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-09-21init: carefully handle loglevel option on kernel cmdline.Alexander Sverdlin1-2/+13
When a malformed loglevel value (for example "${abc}") is passed on the kernel cmdline, the loglevel itself is being set to 0. That then suppresses all following messages, including all the errors and crashes caused by other malformed cmdline options. This could make debugging process quite tricky. This patch leaves the previous value of loglevel if the new value is incorrect and reports an error code in this case. Signed-off-by: Alexander Sverdlin <alexander.sverdlin@sysgo.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-08-14sched: Introduce primitives to account for CFS bandwidth trackingPaul Turner1-0/+12
In this patch we introduce the notion of CFS bandwidth, partitioned into globally unassigned bandwidth, and locally claimed bandwidth. - The global bandwidth is per task_group, it represents a pool of unclaimed bandwidth that cfs_rqs can allocate from. - The local bandwidth is tracked per-cfs_rq, this represents allotments from the global pool bandwidth assigned to a specific cpu. Bandwidth is managed via cgroupfs, adding two new interfaces to the cpu subsystem: - cpu.cfs_period_us : the bandwidth period in usecs - cpu.cfs_quota_us : the cpu bandwidth (in usecs) that this tg will be allowed to consume over period above. Signed-off-by: Paul Turner <pjt@google.com> Signed-off-by: Nikhil Rao <ncrao@google.com> Signed-off-by: Bharata B Rao <bharata@linux.vnet.ibm.com> Reviewed-by: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/20110721184756.972636699@google.com Signed-off-by: Ingo Molnar <mingo@elte.hu>
2011-08-03Boot up with usermodehelper disabledLinus Torvalds1-1/+4
The core device layer sends tons of uevent notifications for each device it finds, and if the kernel has been built with a non-empty CONFIG_UEVENT_HELPER_PATH that will make us try to execute the usermode helper binary for all these events very early in the boot. Not only won't the root filesystem even be mounted at that point, we literally won't have necessarily even initialized all the process handling data structures at that point, which causes no end of silly problems even when the usermode helper doesn't actually succeed in executing. So just use our existing infrastructure to disable the usermodehelpers to make the kernel start out with them disabled. We enable them when we've at least initialized stuff a bit. Problems related to an uninitialized init_ipc_ns.ids[IPC_SHM_IDS].rw_mutex reported by various people. Reported-by: Manuel Lauss <manuel.lauss@googlemail.com> Reported-by: Richard Weinberger <richard@nod.at> Reported-by: Marc Zyngier <maz@misterjones.org> Acked-by: Kay Sievers <kay.sievers@vrfy.org> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Vasiliy Kulikov <segoon@openwall.com> Cc: Greg KH <greg@kroah.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-08-03tmpfs: miscellaneous trivial cleanupsHugh Dickins1-1/+1
While it's at its least, make a number of boring nitpicky cleanups to shmem.c, mostly for consistency of variable naming. Things like "swap" instead of "entry", "pgoff_t index" instead of "unsigned long idx". And since everything else here is prefixed "shmem_", better change init_tmpfs() to shmem_init(). Signed-off-by: Hugh Dickins <hughd@google.com> Acked-by: Rik van Riel <riel@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-07-25init: skip calibration delay if previously doneSameer Nanda1-1/+10
For each CPU, do the calibration delay only once. For subsequent calls, use the cached per-CPU value of loops_per_jiffy. This saves about 200ms of resume time on dual core Intel Atom N5xx based systems. This helps bring down the kernel resume time on such systems from about 500ms to about 300ms. [akpm@linux-foundation.org: make cpu_loops_per_jiffy static] [akpm@linux-foundation.org: clean up message text] [akpm@linux-foundation.org: fix things up after upstream rmk changes] Signed-off-by: Sameer Nanda <snanda@chromium.org> Cc: Phil Carmody <ext-phil.2.carmody@nokia.com> Cc: Andrew Worsley <amworsley@gmail.com> Cc: David Daney <ddaney@caviumnetworks.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-07-25mm: remove the leftovers of noswapaccountWANG Cong1-2/+2
In commit a2c8990aed5ab ("memsw: remove noswapaccount kernel parameter"), Michal forgot to remove some left pieces of noswapaccount in the tree, this patch removes them all. Signed-off-by: WANG Cong <xiyou.wangcong@gmail.com> Acked-by: Michal Hocko <mhocko@suse.cz> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-07-23Merge branches 'x86-urgent-for-linus', 'core-debug-for-linus', ↵Linus Torvalds1-0/+2
'irq-core-for-linus' and 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: um: Make rwsem.S depend on CONFIG_RWSEM_XCHGADD_ALGORITHM * 'core-debug-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: debug: Make CONFIG_EXPERT select CONFIG_DEBUG_KERNEL to unhide debug options * 'irq-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: genirq: Remove unused CHECK_IRQ_PER_CPU() * 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: perf tools, x86: Fix 32-bit compile on 64-bit system
2011-07-22Merge branch 'timers-cleanup-for-linus' of ↵Linus Torvalds1-1/+6
git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'timers-cleanup-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: mips: Fix i8253 clockevent fallout i8253: Cleanup outb/inb magic arm: Footbridge: Use common i8253 clockevent mips: Use common i8253 clockevent x86: Use common i8253 clockevent i8253: Create common clockevent implementation i8253: Export i8253_lock unconditionally pcpskr: MIPS: Make config dependencies finer grained pcspkr: Cleanup Kconfig dependencies i8253: Move remaining content and delete asm/i8253.h i8253: Consolidate definitions of PIT_LATCH x86: i8253: Consolidate definitions of global_clock_event i8253: Alpha, PowerPC: Remove unused asm/8253pit.h alpha: i8253: Cleanup remaining users of i8253pit.h i8253: Remove I8253_LOCK config i8253: Make pcsp sound driver use the shared i8253_lock i8253: Make pcspkr input driver use the shared i8253_lock i8253: Consolidate all kernel definitions of i8253_lock i8253: Unify all kernel declarations of i8253_lock i8253: Create linux/i8253.h and use it in all 8253 related files
2011-06-23Fix CPU spinlock lockups on secondary CPU bringupRussell King1-6/+8
Secondary CPU bringup typically calls calibrate_delay() during its initialization. However, calibrate_delay() modifies a global variable (loops_per_jiffy) used for udelay() and __delay(). A side effect of 71c696b1 ("calibrate: extract fall-back calculation into own helper") introduced in the 2.6.39 merge window means that we end up with a substantial period where loops_per_jiffy is zero. This causes the spinlock debugging code to malfunction: u64 loops = loops_per_jiffy * HZ; for (;;) { for (i = 0; i < loops; i++) { if (arch_spin_trylock(&lock->raw_lock)) return; __delay(1); } ... } by never calling arch_spin_trylock() - resulting in the CPU locking up in an infinite loop inside __spin_lock_debug(). Work around this by only writing to loops_per_jiffy only once we have completed all the calibration decisions. Tested-by: Santosh Shilimkar <santosh.shilimkar@ti.com> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk> Cc: <stable@kernel.org> (2.6.39-stable) -- Better solutions (such as omitting the calibration for secondary CPUs, or arranging for calibrate_delay() to return the LPJ value and leave it to the caller to decide where to store it) are a possibility, but would be much more invasive into each architecture. I think this is the best solution for -rc and stable, but it should be revisited for the next merge window. init/calibrate.c | 14 ++++++++------ 1 files changed, 8 insertions(+), 6 deletions(-) Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-06-17generic-ipi: Fix kexec boot crash by initializing call_single_queue before ↵Takao Indoh1-0/+1
enabling interrupts There is a problem that kdump(2nd kernel) sometimes hangs up due to a pending IPI from 1st kernel. Kernel panic occurs because IPI comes before call_single_queue is initialized. To fix the crash, rename init_call_single_data() to call_function_init() and call it in start_kernel() so that call_single_queue can be initialized before enabling interrupts. The details of the crash are: (1) 2nd kernel boots up (2) A pending IPI from 1st kernel comes when irqs are first enabled in start_kernel(). (3) Kernel tries to handle the interrupt, but call_single_queue is not initialized yet at this point. As a result, in the generic_smp_call_function_single_interrupt(), NULL pointer dereference occurs when list_replace_init() tries to access &q->list.next. Therefore this patch changes the name of init_call_single_data() to call_function_init() and calls it before local_irq_enable() in start_kernel(). Signed-off-by: Takao Indoh <indou.takao@jp.fujitsu.com> Reviewed-by: WANG Cong <xiyou.wangcong@gmail.com> Acked-by: Neil Horman <nhorman@tuxdriver.com> Acked-by: Vivek Goyal <vgoyal@redhat.com> Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Milton Miller <miltonm@bga.com> Cc: Jens Axboe <axboe@kernel.dk> Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Cc: kexec@lists.infradead.org Link: http://lkml.kernel.org/r/D6CBEE2F420741indou.takao@jp.fujitsu.com Signed-off-by: Ingo Molnar <mingo@elte.hu>
2011-06-15gcov: disable CONFIG_CONSTRUCTORS when not needed by CONFIG_GCOV_KERNELJosh Triplett1-1/+0
CONFIG_CONSTRUCTORS controls support for running constructor functions at kernel init time. According to commit b99b87f70c7785ab ("kernel: constructor support"), gcov (CONFIG_GCOV_KERNEL) needs this. However, CONFIG_CONSTRUCTORS currently defaults to y, with no option to disable it, and CONFIG_GCOV_KERNEL depends on it. Instead, default it to n and have CONFIG_GCOV_KERNEL select it, so that the normal case of CONFIG_GCOV_KERNEL=n will result in CONFIG_CONSTRUCTORS=n. Observed in the short list of =y values in a minimal kernel configuration. Signed-off-by: Josh Triplett <josh@joshtriplett.org> Acked-by: WANG Cong <xiyou.wangcong@gmail.com> Acked-by: Peter Oberparleiter <peter.oberparleiter@de.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-06-15init/calibrate.c: remove annoying printkBorislav Petkov1-3/+0
Remove calibrate_delay_direct()'s KERN_DEBUG printk related to bogomips calculation as it appears when booting every core on setups with 'ignore_loglevel' which dmesg people scan for possible issues. As the message doesn't show very useful information to the widest audience of kernel boot message gazers, it should be removed. Introduced by commit d2b463135f84 ("init/calibrate.c: fix for critical bogoMIPS intermittent calculation failure"). Signed-off-by: Borislav Petkov <borislav.petkov@amd.com> Cc: Andrew Worsley <amworsley@gmail.com> Cc: Phil Carmody <ext-phil.2.carmody@nokia.com> Cc: Peter Zijlstra <peterz@infradead.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-06-15uts: make default hostname configurable, rather than always using "(none)"Josh Triplett1-0/+9
The "hostname" tool falls back to setting the hostname to "localhost" if /etc/hostname does not exist. Distribution init scripts have the same fallback. However, if userspace never calls sethostname, such as when booting with init=/bin/sh, or otherwise booting a minimal system without the usual init scripts, the default hostname of "(none)" remains, unhelpfully appearing in various places such as prompts ("root@(none):~#") and logs. Furthermore, "(none)" doesn't typically resolve to anything useful. Make the default hostname configurable. This removes the need for the standard fallback, provides a useful default for systems that never call sethostname, and makes minimal systems that much more useful with less configuration. Distributions could choose to use "localhost" here to avoid the fallback, while embedded systems may wish to use a specific target hostname. Signed-off-by: Josh Triplett <josh@joshtriplett.org> Acked-by: Linus Torvalds <torvalds@linux-foundation.org> Acked-by: David Miller <davem@davemloft.net> Cc: Serge Hallyn <serue@us.ibm.com> Cc: Kel Modderman <kel@otaku42.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-06-09pcspkr: Cleanup Kconfig dependenciesRalf Baechle1-1/+5
Lenghty lists of the kind "depends on ARCH1 || ARCH2 ... || ARCH123" are usually either wrong or too coarse grained. Or plain an ugly sin. [ tglx: Fixed up amigaone ] Signed-off-by: Ralf Baechle <ralf@linux-mips.org> Cc: Richard Henderson <rth@twiddle.net> Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru> Cc: Matt Turner <mattst88@gmail.com> Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Paul Mackerras <paulus@samba.org> Cc: linux-alpha@vger.kernel.org Cc: linux-mips@linux-mips.org Cc: linuxppc-dev@lists.ozlabs.org Cc: Gerhard Pircher <gerhard_pircher@gmx.net> Link: http://lkml.kernel.org/r/20110601180610.984881988@duck.linux-mips.net Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2011-06-09i8253: Consolidate all kernel definitions of i8253_lockRalf Baechle1-0/+1
Move them to drivers/clocksource/i8253.c and remove the implementations in arch/ [ tglx: Avoid the extra file in lib - folded arch patches in. The export will become conditional in a later step ] Signed-off-by: Ralf Baechle <ralf@linux-mips.org> Link: http://lkml.kernel.org/r/20110601180610.221426078@duck.linux-mips.net Cc: Russell King <linux@arm.linux.org.uk> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>