summaryrefslogtreecommitdiff
path: root/arch
AgeCommit message (Collapse)AuthorFilesLines
2012-09-14x86, microcode, AMD: Fix broken ucode patch size checkAndreas Herrmann1-3/+4
commit 36bf50d7697be18c6bfd0401e037df10bff1e573 upstream. This issue was recently observed on an AMD C-50 CPU where a patch of maximum size was applied. Commit be62adb49294 ("x86, microcode, AMD: Simplify ucode verification") added current_size in get_matching_microcode(). This is calculated as size of the ucode patch + 8 (ie. size of the header). Later this is compared against the maximum possible ucode patch size for a CPU family. And of course this fails if the patch has already maximum size. Signed-off-by: Andreas Herrmann <andreas.herrmann3@amd.com> Signed-off-by: Borislav Petkov <borislav.petkov@amd.com> Link: http://lkml.kernel.org/r/1344361461-10076-1-git-send-email-bp@amd64.org Signed-off-by: H. Peter Anvin <hpa@linux.intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2012-09-14PARISC: Redefine ATOMIC_INIT and ATOMIC64_INIT to drop the castsMel Gorman1-2/+2
commit bba3d8c3b3c0f2123be5bc687d1cddc13437c923 upstream. The following build error occured during a parisc build with swap-over-NFS patches applied. net/core/sock.c:274:36: error: initializer element is not constant net/core/sock.c:274:36: error: (near initialization for 'memalloc_socks') net/core/sock.c:274:36: error: initializer element is not constant Dave Anglin says: > Here is the line in sock.i: > > struct static_key memalloc_socks = ((struct static_key) { .enabled = > ((atomic_t) { (0) }) }); The above line contains two compound literals. It also uses a designated initializer to initialize the field enabled. A compound literal is not a constant expression. The location of the above statement isn't fully clear, but if a compound literal occurs outside the body of a function, the initializer list must consist of constant expressions. Reported-by: Fengguang Wu <fengguang.wu@intel.com> Signed-off-by: Mel Gorman <mgorman@suse.de> Signed-off-by: James Bottomley <JBottomley@Parallels.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2012-09-14powerpc: Make sure IPI handlers see data written by IPI sendersPaul Mackerras3-3/+16
commit 9fb1b36ca1234e64a5d1cc573175303395e3354d upstream. We have been observing hangs, both of KVM guest vcpu tasks and more generally, where a process that is woken doesn't properly wake up and continue to run, but instead sticks in TASK_WAKING state. This happens because the update of rq->wake_list in ttwu_queue_remote() is not ordered with the update of ipi_message in smp_muxed_ipi_message_pass(), and the reading of rq->wake_list in scheduler_ipi() is not ordered with the reading of ipi_message in smp_ipi_demux(). Thus it is possible for the IPI receiver not to see the updated rq->wake_list and therefore conclude that there is nothing for it to do. In order to make sure that anything done before smp_send_reschedule() is ordered before anything done in the resulting call to scheduler_ipi(), this adds barriers in smp_muxed_message_pass() and smp_ipi_demux(). The barrier in smp_muxed_message_pass() is a full barrier to ensure that there is a full ordering between the smp_send_reschedule() caller and scheduler_ipi(). In smp_ipi_demux(), we use xchg() rather than xchg_local() because xchg() includes release and acquire barriers. Using xchg() rather than xchg_local() makes sense given that ipi_message is not just accessed locally. This moves the barrier between setting the message and calling the cause_ipi() function into the individual cause_ipi implementations. Most of them -- those that used outb, out_8 or similar -- already had a full barrier because out_8 etc. include a sync before the MMIO store. This adds an explicit barrier in the two remaining cases. These changes made no measurable difference to the speed of IPIs as measured using a simple ping-pong latency test across two CPUs on different cores of a POWER7 machine. The analysis of the reason why processes were not waking up properly is due to Milton Miller. Reported-by: Milton Miller <miltonm@bga.com> Signed-off-by: Paul Mackerras <paulus@samba.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2012-09-14powerpc: Restore correct DSCR in context switchAnton Blanchard2-6/+18
commit 714332858bfd40dcf8f741498336d93875c23aa7 upstream. During a context switch we always restore the per thread DSCR value. If we aren't doing explicit DSCR management (ie thread.dscr_inherit == 0) and the default DSCR changed while the process has been sleeping we end up with the wrong value. Check thread.dscr_inherit and select the default DSCR or per thread DSCR as required. This was found with the following test case, when running with more threads than CPUs (ie forcing context switching): http://ozlabs.org/~anton/junkcode/dscr_default_test.c With the four patches applied I can run a combination of all test cases successfully at the same time: http://ozlabs.org/~anton/junkcode/dscr_default_test.c http://ozlabs.org/~anton/junkcode/dscr_explicit_test.c http://ozlabs.org/~anton/junkcode/dscr_inherit_test.c Signed-off-by: Anton Blanchard <anton@samba.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2012-09-14powerpc: Fix DSCR inheritance in copy_thread()Anton Blanchard1-10/+2
commit 1021cb268b3025573c4811f1dee4a11260c4507b upstream. If the default DSCR is non zero we set thread.dscr_inherit in copy_thread() meaning the new thread and all its children will ignore future updates to the default DSCR. This is not intended and is a change in behaviour that a number of our users have hit. We just need to inherit thread.dscr and thread.dscr_inherit from the parent which ends up being much simpler. This was found with the following test case: http://ozlabs.org/~anton/junkcode/dscr_default_test.c Signed-off-by: Anton Blanchard <anton@samba.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2012-09-14powerpc: Keep thread.dscr and thread.dscr_inherit in syncAnton Blanchard2-2/+5
commit 00ca0de02f80924dfff6b4f630e1dff3db005e35 upstream. When we update the DSCR either via emulation of mtspr(DSCR) or via a change to dscr_default in sysfs we don't update thread.dscr. We will eventually update it at context switch time but there is a period where thread.dscr is incorrect. If we fork at this point we will copy the old value of thread.dscr into the child. To avoid this, always keep thread.dscr in sync with reality. This issue was found with the following testcase: http://ozlabs.org/~anton/junkcode/dscr_inherit_test.c Signed-off-by: Anton Blanchard <anton@samba.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2012-09-14powerpc: Update DSCR on all CPUs when writing sysfs dscr_defaultAnton Blanchard1-0/+8
commit 1b6ca2a6fe56e7697d57348646e07df08f43b1bb upstream. Writing to dscr_default in sysfs doesn't actually change the DSCR - we rely on a context switch on each CPU to do the work. There is no guarantee we will get a context switch in a reasonable amount of time so fire off an IPI to force an immediate change. This issue was found with the following test case: http://ozlabs.org/~anton/junkcode/dscr_explicit_test.c Signed-off-by: Anton Blanchard <anton@samba.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2012-09-14x32: Use compat shims for {g,s}etsockoptMike Frysinger1-2/+4
commit 515c7af85ed92696c311c53d53cb4898ff32d784 upstream. Some of the arguments to {g,s}etsockopt are passed in userland pointers. If we try to use the 64bit entry point, we end up sometimes failing. For example, dhcpcd doesn't run in x32: # dhcpcd eth0 dhcpcd[1979]: version 5.5.6 starting dhcpcd[1979]: eth0: broadcasting for a lease dhcpcd[1979]: eth0: open_socket: Invalid argument dhcpcd[1979]: eth0: send_raw_packet: Bad file descriptor The code in particular is getting back EINVAL when doing: struct sock_fprog pf; setsockopt(s, SOL_SOCKET, SO_ATTACH_FILTER, &pf, sizeof(pf)); Diving into the kernel code, we can see: include/linux/filter.h: struct sock_fprog { unsigned short len; struct sock_filter __user *filter; }; net/core/sock.c: case SO_ATTACH_FILTER: ret = -EINVAL; if (optlen == sizeof(struct sock_fprog)) { struct sock_fprog fprog; ret = -EFAULT; if (copy_from_user(&fprog, optval, sizeof(fprog))) break; ret = sk_attach_filter(&fprog, sk); } break; arch/x86/syscalls/syscall_64.tbl: 54 common setsockopt sys_setsockopt 55 common getsockopt sys_getsockopt So for x64, sizeof(sock_fprog) is 16 bytes. For x86/x32, it's 8 bytes. This comes down to the pointer being 32bit for x32, which means we need to do structure size translation. But since x32 comes in directly to sys_setsockopt, it doesn't get translated like x86. After changing the syscall table and rebuilding glibc with the new kernel headers, dhcp runs fine in an x32 userland. Oddly, it seems like Linus noted the same thing during the initial port, but I guess that was missed/lost along the way: https://lkml.org/lkml/2011/8/26/452 [ hpa: tagging for -stable since this is an ABI fix. ] Bugzilla: https://bugs.gentoo.org/423649 Reported-by: Mads <mads@ab3.no> Signed-off-by: Mike Frysinger <vapier@gentoo.org> Link: http://lkml.kernel.org/r/1345320697-15713-1-git-send-email-vapier@gentoo.org Cc: H. J. Lu <hjl.tools@gmail.com> Signed-off-by: H. Peter Anvin <hpa@zytor.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2012-09-14mm: hugetlbfs: correctly populate shared pmdMichal Hocko1-5/+16
commit eb48c071464757414538c68a6033c8f8c15196f8 upstream. Each page mapped in a process's address space must be correctly accounted for in _mapcount. Normally the rules for this are straightforward but hugetlbfs page table sharing is different. The page table pages at the PMD level are reference counted while the mapcount remains the same. If this accounting is wrong, it causes bugs like this one reported by Larry Woodman: kernel BUG at mm/filemap.c:135! invalid opcode: 0000 [#1] SMP CPU 22 Modules linked in: bridge stp llc sunrpc binfmt_misc dcdbas microcode pcspkr acpi_pad acpi] Pid: 18001, comm: mpitest Tainted: G W 3.3.0+ #4 Dell Inc. PowerEdge R620/07NDJ2 RIP: 0010:[<ffffffff8112cfed>] [<ffffffff8112cfed>] __delete_from_page_cache+0x15d/0x170 Process mpitest (pid: 18001, threadinfo ffff880428972000, task ffff880428b5cc20) Call Trace: delete_from_page_cache+0x40/0x80 truncate_hugepages+0x115/0x1f0 hugetlbfs_evict_inode+0x18/0x30 evict+0x9f/0x1b0 iput_final+0xe3/0x1e0 iput+0x3e/0x50 d_kill+0xf8/0x110 dput+0xe2/0x1b0 __fput+0x162/0x240 During fork(), copy_hugetlb_page_range() detects if huge_pte_alloc() shared page tables with the check dst_pte == src_pte. The logic is if the PMD page is the same, they must be shared. This assumes that the sharing is between the parent and child. However, if the sharing is with a different process entirely then this check fails as in this diagram: parent | ------------>pmd src_pte----------> data page ^ other--------->pmd--------------------| ^ child-----------| dst_pte For this situation to occur, it must be possible for Parent and Other to have faulted and failed to share page tables with each other. This is possible due to the following style of race. PROC A PROC B copy_hugetlb_page_range copy_hugetlb_page_range src_pte == huge_pte_offset src_pte == huge_pte_offset !src_pte so no sharing !src_pte so no sharing (time passes) hugetlb_fault hugetlb_fault huge_pte_alloc huge_pte_alloc huge_pmd_share huge_pmd_share LOCK(i_mmap_mutex) find nothing, no sharing UNLOCK(i_mmap_mutex) LOCK(i_mmap_mutex) find nothing, no sharing UNLOCK(i_mmap_mutex) pmd_alloc pmd_alloc LOCK(instantiation_mutex) fault UNLOCK(instantiation_mutex) LOCK(instantiation_mutex) fault UNLOCK(instantiation_mutex) These two processes are not poing to the same data page but are not sharing page tables because the opportunity was missed. When either process later forks, the src_pte == dst pte is potentially insufficient. As the check falls through, the wrong PTE information is copied in (harmless but wrong) and the mapcount is bumped for a page mapped by a shared page table leading to the BUG_ON. This patch addresses the issue by moving pmd_alloc into huge_pmd_share which guarantees that the shared pud is populated in the same critical section as pmd. This also means that huge_pte_offset test in huge_pmd_share is serialized correctly now which in turn means that the success of the sharing will be higher as the racing tasks see the pud and pmd populated together. Race identified and changelog written mostly by Mel Gorman. {akpm@linux-foundation.org: attempt to make the huge_pmd_share() comment comprehensible, clean up coding style] Reported-by: Larry Woodman <lwoodman@redhat.com> Tested-by: Larry Woodman <lwoodman@redhat.com> Reviewed-by: Mel Gorman <mgorman@suse.de> Signed-off-by: Michal Hocko <mhocko@suse.cz> Reviewed-by: Rik van Riel <riel@redhat.com> Cc: David Gibson <david@gibson.dropbear.id.au> Cc: Ken Chen <kenchen@google.com> Cc: Cong Wang <xiyou.wangcong@gmail.com> Cc: Hillf Danton <dhillf@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2012-09-14alpha: Don't export SOCK_NONBLOCK to user space.Michael Cree1-0/+2
commit a2fa3ccd7b43665fe14cb562761a6c3d26a1d13f upstream. Currently we export SOCK_NONBLOCK to user space but that conflicts with the definition from glibc leading to compilation errors in user programs (e.g. see Debian bug #658460). The generic socket.h restricts the definition of SOCK_NONBLOCK to the kernel, as does the MIPS specific socket.h, so let's do the same on Alpha. Signed-off-by: Michael Cree <mcree@orcon.net.nz> Acked-by: Matt Turner <mattst88@gmail.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2012-09-14alpha: fix fpu.h usage in userspaceMike Frysinger1-0/+2
commit 0be421862b857e61964435ffcaa7499cf77a5e5a upstream. After commit ec2212088c42 ("Disintegrate asm/system.h for Alpha"), the fpu.h header which we install for userland started depending on special_insns.h which is not installed. However, fpu.h only uses that for __KERNEL__ code, so protect the inclusion the same way to avoid build breakage in glibc: /usr/include/asm/fpu.h:4:31: fatal error: asm/special_insns.h: No such file or directory Reported-by: Matt Turner <mattst88@gentoo.org> Signed-off-by: Mike Frysinger <vapier@gentoo.org> Signed-off-by: Michael Cree <mcree@orcon.net.nz> Acked-by: Matt Turner <mattst88@gmail.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2012-09-14ARM: imx: select CPU_FREQ_TABLE when neededArnd Bergmann1-0/+1
commit f637c4c9405e21f44cf0045eaf77eddd3a79ca5a upstream. The i.MX cpufreq implementation uses the CPU_FREQ_TABLE helpers, so it needs to select that code to be built. This problem has apparently existed since the i.MX cpufreq code was first merged in v2.6.37. Building IMX without CPU_FREQ_TABLE results in: arch/arm/plat-mxc/built-in.o: In function `mxc_cpufreq_exit': arch/arm/plat-mxc/cpufreq.c:173: undefined reference to `cpufreq_frequency_table_put_attr' arch/arm/plat-mxc/built-in.o: In function `mxc_set_target': arch/arm/plat-mxc/cpufreq.c:84: undefined reference to `cpufreq_frequency_table_target' arch/arm/plat-mxc/built-in.o: In function `mxc_verify_speed': arch/arm/plat-mxc/cpufreq.c:65: undefined reference to `cpufreq_frequency_table_verify' arch/arm/plat-mxc/built-in.o: In function `mxc_cpufreq_init': arch/arm/plat-mxc/cpufreq.c:154: undefined reference to `cpufreq_frequency_table_cpuinfo' arch/arm/plat-mxc/cpufreq.c:162: undefined reference to `cpufreq_frequency_table_get_attr' Signed-off-by: Arnd Bergmann <arnd@arndb.de> Acked-by: Shawn Guo <shawn.guo@linaro.org> Cc: Sascha Hauer <s.hauer@pengutronix.de> Cc: Yong Shen <yong.shen@linaro.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2012-09-14ARM: imx6: spin the cpu until hardware takes it downShawn Guo1-20/+3
commit c944b0b9354ea06ffb0c8a7178949f1185f9f499 upstream. Though commit 602bf40 (ARM: imx6: exit coherency when shutting down a cpu) improves the stability of imx6q cpu hotplug a lot, there are still hangs seen with a more stressful hotplug testing. It's expected that once imx_enable_cpu(cpu, false) is called, the cpu will be taken down by hardware immediately, and the code after that will not get any chance to execute. However, this is not always the case from the testing. The cpu could possibly be alive for a few cycles before hardware actually takes it down. So rather than letting cpu execute some code that could cause a hang in these cycles, let's make the cpu spin there and wait for hardware to take it down. Signed-off-by: Shawn Guo <shawn.guo@linaro.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2012-09-14xen/setup: Fix one-off error when adding for-balloon PFNs to the P2M.Konrad Rzeszutek Wilk1-1/+8
commit c96aae1f7f393387d160211f60398d58463a7e65 upstream. When we are finished with return PFNs to the hypervisor, then populate it back, and also mark the E820 MMIO and E820 gaps as IDENTITY_FRAMEs, we then call P2M to set areas that can be used for ballooning. We were off by one, and ended up over-writting a P2M entry that most likely was an IDENTITY_FRAME. For example: 1-1 mapping on 40000->40200 1-1 mapping on bc558->bc5ac 1-1 mapping on bc5b4->bc8c5 1-1 mapping on bc8c6->bcb7c 1-1 mapping on bcd00->100000 Released 614 pages of unused memory Set 277889 page(s) to 1-1 mapping Populating 40200-40466 pfn range: 614 pages added => here we set from 40466 up to bc559 P2M tree to be INVALID_P2M_ENTRY. We should have done it up to bc558. The end result is that if anybody is trying to construct a PTE for PFN bc558 they end up with ~PAGE_PRESENT. Reported-by-and-Tested-by: Andre Przywara <andre.przywara@amd.com> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2012-09-14ARM: S3C24XX: Fix s3c2410_dma_enqueue parametersHeiko Stuebner1-1/+1
commit b01858c7806e7e6f6121da2e51c9222fc4d21dc6 upstream. Commit d670ac019f60 (ARM: SAMSUNG: DMA Cleanup as per sparse) changed the prototype of the s3c2410_dma_* functions to use the enum dma_ch instead of an generic unsigned int. In the s3c24xx dma.c s3c2410_dma_enqueue seems to have been forgotten, the other functions there were changed correctly. Signed-off-by: Heiko Stuebner <heiko@sntech.de> Signed-off-by: Kukjin Kim <kgene.kim@samsung.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2012-09-14ARM: S3C24XX: Add missing DMACH_DT_PROPHeiko Stuebner1-1/+2
commit e1267371eacf2cbcf580e41f9e64a986cdaf5c1d upstream. Commit 2b90807549 (spi: s3c64xx: add device tree support) requires the DMACH_DT_PROP element in the dma_ch enum. It's not used on non-DT platforms but has to be present nevertheless. So mimic the dummy-add of DMACH_DT_PROP on s3c64xx for s3c24xx machines, to correct the build breakage for the s3c24xx variants using the s3c64xx-spi-driver. Signed-off-by: Heiko Stuebner <heiko@sntech.de> Signed-off-by: Kukjin Kim <kgene.kim@samsung.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2012-09-14ARM: OMAP2+: Fix dmtimer set source clock failureJon Hunter1-1/+1
commit 54f32a35f4d3a653a18a2c8c239f19ae060bd803 upstream. Calling the dmtimer function omap_dm_timer_set_source() fails if following a call to pm_runtime_put() to disable the timer. For example the following sequence would fail to set the parent clock ... omap_dm_timer_stop(gptimer); omap_dm_timer_set_source(gptimer, OMAP_TIMER_SRC_32_KHZ); The following error message would be seen ... omap_dm_timer_set_source: failed to set timer_32k_ck as parent The problem is that, by design, pm_runtime_put() simply decrements the usage count and returns before the timer has actually been disabled. Therefore, setting the parent clock failed because the timer was still active when the trying to set the parent clock. Setting a parent clock will fail if the clock you are setting the parent of has a non-zero usage count. To ensure that this does not fail use pm_runtime_put_sync() when disabling the timer. Note that this will not be seen on OMAP1 devices, because these devices do not use the clock framework for dmtimers. Signed-off-by: Jon Hunter <jon-hunter@ti.com> Acked-by: Kevin Hilman <khilman@ti.com> Signed-off-by: Tony Lindgren <tony@atomide.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2012-09-14ARM: 7489/1: errata: fix workaround for erratum #720789 on UP systemsWill Deacon1-3/+3
commit 730a8128cd8978467eb1cf546b11014acb57d433 upstream. Commit 5a783cbc4836 ("ARM: 7478/1: errata: extend workaround for erratum #720789") added workarounds for erratum #720789 to the range TLB invalidation functions with the observation that the erratum only affects SMP platforms. However, when running an SMP_ON_UP kernel on a uniprocessor platform we must take care to preserve the ASID as the workaround is not required. This patch ensures that we don't set the ASID to 0 when flushing the TLB on such a system, preserving the original behaviour with the workaround disabled. Signed-off-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2012-09-14ARM: 7488/1: mm: use 5 bits for swapfile type encodingWill Deacon1-3/+3
commit f5f2025ef3e2cdb593707cbf87378761f17befbe upstream. Page migration encodes the pfn in the offset field of a swp_entry_t. For LPAE, we support physical addresses of up to 36 bits (due to sparsemem limitations with the size of page flags), requiring 24 bits to represent a pfn. A further 3 bits are used to encode a swp_entry into a pte, leaving 5 bits for the type field. Furthermore, the core code defines MAX_SWAPFILES_SHIFT as 5, so the additional type bit does not get used. This patch reduces the width of the type field to 5 bits, allowing us to create up to 31 swapfiles of 64GB each. Reviewed-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2012-09-14ARM: 7487/1: mm: avoid setting nG bit for user mappings that aren't presentWill Deacon2-18/+18
commit 47f1204329237a0f8655f5a9f14a38ac81946ca1 upstream. Swap entries are encoding in ptes such that !pte_present(pte) and pte_file(pte). The remaining bits of the descriptor are used to identify the swapfile and offset within it to the swap entry. When writing such a pte for a user virtual address, set_pte_at unconditionally sets the nG bit, which (in the case of LPAE) will corrupt the swapfile offset and lead to a BUG: [ 140.494067] swap_free: Unused swap offset entry 000763b4 [ 140.509989] BUG: Bad page map in process rs:main Q:Reg pte:0ec76800 pmd:8f92e003 This patch fixes the problem by only setting the nG bit for user mappings that are actually present. Reviewed-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2012-09-14ARM: 7483/1: vfp: only advertise VFPv4 in hwcaps if CONFIG_VFPv3 is enabledWill Deacon1-0/+2
commit 3d9fb0038a9b02febb01efc79a4a5d97f1822a90 upstream. VFPv4 support depends on the VFPv3 context save/restore code, so only advertise support in the hwcaps if the kernel can actually handle it. Signed-off-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2012-08-26xen: mark local pages as FOREIGN in the m2p_overrideStefano Stabellini1-0/+36
commit b9e0d95c041ca2d7ad297ee37c2e9cfab67a188f upstream. When the frontend and the backend reside on the same domain, even if we add pages to the m2p_override, these pages will never be returned by mfn_to_pfn because the check "get_phys_to_machine(pfn) != mfn" will always fail, so the pfn of the frontend will be returned instead (resulting in a deadlock because the frontend pages are already locked). INFO: task qemu-system-i38:1085 blocked for more than 120 seconds. "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. qemu-system-i38 D ffff8800cfc137c0 0 1085 1 0x00000000 ffff8800c47ed898 0000000000000282 ffff8800be4596b0 00000000000137c0 ffff8800c47edfd8 ffff8800c47ec010 00000000000137c0 00000000000137c0 ffff8800c47edfd8 00000000000137c0 ffffffff82213020 ffff8800be4596b0 Call Trace: [<ffffffff81101ee0>] ? __lock_page+0x70/0x70 [<ffffffff81a0fdd9>] schedule+0x29/0x70 [<ffffffff81a0fe80>] io_schedule+0x60/0x80 [<ffffffff81101eee>] sleep_on_page+0xe/0x20 [<ffffffff81a0e1ca>] __wait_on_bit_lock+0x5a/0xc0 [<ffffffff81101ed7>] __lock_page+0x67/0x70 [<ffffffff8106f750>] ? autoremove_wake_function+0x40/0x40 [<ffffffff811867e6>] ? bio_add_page+0x36/0x40 [<ffffffff8110b692>] set_page_dirty_lock+0x52/0x60 [<ffffffff81186021>] bio_set_pages_dirty+0x51/0x70 [<ffffffff8118c6b4>] do_blockdev_direct_IO+0xb24/0xeb0 [<ffffffff811e71a0>] ? ext3_get_blocks_handle+0xe00/0xe00 [<ffffffff8118ca95>] __blockdev_direct_IO+0x55/0x60 [<ffffffff811e71a0>] ? ext3_get_blocks_handle+0xe00/0xe00 [<ffffffff811e91c8>] ext3_direct_IO+0xf8/0x390 [<ffffffff811e71a0>] ? ext3_get_blocks_handle+0xe00/0xe00 [<ffffffff81004b60>] ? xen_mc_flush+0xb0/0x1b0 [<ffffffff81104027>] generic_file_aio_read+0x737/0x780 [<ffffffff813bedeb>] ? gnttab_map_refs+0x15b/0x1e0 [<ffffffff811038f0>] ? find_get_pages+0x150/0x150 [<ffffffff8119736c>] aio_rw_vect_retry+0x7c/0x1d0 [<ffffffff811972f0>] ? lookup_ioctx+0x90/0x90 [<ffffffff81198856>] aio_run_iocb+0x66/0x1a0 [<ffffffff811998b8>] do_io_submit+0x708/0xb90 [<ffffffff81199d50>] sys_io_submit+0x10/0x20 [<ffffffff81a18d69>] system_call_fastpath+0x16/0x1b The explanation is in the comment within the code: We need to do this because the pages shared by the frontend (xen-blkfront) can be already locked (lock_page, called by do_read_cache_page); when the userspace backend tries to use them with direct_IO, mfn_to_pfn returns the pfn of the frontend, so do_blockdev_direct_IO is going to try to lock the same pages again resulting in a deadlock. A simplified call graph looks like this: pygrub QEMU ----------------------------------------------- do_read_cache_page io_submit | | lock_page ext3_direct_IO | bio_add_page | lock_page Internally the xen-blkback uses m2p_add_override to swizzle (temporarily) a 'struct page' to have a different MFN (so that it can point to another guest). It also can easily find out whether another pfn corresponding to the mfn exists in the m2p, and can set the FOREIGN bit in the p2m, making sure that mfn_to_pfn returns the pfn of the backend. This allows the backend to perform direct_IO on these pages, but as a side effect prevents the frontend from using get_user_pages_fast on them while they are being shared with the backend. Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2012-08-26s390/compat: fix mmap compat system callsHeiko Carstens1-2/+0
commit e85871218513c54f7dfdb6009043cb638f2fecbe upstream. The native 31 bit and the compat behaviour for the mmap system calls differ: In native 31 bit mode the passed in address for the mmap system call will be unmodified passed to sys_mmap_pgoff(). In compat mode however the passed in address will be modified with compat_ptr() which masks out the most significant bit. The result is that in native 31 bit mode each mmap request (with MAP_FIXED) will fail where the most significat bit is set, while in compat mode it may succeed. This odd behaviour was introduced with d3815898 "[S390] mmap: add missing compat_ptr conversion to both mmap compat syscalls". To restore a consistent behaviour accross native and compat mode this patch functionally reverts the above mentioned commit. Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2012-08-26s390/compat: fix compat wrappers for process_vm system callsHeiko Carstens1-2/+2
commit 82aabdb6f1eb61e0034ec23901480f5dd23db7c4 upstream. The compat wrappers incorrectly called the non compat versions of the system process_vm system calls. Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2012-08-15Input: eeti_ts: pass gpio value instead of IRQArnd Bergmann1-1/+1
commit 4eef6cbfcc03b294d9d334368a851b35b496ce53 upstream. The EETI touchscreen asserts its IRQ line as soon as it has data in its internal buffers. The line is automatically deasserted once all data has been read via I2C. Hence, the driver has to monitor the GPIO line and cannot simply rely on the interrupt handler reception. In the current implementation of the driver, irq_to_gpio() is used to determine the GPIO number from the i2c_client's IRQ value. As irq_to_gpio() is not available on all platforms, this patch changes this and makes the driver ignore the passed in IRQ. Instead, a GPIO is added to the platform_data struct and gpio_to_irq is used to derive the IRQ from that GPIO. If this fails, bail out. The driver is only able to work in environments where the touchscreen GPIO can be mapped to an IRQ. Without this patch, building raumfeld_defconfig results in: drivers/input/touchscreen/eeti_ts.c: In function 'eeti_ts_irq_active': drivers/input/touchscreen/eeti_ts.c:65:2: error: implicit declaration of function 'irq_to_gpio' [-Werror=implicit-function-declaration] Signed-off-by: Daniel Mack <zonque@gmail.com> Signed-off-by: Arnd Bergmann <arnd@arndb.de> Cc: Dmitry Torokhov <dmitry.torokhov@gmail.com> Cc: Sven Neumann <s.neumann@raumfeld.com> Cc: linux-input@vger.kernel.org Cc: Haojian Zhuang <haojian.zhuang@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2012-08-15ARM: dts: imx53-ard: add regulators for lan9220Shawn Guo1-0/+20
commit 1eec0c569523782392b5e6245effddb626213b8c upstream. Since commit c7e963f (net/smsc911x: Add regulator support), the lan9220 device tree probe fails on imx53-ard board, because the commit makes VDD33A and VDDVARIO supplies mandatory for the driver. Add a fixed dummy 3V3 supplying lan9220 to fix the regression. Signed-off-by: Shawn Guo <shawn.guo@linaro.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2012-08-15ARM: mxs: Remove MMAP_MIN_ADDR setting from mxs_defconfigMarek Vasut1-1/+0
commit 3bed491c8d28329e34f8a31e3fe64d03f3a350f1 upstream. The CONFIG_DEFAULT_MMAP_MIN_ADDR was set to 65536 in mxs_defconfig, this caused severe breakage of userland applications since the upper limit for ARM is 32768. By default CONFIG_DEFAULT_MMAP_MIN_ADDR is set to 4096 and can also be changed via /proc/sys/vm/mmap_min_addr if needed. Quoting Russell King [1]: "4096 is also fine for ARM too. There's not much point in having defconfigs change it - that would just be pure noise in the config files." the CONFIG_DEFAULT_MMAP_MIN_ADDR can be removed from the defconfig altogether. This problem was introduced by commit cde7c41 (ARM: configs: add defconfig for mach-mxs). [1] http://marc.info/?l=linux-arm-kernel&m=134401593807820&w=2 Signed-off-by: Marek Vasut <marex@denx.de> Cc: Russell King <linux@arm.linux.org.uk> Cc: Wolfgang Denk <wd@denx.de> Signed-off-by: Shawn Guo <shawn.guo@linaro.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2012-08-15x86, microcode: Sanitize per-cpu microcode reloading interfaceBorislav Petkov1-7/+19
commit c9fc3f778a6a215ace14ee556067c73982b6d40f upstream. Microcode reloading in a per-core manner is a very bad idea for both major x86 vendors. And the thing is, we have such interface with which we can end up with different microcode versions applied on different cores of an otherwise homogeneous wrt (family,model,stepping) system. So turn off the possibility of doing that per core and allow it only system-wide. This is a minimal fix which we'd like to see in stable too thus the more-or-less arbitrary decision to allow system-wide reloading only on the BSP: $ echo 1 > /sys/devices/system/cpu/cpu0/microcode/reload ... and disable the interface on the other cores: $ echo 1 > /sys/devices/system/cpu/cpu23/microcode/reload -bash: echo: write error: Invalid argument Also, allowing the reload only from one CPU (the BSP in that case) doesn't allow the reload procedure to degenerate into an O(n^2) deal when triggering reloads from all /sys/devices/system/cpu/cpuX/microcode/reload sysfs nodes simultaneously. A more generic fix will follow. Signed-off-by: Borislav Petkov <borislav.petkov@amd.com> Cc: Henrique de Moraes Holschuh <hmh@hmh.eng.br> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/1340280437-7718-2-git-send-email-bp@amd64.org Signed-off-by: H. Peter Anvin <hpa@zytor.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2012-08-15x86, microcode: microcode_core.c simple_strtoul cleanupShuah Khan1-5/+4
commit e826abd523913f63eb03b59746ffb16153c53dc4 upstream. Change reload_for_cpu() in kernel/microcode_core.c to call kstrtoul() instead of calling obsoleted simple_strtoul(). Signed-off-by: Shuah Khan <shuahkhan@gmail.com> Reviewed-by: Borislav Petkov <bp@alien8.de> Link: http://lkml.kernel.org/r/1336324264.2897.9.camel@lorien2 Signed-off-by: H. Peter Anvin <hpa@linux.intel.com> Cc: Henrique de Moraes Holschuh <hmh@hmh.eng.br> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2012-08-15random: remove rand_initialize_irq()Theodore Ts'o1-1/+0
commit c5857ccf293968348e5eb4ebedc68074de3dcda6 upstream. With the new interrupt sampling system, we are no longer using the timer_rand_state structure in the irq descriptor, so we can stop initializing it now. [ Merged in fixes from Sedat to find some last missing references to rand_initialize_irq() ] Signed-off-by: "Theodore Ts'o" <tytso@mit.edu> Signed-off-by: Sedat Dilek <sedat.dilek@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2012-08-15x86, nops: Missing break resulting in incorrect selection on IntelAlan Cox1-1/+1
commit d6250a3f12edb3a86db9598ffeca3de8b4a219e9 upstream. The Intel case falls through into the generic case which then changes the values. For cases like the P6 it doesn't do the right thing so this seems to be a screwup. Signed-off-by: Alan Cox <alan@linux.intel.com> Link: http://lkml.kernel.org/n/tip-lww2uirad4skzjlmrm0vru8o@git.kernel.org Signed-off-by: H. Peter Anvin <hpa@zytor.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2012-08-15ARM: Fix undefined instruction exception handlingRussell King4-62/+92
commit 15ac49b65024f55c4371a53214879a9c77c4fbf9 upstream. While trying to get a v3.5 kernel booted on the cubox, I noticed that VFP does not work correctly with VFP bounce handling. This is because of the confusion over 16-bit vs 32-bit instructions, and where PC is supposed to point to. The rule is that FP handlers are entered with regs->ARM_pc pointing at the _next_ instruction to be executed. However, if the exception is not handled, regs->ARM_pc points at the faulting instruction. This is easy for ARM mode, because we know that the next instruction and previous instructions are separated by four bytes. This is not true of Thumb2 though. Since all FP instructions are 32-bit in Thumb2, it makes things easy. We just need to select the appropriate adjustment. Do this by moving the adjustment out of do_undefinstr() into the assembly code, as only the assembly code knows whether it's dealing with a 32-bit or 16-bit instruction. Acked-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2012-08-15ARM: 7480/1: only call smp_send_stop() on SMPJavier Martinez Canillas1-1/+2
commit c5dff4ffd327088d85035bec535b7d0c9ea03151 upstream. On reboot or poweroff (machine_shutdown()) a call to smp_send_stop() is made (to stop the others CPU's) when CONFIG_SMP=y. arch/arm/kernel/process.c: void machine_shutdown(void) { #ifdef CONFIG_SMP smp_send_stop(); #endif } smp_send_stop() calls the function pointer smp_cross_call(), which is set on the smp_init_cpus() function for OMAP processors. arch/arm/mach-omap2/omap-smp.c: void __init smp_init_cpus(void) { ... set_smp_cross_call(gic_raise_softirq); ... } But the ARM setup_arch() function only calls smp_init_cpus() if CONFIG_SMP=y && is_smp(). arm/kernel/setup.c: void __init setup_arch(char **cmdline_p) { ... #ifdef CONFIG_SMP if (is_smp()) smp_init_cpus(); #endif ... } Newer OMAP CPU's are SMP machines so omap2plus_defconfig sets CONFIG_SMP=y. Unfortunately on an OMAP UP machine is_smp() returns false and smp_init_cpus() is never called and the smp_cross_call() function remains NULL. If the machine is rebooted or powered off, smp_send_stop() will be called (since CONFIG_SMP=y) leading to the following error: [ 42.815551] Restarting system. [ 42.819030] Unable to handle kernel NULL pointer dereference at virtual address 00000000 [ 42.827667] pgd = d7a74000 [ 42.830566] [00000000] *pgd=96ce7831, *pte=00000000, *ppte=00000000 [ 42.837249] Internal error: Oops: 80000007 [#1] SMP ARM [ 42.842773] Modules linked in: [ 42.846008] CPU: 0 Not tainted (3.5.0-rc3-next-20120622-00002-g62e87ba-dirty #44) [ 42.854278] PC is at 0x0 [ 42.856994] LR is at smp_send_stop+0x4c/0xe4 [ 42.861511] pc : [<00000000>] lr : [<c00183a4>] psr: 60000013 [ 42.861511] sp : d6c85e70 ip : 00000000 fp : 00000000 [ 42.873626] r10: 00000000 r9 : d6c84000 r8 : 00000002 [ 42.879150] r7 : c07235a0 r6 : c06dd2d0 r5 : 000f4241 r4 : d6c85e74 [ 42.886047] r3 : 00000000 r2 : 00000000 r1 : 00000006 r0 : d6c85e74 [ 42.892944] Flags: nZCv IRQs on FIQs on Mode SVC_32 ISA ARM Segment user [ 42.900482] Control: 10c5387d Table: 97a74019 DAC: 00000015 [ 42.906555] Process reboot (pid: 1166, stack limit = 0xd6c842f8) [ 42.912902] Stack: (0xd6c85e70 to 0xd6c86000) [ 42.917510] 5e60: c07235a0 00000000 00000000 d6c84000 [ 42.926177] 5e80: 01234567 c00143d0 4321fedc c00511bc d6c85ebc 00000168 00000460 00000000 [ 42.934814] 5ea0: c1017950 a0000013 c1017900 d8014390 d7ec3858 c0498e48 c1017950 00000000 [ 42.943481] 5ec0: d6ddde10 d6c85f78 00000003 00000000 d6ddde10 d6c84000 00000000 00000000 [ 42.952117] 5ee0: 00000002 00000000 00000000 c0088c88 00000002 00000000 00000000 c00f4b90 [ 42.960784] 5f00: 00000000 d6c85ebc d8014390 d7e311c8 60000013 00000103 00000002 d6c84000 [ 42.969421] 5f20: c00f3274 d6e00a00 00000001 60000013 d6c84000 00000000 00000000 c00895d4 [ 42.978057] 5f40: 00000002 d8007c80 d781f000 c00f6150 d8010cc0 c00f3274 d781f000 d6c84000 [ 42.986694] 5f60: c0013020 d6e00a00 00000001 20000010 0001257c ef000000 00000000 c00895d4 [ 42.995361] 5f80: 00000002 00000001 00000003 00000000 00000001 00000003 00000000 00000058 [ 43.003997] 5fa0: c00130c8 c0012f00 00000001 00000003 fee1dead 28121969 01234567 00000002 [ 43.012634] 5fc0: 00000001 00000003 00000000 00000058 00012584 0001257c 00000001 00000000 [ 43.021270] 5fe0: 000124bc bec5cc6c 00008f9c 4a2f7c40 20000010 fee1dead 00000000 00000000 [ 43.029968] [<c00183a4>] (smp_send_stop+0x4c/0xe4) from [<c00143d0>] (machine_restart+0xc/0x4c) [ 43.039154] [<c00143d0>] (machine_restart+0xc/0x4c) from [<c00511bc>] (sys_reboot+0x144/0x1f0) [ 43.048278] [<c00511bc>] (sys_reboot+0x144/0x1f0) from [<c0012f00>] (ret_fast_syscall+0x0/0x3c) [ 43.057464] Code: bad PC value [ 43.060760] ---[ end trace c3988d1dd0b8f0fb ]--- Add a check so smp_cross_call() is only called when there is more than one CPU on-line. Signed-off-by: Javier Martinez Canillas <javier at dowhile0.org> Acked-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2012-08-15ARM: 7479/1: mm: avoid NULL dereference when flushing gate_vma with VIVT cachesWill Deacon1-2/+6
commit b74253f78400f9a4b42da84bb1de7540b88ce7c4 upstream. The vivt_flush_cache_{range,page} functions check that the mm_struct of the VMA being flushed has been active on the current CPU before performing the cache maintenance. The gate_vma has a NULL mm_struct pointer and, as such, will cause a kernel fault if we try to flush it with the above operations. This happens during ELF core dumps, which include the gate_vma as it may be useful for debugging purposes. This patch adds checks to the VIVT cache flushing functions so that VMAs with a NULL mm_struct are flushed unconditionally (the vectors page may be dirty if we use it to store the current TLS pointer). Reported-by: Gilles Chanteperdrix <gilles.chanteperdrix@xenomai.org> Tested-by: Uros Bizjak <ubizjak@gmail.com> Signed-off-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2012-08-15ARM: 7478/1: errata: extend workaround for erratum #720789Will Deacon1-0/+12
commit 5a783cbc48367cfc7b65afc75430953dfe60098f upstream. Commit cdf357f1 ("ARM: 6299/1: errata: TLBIASIDIS and TLBIMVAIS operations can broadcast a faulty ASID") replaced by-ASID TLB flushing operations with all-ASID variants to workaround A9 erratum #720789. This patch extends the workaround to include the tlb_range operations, which were overlooked by the original patch. Tested-by: Steve Capper <steve.capper@arm.com> Signed-off-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2012-08-15ARM: 7477/1: vfp: Always save VFP state in vfp_pm_suspend on UPColin Cross1-0/+6
commit 24b35521b8ddf088531258f06f681bb7b227bf47 upstream. vfp_pm_suspend should save the VFP state in suspend after any lazy context switch. If it only saves when the VFP is enabled, the state can get lost when, on a UP system: Thread 1 uses the VFP Context switch occurs to thread 2, VFP is disabled but the VFP context is not saved Thread 2 initiates suspend vfp_pm_suspend is called with the VFP disabled, and the unsaved VFP context of Thread 1 in the registers Modify vfp_pm_suspend to save the VFP context whenever vfp_current_hw_state is not NULL. Includes a fix from Ido Yariv <ido@wizery.com>, who pointed out that on SMP systems, the state pointer can be pointing to a freed task struct if a task exited on another cpu, fixed by using #ifndef CONFIG_SMP in the new if clause. Signed-off-by: Colin Cross <ccross@android.com> Cc: Barry Song <bs14@csr.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Ido Yariv <ido@wizery.com> Cc: Daniel Drake <dsd@laptop.org> Cc: Will Deacon <will.deacon@arm.com> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2012-08-15ARM: 7476/1: vfp: only clear vfp state for current cpu in vfp_pm_suspendColin Cross1-1/+1
commit a84b895a2348f0dbff31b71ddf954f70a6cde368 upstream. vfp_pm_suspend runs on each cpu, only clear the hardware state pointer for the current cpu. Prevents a possible crash if one cpu clears the hw state pointer when another cpu has already checked if it is valid. Signed-off-by: Colin Cross <ccross@android.com> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2012-08-15ARM: 7466/1: disable interrupt before spinning endlesslyShawn Guo1-0/+2
commit 98bd8b96b26db3399a48202318dca4aaa2515355 upstream. The CPU will endlessly spin at the end of machine_halt and machine_restart calls. However, this will lead to a soft lockup warning after about 20 seconds, if CONFIG_LOCKUP_DETECTOR is enabled, as system timer is still alive. Disable interrupt before going to spin endlessly, so that the lockup warning will never be seen. Reported-by: Marek Vasut <marex@denx.de> Signed-off-by: Shawn Guo <shawn.guo@linaro.org> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2012-08-15Redefine ATOMIC_INIT and ATOMIC64_INIT to drop the castsTony Luck1-2/+2
commit a119365586b0130dfea06457f584953e0ff6481d upstream. The following build error occured during a ia64 build with swap-over-NFS patches applied. net/core/sock.c:274:36: error: initializer element is not constant net/core/sock.c:274:36: error: (near initialization for 'memalloc_socks') net/core/sock.c:274:36: error: initializer element is not constant This is identical to a parisc build error. Fengguang Wu, Mel Gorman and James Bottomley did all the legwork to track the root cause of the problem. This fix and entire commit log is shamelessly copied from them with one extra detail to change a dubious runtime use of ATOMIC_INIT() to atomic_set() in drivers/char/mspec.c Dave Anglin says: > Here is the line in sock.i: > > struct static_key memalloc_socks = ((struct static_key) { .enabled = > ((atomic_t) { (0) }) }); The above line contains two compound literals. It also uses a designated initializer to initialize the field enabled. A compound literal is not a constant expression. The location of the above statement isn't fully clear, but if a compound literal occurs outside the body of a function, the initializer list must consist of constant expressions. Signed-off-by: Tony Luck <tony.luck@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2012-08-09m68k: Correct the Atari ALLOWINT definitionMikael Pettersson1-2/+2
commit c663600584a596b5e66258cc10716fb781a5c2c9 upstream. Booting a 3.2, 3.3, or 3.4-rc4 kernel on an Atari using the `nfeth' ethernet device triggers a WARN_ONCE() in generic irq handling code on the first irq for that device: WARNING: at kernel/irq/handle.c:146 handle_irq_event_percpu+0x134/0x142() irq 3 handler nfeth_interrupt+0x0/0x194 enabled interrupts Modules linked in: Call Trace: [<000299b2>] warn_slowpath_common+0x48/0x6a [<000299c0>] warn_slowpath_common+0x56/0x6a [<00029a4c>] warn_slowpath_fmt+0x2a/0x32 [<0005b34c>] handle_irq_event_percpu+0x134/0x142 [<0005b34c>] handle_irq_event_percpu+0x134/0x142 [<0000a584>] nfeth_interrupt+0x0/0x194 [<001ba0a8>] schedule_preempt_disabled+0x0/0xc [<0005b37a>] handle_irq_event+0x20/0x2c [<0005add4>] generic_handle_irq+0x2c/0x3a [<00002ab6>] do_IRQ+0x20/0x32 [<0000289e>] auto_irqhandler_fixup+0x4/0x6 [<00003144>] cpu_idle+0x22/0x2e [<001b8a78>] printk+0x0/0x18 [<0024d112>] start_kernel+0x37a/0x386 [<0003021d>] __do_proc_dointvec+0xb1/0x366 [<0003021d>] __do_proc_dointvec+0xb1/0x366 [<0024c31e>] _sinittext+0x31e/0x9c0 After invoking the irq's handler the kernel sees !irqs_disabled() and concludes that the handler erroneously enabled interrupts. However, debugging shows that !irqs_disabled() is true even before the handler is invoked, which indicates a problem in the platform code rather than the specific driver. The warning does not occur in 3.1 or older kernels. It turns out that the ALLOWINT definition for Atari is incorrect. The Atari definition of ALLOWINT is ~0x400, the stated purpose of that is to avoid taking HSYNC interrupts. irqs_disabled() returns true if the 3-bit ipl & 4 is non-zero. The nfeth interrupt runs at ipl 3 (it's autovector 3), but 3 & 4 is zero so irqs_disabled() is false, and the warning above is generated. When interrupts are explicitly disabled, ipl is set to 7. When they are enabled, ipl is masked with ALLOWINT. On Atari this will result in ipl = 3, which blocks interrupts at ipl 3 and below. So how come nfeth interrupts at ipl 3 are received at all? That's because ipl is reset to 2 by Atari-specific code in default_idle(), again with the stated purpose of blocking HSYNC interrupts. This discrepancy means that ipl 3 can remain blocked for longer than intended. Both default_idle() and falcon_hblhandler() identify HSYNC with ipl 2, and the "Atari ST/.../F030 Hardware Register Listing" agrees, but ALLOWINT is defined as if HSYNC was ipl 3. [As an experiment I modified default_idle() to reset ipl to 3, and as expected that resulted in all nfeth interrupts being blocked.] The fix is simple: define ALLOWINT as ~0x500 instead. This makes arch_local_irq_enable() consistent with default_idle(), and prevents the !irqs_disabled() problems for ipl 3 interrupts. Tested on Atari running in an Aranym VM. Signed-off-by: Mikael Pettersson <mikpe@it.uu.se> Tested-by: Michael Schmitz <schmitzmic@googlemail.com> Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2012-08-09m68k: Make sys_atomic_cmpxchg_32 work on classic m68kAndreas Schwab1-2/+6
commit 9e2760d18b3cf179534bbc27692c84879c61b97c upstream. User space access must always go through uaccess accessors, since on classic m68k user space and kernel space are completely separate. Signed-off-by: Andreas Schwab <schwab@linux-m68k.org> Tested-by: Thorsten Glaser <tg@debian.org> Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2012-08-09posix_types.h: Cleanup stale __NFDBITS and related definitionsJosh Boyer1-1/+1
commit 8ded2bbc1845e19c771eb55209aab166ef011243 upstream. Recently, glibc made a change to suppress sign-conversion warnings in FD_SET (glibc commit ceb9e56b3d1). This uncovered an issue with the kernel's definition of __NFDBITS if applications #include <linux/types.h> after including <sys/select.h>. A build failure would be seen when passing the -Werror=sign-compare and -D_FORTIFY_SOURCE=2 flags to gcc. It was suggested that the kernel should either match the glibc definition of __NFDBITS or remove that entirely. The current in-kernel uses of __NFDBITS can be replaced with BITS_PER_LONG, and there are no uses of the related __FDELT and __FDMASK defines. Given that, we'll continue the cleanup that was started with commit 8b3d1cda4f5f ("posix_types: Remove fd_set macros") and drop the remaining unused macros. Additionally, linux/time.h has similar macros defined that expand to nothing so we'll remove those at the same time. Reported-by: Jeff Law <law@redhat.com> Suggested-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Josh Boyer <jwboyer@redhat.com> [ .. and fix up whitespace as per akpm ] Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2012-08-09s390/mm: fix fault handling for page table walk caseHeiko Carstens1-6/+7
commit 008c2e8f247f0a8db1e8e26139da12f3a3abcda0 upstream. Make sure the kernel does not incorrectly create a SIGBUS signal during user space accesses: For user space accesses in the switched addressing mode case the kernel may walk page tables and access user address space via the kernel mapping. If a page table entry is invalid the function __handle_fault() gets called in order to emulate a page fault and trigger all the usual actions like paging in a missing page etc. by calling handle_mm_fault(). If handle_mm_fault() returns with an error fixup handling is necessary. For the switched addressing mode case all errors need to be mapped to -EFAULT, so that the calling uaccess function can return -EFAULT to user space. Unfortunately the __handle_fault() incorrectly calls do_sigbus() if VM_FAULT_SIGBUS is set. This however should only happen if a page fault was triggered by a user space instruction. For kernel mode uaccesses the correct action is to only return -EFAULT. So user space may incorrectly see SIGBUS signals because of this bug. For current machines this would only be possible for the switched addressing mode case in conjunction with futex operations. Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2012-08-09s390/mm: downgrade page table after fork of a 31 bit processMartin Schwidefsky4-8/+25
commit 0f6f281b731d20bfe75c13f85d33f3f05b440222 upstream. The downgrade of the 4 level page table created by init_new_context is currently done only in start_thread31. If a 31 bit process forks the new mm uses a 4 level page table, including the task size of 2<<42 that goes along with it. This is incorrect as now a 31 bit process can map memory beyond 2GB. Define arch_dup_mmap to do the downgrade after fork. Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2012-08-09s390/idle: fix sequence handling vs cpu hotplugHeiko Carstens2-3/+2
commit 0008204ffe85d23382d6fd0f971f3f0fbe70bae2 upstream. The s390 idle accounting code uses a sequence counter which gets used when the per cpu idle statistics get updated and read. One assumption on read access is that only when the sequence counter is even and did not change while reading all values the result is valid. On cpu hotplug however the per cpu data structure gets initialized via a cpu hotplug notifier on CPU_ONLINE. CPU_ONLINE however is too late, since the onlined cpu is already running and might access the per cpu data. Worst case is that the data structure gets initialized while an idle thread is updating its idle statistics. This will result in an uneven sequence counter after an update. As a result user space tools like top, which access /proc/stat in order to get idle stats, will busy loop waiting for the sequence counter to become even again, which will never happen until the queried cpu will update its idle statistics again. And even then the sequence counter will only have an even value for a couple of cpu cycles. Fix this by moving the initialization of the per cpu idle statistics to cpu_init(). I prefer that solution in favor of changing the notifier to CPU_UP_PREPARE, which would be a different solution to the problem. Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2012-08-09x86/mce: Fix siginfo_t->si_addr value for non-recoverable memory faultsTony Luck1-2/+4
commit 6751ed65dc6642af64f7b8a440a75563c8aab7ae upstream. In commit dad1743e5993f1 ("x86/mce: Only restart instruction after machine check recovery if it is safe") we fixed mce_notify_process() to force a signal to the current process if it was not restartable (RIPV bit not set in MCG_STATUS). But doing it here means that the process doesn't get told the virtual address of the fault via siginfo_t->si_addr. This would prevent application level recovery from the fault. Make a new MF_MUST_KILL flag bit for memory_failure() et al. to use so that we will provide the right information with the signal. Signed-off-by: Tony Luck <tony.luck@intel.com> Acked-by: Borislav Petkov <borislav.petkov@amd.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2012-08-09ARM: OMAP2+: OPP: Fix to ensure check of right oppdef after bad oneNishanth Menon1-2/+1
commit b110547e586eb5825bc1d04aa9147bff83b57672 upstream. Commit 9fa2df6b90786301b175e264f5fa9846aba81a65 (ARM: OMAP2+: OPP: allow OPP enumeration to continue if device is not present) makes the logic: for (i = 0; i < opp_def_size; i++) { <snip> if (!oh || !oh->od) { <snip> continue; } <snip> opp_def++; } In short, the moment we hit a "Bad OPP", we end up looping the list comparing against the bad opp definition pointer for the rest of the iteration count. Instead, increment opp_def in the for loop itself and allow continue to be used in code without much thought so that we check the next set of OPP definition pointers :) Cc: Steve Sakoman <steve@sakoman.com> Cc: Tony Lindgren <tony@atomide.com> Signed-off-by: Nishanth Menon <nm@ti.com> Signed-off-by: Kevin Hilman <khilman@ti.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2012-08-09powerpc/85xx: use the BRx registers to enable indirect mode on the P1022DSTimur Tabi2-29/+93
commit 6bd825f02966be8ba544047cab313d6032c23819 upstream. In order to enable the DIU video controller on the P1022DS, the FPGA needs to be switched to "indirect mode", where the localbus is disabled and the FPGA is accessed via writes to localbus chip select signals CS0 and CS1. To obtain the address of CS0 and CS1, the platform driver uses an "indirect pixis mode" device tree node. This node assumes that the localbus 'ranges' property is sorted in chip-select order. That is, reg value 0 maps to CS0, reg value 1 maps to CS1, etc. This is how the 'ranges' property is supposed to be arranged. Unfortunately, the 'ranges' property is often mis-arranged, and not just on the P1022DS. Linux normally does not care, since it does not program the localbus. But the indirect-mode code on the P1022DS does care. The "proper" fix is to have U-Boot fix the 'ranges' property, but this would be too cumbersome. The names and 'reg' properties of all the localbus devices would also need to be updated, and determining which localbus device maps to which chip select is board-specific. Instead, we determine the CS0/CS1 base addresses the same way that U-boot does -- by reading the BRx registers directly and mapping them to physical addresses. This code is simpler and more reliable, and it does not require a U-boot or device tree change. Since the indirect pixis device tree node is no longer needed, the node is deleted from the DTS. Signed-off-by: Timur Tabi <timur@freescale.com> Signed-off-by: Kumar Gala <galak@kernel.crashing.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2012-08-09powerpc/eeh: Check handle_eeh_events() return valueKleber Sacilotto de Souza1-2/+4
commit 10db8d212864cb6741df7d7fafda5ab6661f6f88 upstream. Function eeh_event_handler() dereferences the pointer returned by handle_eeh_events() without checking, causing a crash if NULL was returned, which is expected in some situations. This patch fixes this bug by checking for the value returned by handle_eeh_events() before dereferencing it. Signed-off-by: Kleber Sacilotto de Souza <klebers@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2012-08-09powerpc: Add "memory" attribute for mfmsr()Tiejun Chen1-1/+2
commit b416c9a10baae6a177b4f9ee858b8d309542fbef upstream. Add "memory" attribute in inline assembly language as a compiler barrier to make sure 4.6.x GCC don't reorder mfmsr(). Signed-off-by: Tiejun Chen <tiejun.chen@windriver.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>