summaryrefslogtreecommitdiff
path: root/arch/ia64
AgeCommit message (Collapse)AuthorFilesLines
2013-04-30dump_stack: unify debug information printed by show_regs()Tejun Heo1-2/+2
show_regs() is inherently arch-dependent but it does make sense to print generic debug information and some archs already do albeit in slightly different forms. This patch introduces a generic function to print debug information from show_regs() so that different archs print out the same information and it's much easier to modify what's printed. show_regs_print_info() prints out the same debug info as dump_stack() does plus task and thread_info pointers. * Archs which didn't print debug info now do. alpha, arc, blackfin, c6x, cris, frv, h8300, hexagon, ia64, m32r, metag, microblaze, mn10300, openrisc, parisc, score, sh64, sparc, um, xtensa * Already prints debug info. Replaced with show_regs_print_info(). The printed information is superset of what used to be there. arm, arm64, avr32, mips, powerpc, sh32, tile, unicore32, x86 * s390 is special in that it used to print arch-specific information along with generic debug info. Heiko and Martin think that the arch-specific extra isn't worth keeping s390 specfic implementation. Converted to use the generic version. Note that now all archs print the debug info before actual register dumps. An example BUG() dump follows. kernel BUG at /work/os/work/kernel/workqueue.c:4841! invalid opcode: 0000 [#1] PREEMPT SMP DEBUG_PAGEALLOC Modules linked in: CPU: 0 PID: 1 Comm: swapper/0 Not tainted 3.9.0-rc1-work+ #7 Hardware name: empty empty/S3992, BIOS 080011 10/26/2007 task: ffff88007c85e040 ti: ffff88007c860000 task.ti: ffff88007c860000 RIP: 0010:[<ffffffff8234a07e>] [<ffffffff8234a07e>] init_workqueues+0x4/0x6 RSP: 0000:ffff88007c861ec8 EFLAGS: 00010246 RAX: ffff88007c861fd8 RBX: ffffffff824466a8 RCX: 0000000000000001 RDX: 0000000000000046 RSI: 0000000000000001 RDI: ffffffff8234a07a RBP: ffff88007c861ec8 R08: 0000000000000000 R09: 0000000000000000 R10: 0000000000000001 R11: 0000000000000000 R12: ffffffff8234a07a R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000 FS: 0000000000000000(0000) GS:ffff88007dc00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b CR2: ffff88015f7ff000 CR3: 00000000021f1000 CR4: 00000000000007f0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400 Stack: ffff88007c861ef8 ffffffff81000312 ffffffff824466a8 ffff88007c85e650 0000000000000003 0000000000000000 ffff88007c861f38 ffffffff82335e5d ffff88007c862080 ffffffff8223d8c0 ffff88007c862080 ffffffff81c47760 Call Trace: [<ffffffff81000312>] do_one_initcall+0x122/0x170 [<ffffffff82335e5d>] kernel_init_freeable+0x9b/0x1c8 [<ffffffff81c47760>] ? rest_init+0x140/0x140 [<ffffffff81c4776e>] kernel_init+0xe/0xf0 [<ffffffff81c6be9c>] ret_from_fork+0x7c/0xb0 [<ffffffff81c47760>] ? rest_init+0x140/0x140 ... v2: Typo fix in x86-32. v3: CPU number dropped from show_regs_print_info() as dump_stack_print_info() has been updated to print it. s390 specific implementation dropped as requested by s390 maintainers. Signed-off-by: Tejun Heo <tj@kernel.org> Acked-by: David S. Miller <davem@davemloft.net> Acked-by: Jesper Nilsson <jesper.nilsson@axis.com> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: Bjorn Helgaas <bhelgaas@google.com> Cc: Fengguang Wu <fengguang.wu@intel.com> Cc: Mike Frysinger <vapier@gentoo.org> Cc: Vineet Gupta <vgupta@synopsys.com> Cc: Sam Ravnborg <sam@ravnborg.org> Acked-by: Chris Metcalf <cmetcalf@tilera.com> [tile bits] Acked-by: Richard Kuo <rkuo@codeaurora.org> [hexagon bits] Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-04-30dump_stack: implement arch-specific hardware description in task dumpsTejun Heo1-0/+1
x86 and ia64 can acquire extra hardware identification information from DMI and print it along with task dumps; however, the usage isn't consistent. * x86 show_regs() collects vendor, product and board strings and print them out with PID, comm and utsname. Some of the information is printed again later in the same dump. * warn_slowpath_common() explicitly accesses the DMI board and prints it out with "Hardware name:" label. This applies to both x86 and ia64 but is irrelevant on all other archs. * ia64 doesn't show DMI information on other non-WARN dumps. This patch introduces arch-specific hardware description used by dump_stack(). It can be set by calling dump_stack_set_arch_desc() during boot and, if exists, printed out in a separate line with "Hardware name:" label. dmi_set_dump_stack_arch_desc() is added which sets arch-specific description from DMI data. It uses dmi_ids_string[] which is set from dmi_present() used for DMI debug message. It is superset of the information x86 show_regs() is using. The function is called from x86 and ia64 boot code right after dmi_scan_machine(). This makes the explicit DMI handling in warn_slowpath_common() unnecessary. Removed. show_regs() isn't yet converted to use generic debug information printing and this patch doesn't remove the duplicate DMI handling in x86 show_regs(). The next patch will unify show_regs() handling and remove the duplication. An example WARN dump follows. WARNING: at kernel/workqueue.c:4841 init_workqueues+0x35/0x505() Modules linked in: CPU: 0 PID: 1 Comm: swapper/0 Not tainted 3.9.0-rc1-work+ #3 Hardware name: empty empty/S3992, BIOS 080011 10/26/2007 0000000000000009 ffff88007c861e08 ffffffff81c614dc ffff88007c861e48 ffffffff8108f500 ffffffff82228240 0000000000000040 ffffffff8234a08e 0000000000000000 0000000000000000 0000000000000000 ffff88007c861e58 Call Trace: [<ffffffff81c614dc>] dump_stack+0x19/0x1b [<ffffffff8108f500>] warn_slowpath_common+0x70/0xa0 [<ffffffff8108f54a>] warn_slowpath_null+0x1a/0x20 [<ffffffff8234a0c3>] init_workqueues+0x35/0x505 ... v2: Use the same string as the debug message from dmi_present() which also contains BIOS information. Move hardware name into its own line as warn_slowpath_common() did. This change was suggested by Bjorn Helgaas. Signed-off-by: Tejun Heo <tj@kernel.org> Cc: Bjorn Helgaas <bhelgaas@google.com> Cc: David S. Miller <davem@davemloft.net> Cc: Fengguang Wu <fengguang.wu@intel.com> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: Jesper Nilsson <jesper.nilsson@axis.com> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: Mike Frysinger <vapier@gentoo.org> Cc: Vineet Gupta <vgupta@synopsys.com> Cc: Sam Ravnborg <sam@ravnborg.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-04-30dump_stack: consolidate dump_stack() implementations and unify their behaviorsTejun Heo1-8/+0
Both dump_stack() and show_stack() are currently implemented by each architecture. show_stack(NULL, NULL) dumps the backtrace for the current task as does dump_stack(). On some archs, dump_stack() prints extra information - pid, utsname and so on - in addition to the backtrace while the two are identical on other archs. The usages in arch-independent code of the two functions indicate show_stack(NULL, NULL) should print out bare backtrace while dump_stack() is used for debugging purposes when something went wrong, so it does make sense to print additional information on the task which triggered dump_stack(). There's no reason to require archs to implement two separate but mostly identical functions. It leads to unnecessary subtle information. This patch expands the dummy fallback dump_stack() implementation in lib/dump_stack.c such that it prints out debug information (taken from x86) and invokes show_stack(NULL, NULL) and drops arch-specific dump_stack() implementations in all archs except blackfin. Blackfin's dump_stack() does something wonky that I don't understand. Debug information can be printed separately by calling dump_stack_print_info() so that arch-specific dump_stack() implementation can still emit the same debug information. This is used in blackfin. This patch brings the following behavior changes. * On some archs, an extra level in backtrace for show_stack() could be printed. This is because the top frame was determined in dump_stack() on those archs while generic dump_stack() can't do that reliably. It can be compensated by inlining dump_stack() but not sure whether that'd be necessary. * Most archs didn't use to print debug info on dump_stack(). They do now. An example WARN dump follows. WARNING: at kernel/workqueue.c:4841 init_workqueues+0x35/0x505() Hardware name: empty Modules linked in: CPU: 0 PID: 1 Comm: swapper/0 Not tainted 3.9.0-rc1-work+ #9 0000000000000009 ffff88007c861e08 ffffffff81c614dc ffff88007c861e48 ffffffff8108f50f ffffffff82228240 0000000000000040 ffffffff8234a03c 0000000000000000 0000000000000000 0000000000000000 ffff88007c861e58 Call Trace: [<ffffffff81c614dc>] dump_stack+0x19/0x1b [<ffffffff8108f50f>] warn_slowpath_common+0x7f/0xc0 [<ffffffff8108f56a>] warn_slowpath_null+0x1a/0x20 [<ffffffff8234a071>] init_workqueues+0x35/0x505 ... v2: CPU number added to the generic debug info as requested by s390 folks and dropped the s390 specific dump_stack(). This loses %ksp from the debug message which the maintainers think isn't important enough to keep the s390-specific dump_stack() implementation. dump_stack_print_info() is moved to kernel/printk.c from lib/dump_stack.c. Because linkage is per objecct file, dump_stack_print_info() living in the same lib file as generic dump_stack() means that archs which implement custom dump_stack() - at this point, only blackfin - can't use dump_stack_print_info() as that will bring in the generic version of dump_stack() too. v1 The v1 patch broke build on blackfin due to this issue. The build breakage was reported by Fengguang Wu. Signed-off-by: Tejun Heo <tj@kernel.org> Acked-by: David S. Miller <davem@davemloft.net> Acked-by: Vineet Gupta <vgupta@synopsys.com> Acked-by: Jesper Nilsson <jesper.nilsson@axis.com> Acked-by: Vineet Gupta <vgupta@synopsys.com> Acked-by: Martin Schwidefsky <schwidefsky@de.ibm.com> [s390 bits] Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: Mike Frysinger <vapier@gentoo.org> Cc: Fengguang Wu <fengguang.wu@intel.com> Cc: Bjorn Helgaas <bhelgaas@google.com> Cc: Sam Ravnborg <sam@ravnborg.org> Acked-by: Richard Kuo <rkuo@codeaurora.org> [hexagon bits] Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-04-30Merge tag 'pm+acpi-3.10-rc1' of ↵Linus Torvalds5-472/+3
git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm Pull power management and ACPI updates from Rafael J Wysocki: - ARM big.LITTLE cpufreq driver from Viresh Kumar. - exynos5440 cpufreq driver from Amit Daniel Kachhap. - cpufreq core cleanup and code consolidation from Viresh Kumar and Stratos Karafotis. - cpufreq scalability improvement from Nathan Zimmer. - AMD "frequency sensitivity feedback" powersave bias for the ondemand cpufreq governor from Jacob Shin. - cpuidle code consolidation and cleanups from Daniel Lezcano. - ARM OMAP cpuidle fixes from Santosh Shilimkar and Daniel Lezcano. - ACPICA fixes and other improvements from Bob Moore, Jung-uk Kim, Lv Zheng, Yinghai Lu, Tang Chen, Colin Ian King, and Linn Crosetto. - ACPI core updates related to hotplug from Toshi Kani, Paul Bolle, Yasuaki Ishimatsu, and Rafael J Wysocki. - Intel Lynxpoint LPSS (Low-Power Subsystem) support improvements from Rafael J Wysocki and Andy Shevchenko. * tag 'pm+acpi-3.10-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: (192 commits) cpufreq: Revert incorrect commit 5800043 cpufreq: MAINTAINERS: Add co-maintainer cpuidle: add maintainer entry ACPI / thermal: do not always return THERMAL_TREND_RAISING for active trip points ARM: s3c64xx: cpuidle: use init/exit common routine cpufreq: pxa2xx: initialize variables ACPI: video: correct acpi_video_bus_add error processing SH: cpuidle: use init/exit common routine ARM: S5pv210: compiling issue, ARM_S5PV210_CPUFREQ needs CONFIG_CPU_FREQ_TABLE=y ACPI: Fix wrong parameter passed to memblock_reserve cpuidle: fix comment format pnp: use %*phC to dump small buffers isapnp: remove debug leftovers ARM: imx: cpuidle: use init/exit common routine ARM: davinci: cpuidle: use init/exit common routine ARM: kirkwood: cpuidle: use init/exit common routine ARM: calxeda: cpuidle: use init/exit common routine ARM: tegra: cpuidle: use init/exit common routine for tegra3 ARM: tegra: cpuidle: use init/exit common routine for tegra2 ARM: OMAP4: cpuidle: use init/exit common routine ...
2013-04-30Merge branch 'smp-hotplug-for-linus' of ↵Linus Torvalds5-74/+23
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull SMP/hotplug changes from Ingo Molnar: "This is a pretty large, multi-arch series unifying and generalizing the various disjunct pieces of idle routines that architectures have historically copied from each other and have grown in random, wildly inconsistent and sometimes buggy directions: 101 files changed, 455 insertions(+), 1328 deletions(-) this went through a number of review and test iterations before it was committed, it was tested on various architectures, was exposed to linux-next for quite some time - nevertheless it might cause problems on architectures that don't read the mailing lists and don't regularly test linux-next. This cat herding excercise was motivated by the -rt kernel, and was brought to you by Thomas "the Whip" Gleixner." * 'smp-hotplug-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (40 commits) idle: Remove GENERIC_IDLE_LOOP config switch um: Use generic idle loop ia64: Make sure interrupts enabled when we "safe_halt()" sparc: Use generic idle loop idle: Remove unused ARCH_HAS_DEFAULT_IDLE bfin: Fix typo in arch_cpu_idle() xtensa: Use generic idle loop x86: Use generic idle loop unicore: Use generic idle loop tile: Use generic idle loop tile: Enter idle with preemption disabled sh: Use generic idle loop score: Use generic idle loop s390: Use generic idle loop powerpc: Use generic idle loop parisc: Use generic idle loop openrisc: Use generic idle loop mn10300: Use generic idle loop mips: Use generic idle loop microblaze: Use generic idle loop ...
2013-04-30Merge tag 'v3.9' into efi-for-tip2Matt Fleming2-64/+14
Resolve conflicts for Ingo. Conflicts: drivers/firmware/Kconfig drivers/firmware/efivars.c Signed-off-by: Matt Fleming <matt.fleming@intel.com>
2013-04-29kvm: KVM_CAP_IOMMU only available with device assignmentAlex Williamson1-0/+2
Fix build with CONFIG_PCI unset by linking KVM_CAP_IOMMU to device assignment config option. It has no purpose otherwise. Signed-off-by: Alex Williamson <alex.williamson@redhat.com> Reported-by: Randy Dunlap <rdunlap@infradead.org> Acked-by: Randy Dunlap <rdunlap@infradead.org> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2013-04-29mm: speedup in __early_pfn_to_nidRuss Anderson1-1/+14
When booting on a large memory system, the kernel spends considerable time in memmap_init_zone() setting up memory zones. Analysis shows significant time spent in __early_pfn_to_nid(). The routine memmap_init_zone() checks each PFN to verify the nid is valid. __early_pfn_to_nid() sequentially scans the list of pfn ranges to find the right range and returns the nid. This does not scale well. On a 4 TB (single rack) system there are 308 memory ranges to scan. The higher the PFN the more time spent sequentially spinning through memory ranges. Since memmap_init_zone() increments pfn, it will almost always be looking for the same range as the previous pfn, so check that range first. If it is in the same range, return that nid. If not, scan the list as before. A 4 TB (single rack) UV1 system takes 512 seconds to get through the zone code. This performance optimization reduces the time by 189 seconds, a 36% improvement. A 2 TB (single rack) UV2 system goes from 212.7 seconds to 99.8 seconds, a 112.9 second (53%) reduction. [akpm@linux-foundation.org: make the statics __meminitdata] [akpm@linux-foundation.org: fix comment formatting] [akpm@linux-foundation.org: fix ia64, per yinghai] [akpm@linux-foundation.org: add missing semicolon, per Tony] Signed-off-by: Russ Anderson <rja@sgi.com> Cc: David Rientjes <rientjes@google.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: "H. Peter Anvin" <hpa@zytor.com> Tested-by: "Luck, Tony" <tony.luck@intel.com> Cc: Yinghai Lu <yinghai@kernel.org> Cc: Lin Feng <linfeng@cn.fujitsu.com> Cc: KOSAKI Motohiro <kosaki.motohiro@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-04-29sparse-vmemmap: specify vmemmap population range in bytesJohannes Weiner1-4/+3
The sparse code, when asking the architecture to populate the vmemmap, specifies the section range as a starting page and a number of pages. This is an awkward interface, because none of the arch-specific code actually thinks of the range in terms of 'struct page' units and always translates it to bytes first. In addition, later patches mix huge page and regular page backing for the vmemmap. For this, they need to call vmemmap_populate_basepages() on sub-section ranges with PAGE_SIZE and PMD_SIZE in mind. But these are not necessarily multiples of the 'struct page' size and so this unit is too coarse. Just translate the section range into bytes once in the generic sparse code, then pass byte ranges down the stack. Signed-off-by: Johannes Weiner <hannes@cmpxchg.org> Cc: Ben Hutchings <ben@decadent.org.uk> Cc: Bernhard Schmidt <Bernhard.Schmidt@lrz.de> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Russell King <rmk@arm.linux.org.uk> Cc: Ingo Molnar <mingo@elte.hu> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: "Luck, Tony" <tony.luck@intel.com> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Acked-by: David S. Miller <davem@davemloft.net> Tested-by: David S. Miller <davem@davemloft.net> Cc: Wu Fengguang <fengguang.wu@intel.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-04-29mm/hugetlb: add more arch-defined huge_pte functionsGerald Schaefer1-0/+1
Commit abf09bed3cce ("s390/mm: implement software dirty bits") introduced another difference in the pte layout vs. the pmd layout on s390, thoroughly breaking the s390 support for hugetlbfs. This requires replacing some more pte_xxx functions in mm/hugetlbfs.c with a huge_pte_xxx version. This patch introduces those huge_pte_xxx functions and their generic implementation in asm-generic/hugetlb.h, which will now be included on all architectures supporting hugetlbfs apart from s390. This change will be a no-op for those architectures. [akpm@linux-foundation.org: fix warning] Signed-off-by: Gerald Schaefer <gerald.schaefer@de.ibm.com> Cc: Mel Gorman <mgorman@suse.de> Cc: Hugh Dickins <hughd@google.com> Cc: Hillf Danton <dhillf@gmail.com> Acked-by: Michal Hocko <mhocko@suse.cz> [for !s390 parts] Cc: Tony Luck <tony.luck@intel.com> Cc: Fenghua Yu <fenghua.yu@intel.com> Cc: Ralf Baechle <ralf@linux-mips.org> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Paul Mundt <lethal@linux-sh.org> Cc: "David S. Miller" <davem@davemloft.net> Cc: Chris Metcalf <cmetcalf@tilera.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ingo Molnar <mingo@redhat.com> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-04-29mm/IA64: use common help functions to free reserved pagesJiang Liu1-19/+4
Use common help functions to free reserved pages. Signed-off-by: Jiang Liu <jiang.liu@huawei.com> Cc: Tony Luck <tony.luck@intel.com> Cc: Fenghua Yu <fenghua.yu@intel.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-04-29mm, show_mem: suppress page counts in non-blockable contextsDavid Rientjes2-0/+4
On large systems with a lot of memory, walking all RAM to determine page types may take a half second or even more. In non-blockable contexts, the page allocator will emit a page allocation failure warning unless __GFP_NOWARN is specified. In such contexts, irqs are typically disabled and such a lengthy delay may even result in NMI watchdog timeouts. To fix this, suppress the page walk in such contexts when printing the page allocation failure warning. Signed-off-by: David Rientjes <rientjes@google.com> Cc: Mel Gorman <mgorman@suse.de> Acked-by: Michal Hocko <mhocko@suse.cz> Cc: Dave Hansen <dave@linux.vnet.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-04-29ia64: Don't use create_proc_read_entry()David Howells3-348/+301
Don't use create_proc_read_entry() as that is deprecated, but rather use proc_create_data() and seq_file instead. Signed-off-by: David Howells <dhowells@redhat.com> cc: Tony Luck <tony.luck@intel.com> cc: Fenghua Yu <fenghua.yu@intel.com> cc: linux-ia64@vger.kernel.org Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2013-04-29Merge tag 'tty-3.10-rc1' of ↵Linus Torvalds1-13/+3
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty Pull tty/serial driver update from Greg Kroah-Hartman: "Here's the big tty/serial driver merge request for 3.10-rc1 Once again, Jiri has a number of TTY driver fixes and cleanups, and Peter Hurley came through with a bunch of ldisc fixes that resolve a number of reported issues. There are some other serial driver cleanups as well. All of these have been in the linux-next tree for a while" * tag 'tty-3.10-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty: (117 commits) tty/serial/sirf: fix MODULE_DEVICE_TABLE serial: mxs: drop superfluous {get|put}_device serial: mxs: fix buffer overflow ARM: PL011: add support for extended FIFO-size of PL011-r1p5 serial_core.c: add put_device() after device_find_child() tty: Fix unsafe bit ops in tty_throttle_safe/unthrottle_safe serial: sccnxp: Replace pdata.init/exit with regulator API serial: sccnxp: Do not override device name TTY: pty, fix compilation warning TTY: rocket, fix compilation warning TTY: ircomm: fix DTR being raised on hang up TTY: synclinkmp: fix DTR being raised on hang up TTY: synclink_gt: fix DTR being raised on hang up TTY: synclink: fix DTR being raised on hang up serial: 8250_dw: Fix the stub for dw8250_probe_acpi() serial: 8250_dw: Convert to devm_ioremap() serial: 8250_dw: Set port capabilities based on CPR register serial: 8250_dw: Let ACPI code extract the DMA client info serial: 8250_dw: Support clk framework also with ACPI serial: 8250_dw: Enable runtime PM ...
2013-04-29Merge tag 'pci-v3.10-changes' of ↵Linus Torvalds1-0/+11
git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci Pull PCI updates from Bjorn Helgaas: "PCI changes for the v3.10 merge window: PCI device hotplug - Remove ACPI PCI subdrivers (Jiang Liu, Myron Stowe) - Make acpiphp builtin only, not modular (Jiang Liu) - Add acpiphp mutual exclusion (Jiang Liu) Power management - Skip "PME enabled/disabled" messages when not supported (Rafael Wysocki) - Fix fallback to PCI_D0 (Rafael Wysocki) Miscellaneous - Factor quirk_io_region (Yinghai Lu) - Cache MSI capability offsets & cleanup (Gavin Shan, Bjorn Helgaas) - Clean up EISA resource initialization and logging (Bjorn Helgaas) - Fix prototype warnings (Andy Shevchenko, Bjorn Helgaas) - MIPS: Initialize of_node before scanning bus (Gabor Juhos) - Fix pcibios_get_phb_of_node() declaration "weak" annotation (Gabor Juhos) - Add MSI INTX_DISABLE quirks for AR8161/AR8162/etc (Xiong Huang) - Fix aer_inject return values (Prarit Bhargava) - Remove PME/ACPI dependency (Andrew Murray) - Use shared PCI_BUS_NUM() and PCI_DEVID() (Shuah Khan)" * tag 'pci-v3.10-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci: (63 commits) vfio-pci: Use cached MSI/MSI-X capabilities vfio-pci: Use PCI_MSIX_TABLE_BIR, not PCI_MSIX_FLAGS_BIRMASK PCI: Remove "extern" from function declarations PCI: Use PCI_MSIX_TABLE_BIR, not PCI_MSIX_FLAGS_BIRMASK PCI: Drop msi_mask_reg() and remove drivers/pci/msi.h PCI: Use msix_table_size() directly, drop multi_msix_capable() PCI: Drop msix_table_offset_reg() and msix_pba_offset_reg() macros PCI: Drop is_64bit_address() and is_mask_bit_support() macros PCI: Drop msi_data_reg() macro PCI: Drop msi_lower_address_reg() and msi_upper_address_reg() macros PCI: Drop msi_control_reg() macro and use PCI_MSI_FLAGS directly PCI: Use cached MSI/MSI-X offsets from dev, not from msi_desc PCI: Clean up MSI/MSI-X capability #defines PCI: Use cached MSI-X cap while enabling MSI-X PCI: Use cached MSI cap while enabling MSI interrupts PCI: Remove MSI/MSI-X cap check in pci_msi_check_device() PCI: Cache MSI/MSI-X capability offsets in struct pci_dev PCI: Use u8, not int, for PM capability offset [SCSI] megaraid_sas: Use correct #define for MSI-X capability PCI: Remove "extern" from function declarations ...
2013-04-29Merge tag 'please-pull-misc-3.10' of ↵Linus Torvalds13-78/+91
git://git.kernel.org/pub/scm/linux/kernel/git/aegl/linux Pull ia64 fixes from Tony Luck: "Bundle of miscellaneous ia64 fixes for 3.10 merge window." * tag 'please-pull-misc-3.10' of git://git.kernel.org/pub/scm/linux/kernel/git/aegl/linux: Add size restriction to the kdump documentation Fix example error_injection_tool Fix build error for numa_clear_node() under IA64 Fix initialization of CMCI/CMCP interrupts Change "select DMAR" to "select INTEL_IOMMU" Wrong asm register contraints in the kvm implementation Wrong asm register contraints in the futex implementation Remove cast for kmalloc return value Fix kexec oops when iosapic was removed iosapic: fix a minor typo in comments Add WB/UC check for early_ioremap Fix broken fsys_getppid() tiocx: check retval from bus_register()
2013-04-28kvm: Allow build-time configuration of KVM device assignmentAlex Williamson4-8/+14
We hope to at some point deprecate KVM legacy device assignment in favor of VFIO-based assignment. Towards that end, allow legacy device assignment to be deconfigured. Signed-off-by: Alex Williamson <alex.williamson@redhat.com> Reviewed-by: Alexander Graf <agraf@suse.de> Acked-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Gleb Natapov <gleb@redhat.com>
2013-04-28Merge branch 'pm-cpufreq'Rafael J. Wysocki5-472/+3
* pm-cpufreq: (57 commits) cpufreq: MAINTAINERS: Add co-maintainer cpufreq: pxa2xx: initialize variables ARM: S5pv210: compiling issue, ARM_S5PV210_CPUFREQ needs CONFIG_CPU_FREQ_TABLE=y cpufreq: cpu0: Put cpu parent node after using it cpufreq: ARM big LITTLE: Adapt to latest cpufreq updates cpufreq: ARM big LITTLE: put DT nodes after using them cpufreq: Don't call __cpufreq_governor() for drivers without target() cpufreq: exynos5440: Protect OPP search calls with RCU lock cpufreq: dbx500: Round to closest available freq cpufreq: Call __cpufreq_governor() with correct policy->cpus mask cpufreq / intel_pstate: Optimize intel_pstate_set_policy cpufreq: OMAP: instantiate omap-cpufreq as a platform_driver arm: exynos: Enable OPP library support for exynos5440 cpufreq: exynos: Remove error return even if no soc is found cpufreq: exynos: Add cpufreq driver for exynos5440 cpufreq: AMD "frequency sensitivity feedback" powersave bias for ondemand governor cpufreq: ondemand: allow custom powersave_bias_target handler to be registered cpufreq: convert cpufreq_driver to using RCU cpufreq: powerpc/platforms/cell: move cpufreq driver to drivers/cpufreq cpufreq: sparc: move cpufreq driver to drivers/cpufreq ... Conflicts: MAINTAINERS (with commit a8e39c3 from pm-cpuidle) drivers/cpufreq/cpufreq_governor.h (with commit beb0ff3)
2013-04-26KVM: IA64: Carry non-ia64 changes into ia64Alexander Graf3-1/+3
We changed a few things in non-ia64 code paths. This patch blindly applies the changes to the ia64 code as well, hoping it proves useful in case anyone revives the ia64 kvm code. Signed-off-by: Alexander Graf <agraf@suse.de>
2013-04-22Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netDavid S. Miller1-64/+13
Conflicts: drivers/net/ethernet/emulex/benet/be_main.c drivers/net/ethernet/intel/igb/igb_main.c drivers/net/wireless/brcm80211/brcmsmac/mac80211_if.c include/net/scm.h net/batman-adv/routing.c net/ipv4/tcp_input.c The e{uid,gid} --> {uid,gid} credentials fix conflicted with the cleanup in net-next to now pass cred structs around. The be2net driver had a bug fix in 'net' that overlapped with the VLAN interface changes by Patrick McHardy in net-next. An IGB conflict existed because in 'net' the build_skb() support was reverted, and in 'net-next' there was a comment style fix within that code. Several batman-adv conflicts were resolved by making sure that all calls to batadv_is_my_mac() are changed to have a new bat_priv first argument. Eric Dumazet's TS ECR fix in TCP in 'net' conflicted with the F-RTO rewrite in 'net-next', mostly overlapping changes. Thanks to Stephen Rothwell and Antonio Quartulli for help with several of these merge resolutions. Signed-off-by: David S. Miller <davem@davemloft.net>
2013-04-17KVM: ia64: Fix kvm_vm_ioctl_irq_lineYang Zhang1-2/+4
Fix the compile error with kvm_vm_ioctl_irq_line. Signed-off-by: Yang Zhang <yang.z.zhang@Intel.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2013-04-17idle: Remove GENERIC_IDLE_LOOP config switchThomas Gleixner1-1/+0
All archs are converted over. Remove the config switch and the fallback code. Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2013-04-17ia64: Make sure interrupts enabled when we "safe_halt()"Luck, Tony1-0/+1
In commit d166991234347215dc23fc9dc15a63a83a1a54e1 idle: Implement generic idle function Thomas Gleixner cleaned up many things but perturbed some fragile code that was keeping ia64 alive. So we started seeing: WARNING: at kernel/cpu/idle.c:94 cpu_idle_loop+0x360/0x380() and other unpleasantness like system hangs during boot. We really shouldn't ever halt with interrupts disabled. Signed-off-by: Tony Luck <tony.luck@intel.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Rusty Russell <rusty@rustcorp.com.au> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com> Cc: Paul McKenney <paulmck@linux.vnet.ibm.com> Cc: magnus.damm@gmail.com Link: http://lkml.kernel.org/r/516d9a0c26048eae9c@agluck-desk.sc.intel.com Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2013-04-16KVM: Call common update function when ioapic entry changed.Yang Zhang1-6/+0
Both TMR and EOI exit bitmap need to be updated when ioapic changed or vcpu's id/ldr/dfr changed. So use common function instead eoi exit bitmap specific function. Signed-off-by: Yang Zhang <yang.z.zhang@Intel.com> Reviewed-by: Gleb Natapov <gleb@redhat.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2013-04-16Remove GENERIC_GPIO config optionAlexandre Courbot1-3/+0
GENERIC_GPIO has been made equivalent to GPIOLIB in architecture code and all driver code has been switch to depend on GPIOLIB. It is thus safe to have GENERIC_GPIO removed. Signed-off-by: Alexandre Courbot <acourbot@nvidia.com> Acked-by: Linus Walleij <linus.walleij@linaro.org> Acked-by: Grant Likely <grant.likely@secretlab.ca>
2013-04-14Merge 3.9-rc7 intp tty-nextGreg Kroah-Hartman1-64/+13
We want the fixes here. Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-04-12ia64/PCI: Implement pcibios_{add|remove}_bus() hooksJiang Liu1-0/+11
Implement pcibios_{add|remove}_bus() hooks for IA64 platforms. Signed-off-by: Jiang Liu <jiang.liu@huawei.com> Signed-off-by: Yijing Wang <wangyijing@huawei.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Yinghai Lu <yinghai@kernel.org> Cc: "Rafael J. Wysocki" <rafael.j.wysocki@intel.com> Cc: Toshi Kani <toshi.kani@hp.com> Cc: Tony Luck <tony.luck@intel.com> Cc: Fenghua Yu <fenghua.yu@intel.com>
2013-04-10cpufreq: ia64: move cpufreq driver to drivers/cpufreqViresh Kumar5-474/+3
This patch moves cpufreq driver of IA64 architecture to drivers/cpufreq. Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Acked-by: Tony Luck <tony.luck@intel.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2013-04-09Merge branch 'for-linus' of ↵Linus Torvalds1-64/+13
git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs Pull vfs fixes from Al Viro: "A nasty bug in fs/namespace.c caught by Andrey + a couple of less serious unpleasantness - ecryptfs misc device playing hopeless games with try_module_get() and palinfo procfs support being... not quite correctly done, to be polite." * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: mnt: release locks on error path in do_loopback palinfo fixes procfs: add proc_remove_subtree() ecryptfs: close rmmod race
2013-04-09prominfo_proc fixesAl Viro1-30/+12
* check for proc_mkdir() failures * use remove_proc_subtree() Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2013-04-09procfs: new helper - PDE_DATA(inode)Al Viro1-13/+5
The only part of proc_dir_entry the code outside of fs/proc really cares about is PDE(inode)->data. Provide a helper for that; static inline for now, eventually will be moved to fs/proc, along with the knowledge of struct proc_dir_entry layout. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2013-04-09palinfo fixesAl Viro1-64/+13
* check for proc_mkdir() failures * fix buffer overrun - sizeof(format string) is *not* enough to hold sprintf() result. * use proc_remove_subtree(); life's much easier with it Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2013-04-08ia64: Use generic idle loopThomas Gleixner4-72/+23
Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Rusty Russell <rusty@rustcorp.com.au> Cc: Paul McKenney <paulmck@linux.vnet.ibm.com> Cc: Peter Zijlstra <peterz@infradead.org> Reviewed-by: Cc: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com> Cc: Magnus Damm <magnus.damm@gmail.com> Cc: Tony Luck <tony.luck@intel.com> Link: http://lkml.kernel.org/r/20130321215234.406851909@linutronix.de Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2013-04-08arch: Consolidate tsk_is_polling()Thomas Gleixner1-2/+0
Move it to a common place. Preparatory patch for implementing set/clear for the idle need_resched poll implementation. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Rusty Russell <rusty@rustcorp.com.au> Cc: Paul McKenney <paulmck@linux.vnet.ibm.com> Cc: Peter Zijlstra <peterz@infradead.org> Reviewed-by: Cc: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com> Cc: Magnus Damm <magnus.damm@gmail.com> Link: http://lkml.kernel.org/r/20130321215233.446034505@linutronix.de Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2013-04-02Fix build error for numa_clear_node() under IA64Yijing Wang2-3/+7
numa_clear_node() function is not implemented under IA64, it will be called in unmap_cpu_on_node() in mm/memory_hotplug.c. This cause build error under IA64, this patch adds numa_clear_node() in IA64 to fix this problem. [Added __cpuinit notation to numa_clear_node() to keep linker happy -Tony] Signed-off-by: Yijing Wang <wangyijing@huawei.com> Signed-off-by: Tony Luck <tony.luck@intel.com>
2013-04-02Fix initialization of CMCI/CMCP interruptsTony Luck3-13/+33
Back 2010 during a revamp of the irq code some initializations were moved from ia64_mca_init() to ia64_mca_late_init() in commit c75f2aa13f5b268aba369b5dc566088b5194377c Cannot use register_percpu_irq() from ia64_mca_init() But this was hideously wrong. First of all these initializations are now down far too late. Specifically after all the other cpus have been brought up and initialized their own CMC vectors from smp_callin(). Also ia64_mca_late_init() may be called from any cpu so the line: ia64_mca_cmc_vector_setup(); /* Setup vector on BSP */ is generally not executed on the BSP, and so the CMC vector isn't setup at all on that processor. Make use of the arch_early_irq_init() hook to get this code executed at just the right moment: not too early, not too late. Reported-by: Fred Hartnett <fred.hartnett@hp.com> Tested-by: Fred Hartnett <fred.hartnett@hp.com> Cc: stable@kernel.org # v2.6.37+ Signed-off-by: Tony Luck <tony.luck@intel.com>
2013-04-02cpufreq: Notify all policy->cpus in cpufreq_notify_transition()Viresh Kumar1-10/+12
policy->cpus contains all online cpus that have single shared clock line. And their frequencies are always updated together. Many SMP system's cpufreq drivers take care of this in individual drivers but the best place for this code is in cpufreq core. This patch modifies cpufreq_notify_transition() to notify frequency change for all cpus in policy->cpus and hence updates all users of this API. Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Acked-by: Stephen Warren <swarren@nvidia.com> Tested-by: Stephen Warren <swarren@nvidia.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2013-04-01Merge 3.9-rc5 into tty-nextGreg Kroah-Hartman1-4/+1
We need the fixes here as well. Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-04-01Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netDavid S. Miller1-4/+1
Conflicts: net/mac80211/sta_info.c net/wireless/core.h Two minor conflicts in wireless. Overlapping additions of extern declarations in net/wireless/core.h and a bug fix overlapping with the addition of a boolean parameter to __ieee80211_key_free(). Signed-off-by: David S. Miller <davem@davemloft.net>
2013-03-31net: add option to enable error queue packets waking selectKeller, Jacob E1-0/+2
Currently, when a socket receives something on the error queue it only wakes up the socket on select if it is in the "read" list, that is the socket has something to read. It is useful also to wake the socket if it is in the error list, which would enable software to wait on error queue packets without waking up for regular data on the socket. The main use case is for receiving timestamped transmit packets which return the timestamp to the socket via the error queue. This enables an application to select on the socket for the error queue only instead of for the regular traffic. -v2- * Added the SO_SELECT_ERR_QUEUE socket option to every architechture specific file * Modified every socket poll function that checks error queue Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Cc: Jeffrey Kirsher <jeffrey.t.kirsher@intel.com> Cc: Richard Cochran <richardcochran@gmail.com> Cc: Matthew Vick <matthew.vick@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-03-29ia64 idle: delete stale (*idle)() function pointerLen Brown1-4/+1
Commit 3e7fc708eb41 ("ia64 idle: delete pm_idle") in 3.9-rc1 didn't finish the job, leaving an un-initialized reference to (*idle)(). [ Haven't seen a crash from this - but seems like we are just being lucky that "idle" is zero so it does get initialized before we jump to randomland - Len ] Reported-by: Lars-Peter Clausen <lars@metafoo.de> Signed-off-by: Len Brown <len.brown@intel.com> Signed-off-by: Tony Luck <tony.luck@intel.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-03-21Merge 3.9-rc3 into tty-nextGreg Kroah-Hartman1-1/+1
2013-03-21Merge remote-tracking branch 'upstream/master' into queueMarcelo Tosatti2-1/+2
Merge reason: From: Alexander Graf <agraf@suse.de> "Just recently this really important patch got pulled into Linus' tree for 3.9: commit 1674400aaee5b466c595a8fc310488263ce888c7 Author: Anton Blanchard <anton <at> samba.org> Date: Tue Mar 12 01:51:51 2013 +0000 Without that commit, I can not boot my G5, thus I can't run automated tests on it against my queue. Could you please merge kvm/next against linus/master, so that I can base my trees against that?" * upstream/master: (653 commits) PCI: Use ROM images from firmware only if no other ROM source available sparc: remove unused "config BITS" sparc: delete "if !ULTRA_HAS_POPULATION_COUNT" KVM: Fix bounds checking in ioapic indirect register reads (CVE-2013-1798) KVM: x86: Convert MSR_KVM_SYSTEM_TIME to use gfn_to_hva_cache functions (CVE-2013-1797) KVM: x86: fix for buffer overflow in handling of MSR_KVM_SYSTEM_TIME (CVE-2013-1796) arm64: Kconfig.debug: Remove unused CONFIG_DEBUG_ERRORS arm64: Do not select GENERIC_HARDIRQS_NO_DEPRECATED inet: limit length of fragment queue hash table bucket lists qeth: Fix scatter-gather regression qeth: Fix invalid router settings handling qeth: delay feature trace sgy-cts1000: Remove __dev* attributes KVM: x86: fix deadlock in clock-in-progress request handling KVM: allow host header to be included even for !CONFIG_KVM hwmon: (lm75) Fix tcn75 prefix hwmon: (lm75.h) Update header inclusion MAINTAINERS: Remove Mark M. Hoffman xfs: ensure we capture IO errors correctly xfs: fix xfs_iomap_eof_prealloc_initial_size type ... Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2013-03-19Change "select DMAR" to "select INTEL_IOMMU"Paul Bolle1-1/+1
Commit d3f138106b ("iommu: Rename the DMAR and INTR_REMAP config options") changed all references to DMAR in Kconfig files to INTEL_IOMMU (and, likewise, changed the references to CONFIG_DMAR everywhere else to CONFIG_INTEL_IOMMU). That commit missed one "select DMAR" statement in ia64's Kconfig file. Change that one too. Signed-off-by: Paul Bolle <pebolle@tiscali.nl> Signed-off-by: Tony Luck <tony.luck@intel.com>
2013-03-19Wrong asm register contraints in the kvm implementationStephan Schreiber1-1/+1
The Linux Kernel contains some inline assembly source code which has wrong asm register constraints in arch/ia64/kvm/vtlb.c. I observed this on Kernel 3.2.35 but it is also true on the most recent Kernel 3.9-rc1. File arch/ia64/kvm/vtlb.c: u64 guest_vhpt_lookup(u64 iha, u64 *pte) { u64 ret; struct thash_data *data; data = __vtr_lookup(current_vcpu, iha, D_TLB); if (data != NULL) thash_vhpt_insert(current_vcpu, data->page_flags, data->itir, iha, D_TLB); asm volatile ( "rsm psr.ic|psr.i;;" "srlz.d;;" "ld8.s r9=[%1];;" "tnat.nz p6,p7=r9;;" "(p6) mov %0=1;" "(p6) mov r9=r0;" "(p7) extr.u r9=r9,0,53;;" "(p7) mov %0=r0;" "(p7) st8 [%2]=r9;;" "ssm psr.ic;;" "srlz.d;;" "ssm psr.i;;" "srlz.d;;" : "=r"(ret) : "r"(iha), "r"(pte):"memory"); return ret; } The list of output registers is : "=r"(ret) : "r"(iha), "r"(pte):"memory"); The constraint "=r" means that the GCC has to maintain that these vars are in registers and contain valid info when the program flow leaves the assembly block (output registers). But "=r" also means that GCC can put them in registers that are used as input registers. Input registers are iha, pte on the example. If the predicate p7 is true, the 8th assembly instruction "(p7) mov %0=r0;" is the first one which writes to a register which is maintained by the register constraints; it sets %0. %0 means the first register operand; it is ret here. This instruction might overwrite the %2 register (pte) which is needed by the next instruction: "(p7) st8 [%2]=r9;;" Whether it really happens depends on how GCC decides what registers it uses and how it optimizes the code. The attached patch fixes the register operand constraints in arch/ia64/kvm/vtlb.c. The register constraints should be : "=&r"(ret) : "r"(iha), "r"(pte):"memory"); The & means that GCC must not use any of the input registers to place this output register in. This is Debian bug#702639 (http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=702639). The patch is applicable on Kernel 3.9-rc1, 3.2.35 and many other versions. Signed-off-by: Stephan Schreiber <info@fs-driver.org> Cc: stable@vger.kernel.org Signed-off-by: Tony Luck <tony.luck@intel.com>
2013-03-19Wrong asm register contraints in the futex implementationStephan Schreiber1-3/+2
The Linux Kernel contains some inline assembly source code which has wrong asm register constraints in arch/ia64/include/asm/futex.h. I observed this on Kernel 3.2.23 but it is also true on the most recent Kernel 3.9-rc1. File arch/ia64/include/asm/futex.h: static inline int futex_atomic_cmpxchg_inatomic(u32 *uval, u32 __user *uaddr, u32 oldval, u32 newval) { if (!access_ok(VERIFY_WRITE, uaddr, sizeof(u32))) return -EFAULT; { register unsigned long r8 __asm ("r8"); unsigned long prev; __asm__ __volatile__( " mf;; \n" " mov %0=r0 \n" " mov ar.ccv=%4;; \n" "[1:] cmpxchg4.acq %1=[%2],%3,ar.ccv \n" " .xdata4 \"__ex_table\", 1b-., 2f-. \n" "[2:]" : "=r" (r8), "=r" (prev) : "r" (uaddr), "r" (newval), "rO" ((long) (unsigned) oldval) : "memory"); *uval = prev; return r8; } } The list of output registers is : "=r" (r8), "=r" (prev) The constraint "=r" means that the GCC has to maintain that these vars are in registers and contain valid info when the program flow leaves the assembly block (output registers). But "=r" also means that GCC can put them in registers that are used as input registers. Input registers are uaddr, newval, oldval on the example. The second assembly instruction " mov %0=r0 \n" is the first one which writes to a register; it sets %0 to 0. %0 means the first register operand; it is r8 here. (The r0 is read-only and always 0 on the Itanium; it can be used if an immediate zero value is needed.) This instruction might overwrite one of the other registers which are still needed. Whether it really happens depends on how GCC decides what registers it uses and how it optimizes the code. The objdump utility can give us disassembly. The futex_atomic_cmpxchg_inatomic() function is inline, so we have to look for a module that uses the funtion. This is the cmpxchg_futex_value_locked() function in kernel/futex.c: static int cmpxchg_futex_value_locked(u32 *curval, u32 __user *uaddr, u32 uval, u32 newval) { int ret; pagefault_disable(); ret = futex_atomic_cmpxchg_inatomic(curval, uaddr, uval, newval); pagefault_enable(); return ret; } Now the disassembly. At first from the Kernel package 3.2.23 which has been compiled with GCC 4.4, remeber this Kernel seemed to work: objdump -d linux-3.2.23/debian/build/build_ia64_none_mckinley/kernel/futex.o 0000000000000230 <cmpxchg_futex_value_locked>: 230: 0b 18 80 1b 18 21 [MMI] adds r3=3168,r13;; 236: 80 40 0d 00 42 00 adds r8=40,r3 23c: 00 00 04 00 nop.i 0x0;; 240: 0b 50 00 10 10 10 [MMI] ld4 r10=[r8];; 246: 90 08 28 00 42 00 adds r9=1,r10 24c: 00 00 04 00 nop.i 0x0;; 250: 09 00 00 00 01 00 [MMI] nop.m 0x0 256: 00 48 20 20 23 00 st4 [r8]=r9 25c: 00 00 04 00 nop.i 0x0;; 260: 08 10 80 06 00 21 [MMI] adds r2=32,r3 266: 00 00 00 02 00 00 nop.m 0x0 26c: 02 08 f1 52 extr.u r16=r33,0,61 270: 05 40 88 00 08 e0 [MLX] addp4 r8=r34,r0 276: ff ff 0f 00 00 e0 movl r15=0xfffffffbfff;; 27c: f1 f7 ff 65 280: 09 70 00 04 18 10 [MMI] ld8 r14=[r2] 286: 00 00 00 02 00 c0 nop.m 0x0 28c: f0 80 1c d0 cmp.ltu p6,p7=r15,r16;; 290: 08 40 fc 1d 09 3b [MMI] cmp.eq p8,p9=-1,r14 296: 00 00 00 02 00 40 nop.m 0x0 29c: e1 08 2d d0 cmp.ltu p10,p11=r14,r33 2a0: 56 01 10 00 40 10 [BBB] (p10) br.cond.spnt.few 2e0 <cmpxchg_futex_value_locked+0xb0> 2a6: 02 08 00 80 21 03 (p08) br.cond.dpnt.few 2b0 <cmpxchg_futex_value_locked+0x80> 2ac: 40 00 00 41 (p06) br.cond.spnt.few 2e0 <cmpxchg_futex_value_locked+0xb0> 2b0: 0a 00 00 00 22 00 [MMI] mf;; 2b6: 80 00 00 00 42 00 mov r8=r0 2bc: 00 00 04 00 nop.i 0x0 2c0: 0b 00 20 40 2a 04 [MMI] mov.m ar.ccv=r8;; 2c6: 10 1a 85 22 20 00 cmpxchg4.acq r33=[r33],r35,ar.ccv 2cc: 00 00 04 00 nop.i 0x0;; 2d0: 10 00 84 40 90 11 [MIB] st4 [r32]=r33 2d6: 00 00 00 02 00 00 nop.i 0x0 2dc: 20 00 00 40 br.few 2f0 <cmpxchg_futex_value_locked+0xc0> 2e0: 09 40 c8 f9 ff 27 [MMI] mov r8=-14 2e6: 00 00 00 02 00 00 nop.m 0x0 2ec: 00 00 04 00 nop.i 0x0;; 2f0: 0b 58 20 1a 19 21 [MMI] adds r11=3208,r13;; 2f6: 20 01 2c 20 20 00 ld4 r18=[r11] 2fc: 00 00 04 00 nop.i 0x0;; 300: 0b 88 fc 25 3f 23 [MMI] adds r17=-1,r18;; 306: 00 88 2c 20 23 00 st4 [r11]=r17 30c: 00 00 04 00 nop.i 0x0;; 310: 11 00 00 00 01 00 [MIB] nop.m 0x0 316: 00 00 00 02 00 80 nop.i 0x0 31c: 08 00 84 00 br.ret.sptk.many b0;; The lines 2b0: 0a 00 00 00 22 00 [MMI] mf;; 2b6: 80 00 00 00 42 00 mov r8=r0 2bc: 00 00 04 00 nop.i 0x0 2c0: 0b 00 20 40 2a 04 [MMI] mov.m ar.ccv=r8;; 2c6: 10 1a 85 22 20 00 cmpxchg4.acq r33=[r33],r35,ar.ccv 2cc: 00 00 04 00 nop.i 0x0;; are the instructions of the assembly block. The line 2b6: 80 00 00 00 42 00 mov r8=r0 sets the r8 register to 0 and after that 2c0: 0b 00 20 40 2a 04 [MMI] mov.m ar.ccv=r8;; prepares the 'oldvalue' for the cmpxchg but it takes it from r8. This is wrong. What happened here is what I explained above: An input register is overwritten which is still needed. The register operand constraints in futex.h are wrong. (The problem doesn't occur when the Kernel is compiled with GCC 4.6.) The attached patch fixes the register operand constraints in futex.h. The code after patching of it: static inline int futex_atomic_cmpxchg_inatomic(u32 *uval, u32 __user *uaddr, u32 oldval, u32 newval) { if (!access_ok(VERIFY_WRITE, uaddr, sizeof(u32))) return -EFAULT; { register unsigned long r8 __asm ("r8") = 0; unsigned long prev; __asm__ __volatile__( " mf;; \n" " mov ar.ccv=%4;; \n" "[1:] cmpxchg4.acq %1=[%2],%3,ar.ccv \n" " .xdata4 \"__ex_table\", 1b-., 2f-. \n" "[2:]" : "+r" (r8), "=&r" (prev) : "r" (uaddr), "r" (newval), "rO" ((long) (unsigned) oldval) : "memory"); *uval = prev; return r8; } } I also initialized the 'r8' var with the C programming language. The _asm qualifier on the definition of the 'r8' var forces GCC to use the r8 processor register for it. I don't believe that we should use inline assembly for zeroing out a local variable. The constraint is "+r" (r8) what means that it is both an input register and an output register. Note that the page fault handler will modify the r8 register which will be the return value of the function. The real fix is "=&r" (prev) The & means that GCC must not use any of the input registers to place this output register in. Patched the Kernel 3.2.23 and compiled it with GCC4.4: 0000000000000230 <cmpxchg_futex_value_locked>: 230: 0b 18 80 1b 18 21 [MMI] adds r3=3168,r13;; 236: 80 40 0d 00 42 00 adds r8=40,r3 23c: 00 00 04 00 nop.i 0x0;; 240: 0b 50 00 10 10 10 [MMI] ld4 r10=[r8];; 246: 90 08 28 00 42 00 adds r9=1,r10 24c: 00 00 04 00 nop.i 0x0;; 250: 09 00 00 00 01 00 [MMI] nop.m 0x0 256: 00 48 20 20 23 00 st4 [r8]=r9 25c: 00 00 04 00 nop.i 0x0;; 260: 08 10 80 06 00 21 [MMI] adds r2=32,r3 266: 20 12 01 10 40 00 addp4 r34=r34,r0 26c: 02 08 f1 52 extr.u r16=r33,0,61 270: 05 40 00 00 00 e1 [MLX] mov r8=r0 276: ff ff 0f 00 00 e0 movl r15=0xfffffffbfff;; 27c: f1 f7 ff 65 280: 09 70 00 04 18 10 [MMI] ld8 r14=[r2] 286: 00 00 00 02 00 c0 nop.m 0x0 28c: f0 80 1c d0 cmp.ltu p6,p7=r15,r16;; 290: 08 40 fc 1d 09 3b [MMI] cmp.eq p8,p9=-1,r14 296: 00 00 00 02 00 40 nop.m 0x0 29c: e1 08 2d d0 cmp.ltu p10,p11=r14,r33 2a0: 56 01 10 00 40 10 [BBB] (p10) br.cond.spnt.few 2e0 <cmpxchg_futex_value_locked+0xb0> 2a6: 02 08 00 80 21 03 (p08) br.cond.dpnt.few 2b0 <cmpxchg_futex_value_locked+0x80> 2ac: 40 00 00 41 (p06) br.cond.spnt.few 2e0 <cmpxchg_futex_value_locked+0xb0> 2b0: 0b 00 00 00 22 00 [MMI] mf;; 2b6: 00 10 81 54 08 00 mov.m ar.ccv=r34 2bc: 00 00 04 00 nop.i 0x0;; 2c0: 09 58 8c 42 11 10 [MMI] cmpxchg4.acq r11=[r33],r35,ar.ccv 2c6: 00 00 00 02 00 00 nop.m 0x0 2cc: 00 00 04 00 nop.i 0x0;; 2d0: 10 00 2c 40 90 11 [MIB] st4 [r32]=r11 2d6: 00 00 00 02 00 00 nop.i 0x0 2dc: 20 00 00 40 br.few 2f0 <cmpxchg_futex_value_locked+0xc0> 2e0: 09 40 c8 f9 ff 27 [MMI] mov r8=-14 2e6: 00 00 00 02 00 00 nop.m 0x0 2ec: 00 00 04 00 nop.i 0x0;; 2f0: 0b 88 20 1a 19 21 [MMI] adds r17=3208,r13;; 2f6: 30 01 44 20 20 00 ld4 r19=[r17] 2fc: 00 00 04 00 nop.i 0x0;; 300: 0b 90 fc 27 3f 23 [MMI] adds r18=-1,r19;; 306: 00 90 44 20 23 00 st4 [r17]=r18 30c: 00 00 04 00 nop.i 0x0;; 310: 11 00 00 00 01 00 [MIB] nop.m 0x0 316: 00 00 00 02 00 80 nop.i 0x0 31c: 08 00 84 00 br.ret.sptk.many b0;; Much better. There is a 270: 05 40 00 00 00 e1 [MLX] mov r8=r0 which was generated by C code r8 = 0. Below 2b6: 00 10 81 54 08 00 mov.m ar.ccv=r34 what means that oldval is no longer overwritten. This is Debian bug#702641 (http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=702641). The patch is applicable on Kernel 3.9-rc1, 3.2.23 and many other versions. Signed-off-by: Stephan Schreiber <info@fs-driver.org> Cc: stable@vger.kernel.org Signed-off-by: Tony Luck <tony.luck@intel.com>
2013-03-19Remove cast for kmalloc return valueZhang Yanfei1-1/+1
remove cast for kmalloc return value. Signed-off-by: Zhang Yanfei <zhangyanfei@cn.fujitsu.com> Signed-off-by: Tony Luck <tony.luck@intel.com>
2013-03-19Fix kexec oops when iosapic was removedHanjun Guo1-1/+31
Iosapic hotplug was supported in IA64 code, but will lead to kexec oops when iosapic was removed. here is the code logic: iosapic_remove iosapic_free memset(&iosapic_lists[index], 0, sizeof(iosapic_lists[0])) iosapic_lists[index].addr was set to 0; and then kexec a new kernel kexec_disable_iosapic iosapic_write(rte->iosapic,..) __iosapic_write(iosapic->addr, reg, val); addr was set to 0 when iosapic_remove, and oops happened The call trace is: Starting new kernel kexec[11336]: Oops 8804682956800 [1] Modules linked in: raw(N) ipv6(N) acpi_cpufreq(N) binfmt_misc(N) fuse(N) nls_iso 8859_1(N) loop(N) ipmi_si(N) ipmi_devintf(N) ipmi_msghandler(N) mca_ereport(N) s csi_ereport(N) nic_ereport(N) pcie_ereport(N) err_transport(N) nvlist(PN) dm_mod (N) tpm_tis(N) tpm(N) ppdev(N) tpm_bios(N) serio_raw(N) i2c_i801(N) iTCO_wdt(N) i2c_core(N) iTCO_vendor_support(N) sg(N) ioatdma(N) igb(N) mptctl(N) dca(N) parp ort_pc(N) parport(N) container(N) button(N) usbhid(N) hid(N) uhci_hcd(N) ehci_hc d(N) usbcore(N) sd_mod(N) crc_t10dif(N) ext3(N) mbcache(N) jbd(N) fan(N) process or(N) ide_pci_generic(N) ide_core(N) ata_piix(N) libata(N) mptsas(N) mptscsih(N) mptbase(N) scsi_transport_sas(N) scsi_mod(N) thermal(N) thermal_sys(N) hwmon(N) Supported: Yes, External Pid: 11336, CPU 0, comm: kexec psr : 0000101009522030 ifs : 8000000000000791 ip : [<a00000010004c160>] Tain ted: P N (2.6.32.12_RAS_V1R3C00B011) ip is at kexec_disable_iosapic+0x120/0x1e0 unat: 0000000000000000 pfs : 0000000000000791 rsc : 0000000000000003 rnat: 0000000000000000 bsps: 0000000000000000 pr : 65519aa6a555a659 ldrs: 0000000000000000 ccv : 00000000ea3cf51e fpsr: 0009804c8a70033f csd : 0000000000000000 ssd : 0000000000000000 b0 : a00000010004c150 b6 : a000000100012620 b7 : a00000010000cda0 f6 : 000000000000000000000 f7 : 1003e0000000002000000 f8 : 1003e0000000050000003 f9 : 1003e0000028fb97183cd f10 : 1003ee9f380df3c548b67 f11 : 1003e00000000000000cc r1 : a0000001016cf660 r2 : 0000000000000000 r3 : 0000000000000000 r8 : 0000001009526030 r9 : a000000100012620 r10 : e00000010053f600 r11 : c0000000fec34040 r12 : e00000078f76fd30 r13 : e00000078f760000 r14 : 0000000000000000 r15 : 0000000000000000 r16 : 0000000000000000 r17 : 0000000000000000 r18 : 0000000000007fff r19 : 0000000000000000 r20 : 0000000000000000 r21 : e00000010053f590 r22 : a000000100cf0000 r23 : 0000000000000036 r24 : e0000007002f8a84 r25 : 0000000000000022 r26 : e0000007002f8a88 r27 : 0000000000000020 r28 : 0000000000000002 r29 : a0000001012c8c60 r30 : 0000000000000000 r31 : 0000000000322e49 Call Trace: [<a000000100018ca0>] show_stack+0x80/0xa0 sp=e00000078f76f8f0 bsp=e00000078f761380 [<a000000100019300>] show_regs+0x640/0x920 sp=e00000078f76fac0 bsp=e00000078f761328 [<a00000010002a130>] die+0x190/0x2e0 sp=e00000078f76fad0 bsp=e00000078f7612e8 [<a000000100922fa0>] ia64_do_page_fault+0x840/0xb20 sp=e00000078f76fad0 bsp=e00000078f761288 [<a00000010000d5c0>] ia64_native_leave_kernel+0x0/0x270 sp=e00000078f76fb60 bsp=e00000078f761288 [<a00000010004c160>] kexec_disable_iosapic+0x120/0x1e0 sp=e00000078f76fd30 bsp=e00000078f761200 [<a000000100016970>] machine_shutdown+0x110/0x140 sp=e00000078f76fd30 bsp=e00000078f7611c8 [<a000000100133530>] kernel_kexec+0xd0/0x120 sp=e00000078f76fd30 bsp=e00000078f7611a0 [<a0000001000eca40>] sys_reboot+0x480/0x4e0 sp=e00000078f76fd30 bsp=e00000078f761128 [<a00000010000d420>] ia64_ret_from_syscall+0x0/0x20 sp=e00000078f76fe30 bsp=e00000078f761120 Kernel panic - not syncing: Fatal exception With Tony and Toshi's advice, the patch removes the "rte" from rte_list when the iosapic was removed. Signed-off-by: Hanjun Guo <guohanjun@huawei.com> Signed-off-by: Jianguo Wu <wujianguo@huawei.com> Acked-by: Toshi Kani <toshi.kani@hp.com> Signed-off-by: Tony Luck <tony.luck@intel.com>
2013-03-19iosapic: fix a minor typo in commentsHanjun Guo1-1/+1
describeinterrupts -> describe interrupts Signed-off-by: Hanjun Guo <guohanjun@huawei.com> Signed-off-by: Tony Luck <tony.luck@intel.com>
2013-03-19Add WB/UC check for early_ioremapLi, Zhen-Hua1-5/+9
On ia64 system, the function early_ioremap returned an uncached memory reference without checking whether this was consistent with existing mappings. This causes efi error and the kernel failed during boot. Add a check to test whether memory has EFI_MEMORY_WB set. Use the function kern_mem_attribute() in early_iomap() function to provide appropriate cacheable or uncacheable mapped address. See the document Documentation/ia64/aliasing.txt for more details. Signed-off-by: Li, Zhen-Hua <zhen-hual@hp.com> Signed-off-by: Tony Luck <tony.luck@intel.com>