2007-08-12Linux 2.6.23-rc3v2.6.23-rc3Linus Torvalds1-1/+1
* git:// sched: run_rebalance_domains: s/SCHED_IDLE/CPU_IDLE/ sched: fix sleeper bonus sched: make global code static
2007-08-12genirq: mark io_apic level interrupts to avoid resendThomas Gleixner2-4/+10
Level type interrupts do not need to be resent. It was also found that some chipsets get confused in case of the resend. Mark the ioapic level type interrupts as such to avoid the resend functionality in the generic irq code. Signed-off-by: Thomas Gleixner <> Signed-off-by: Linus Torvalds <>
2007-08-12genirq: suppress resend of level interruptsThomas Gleixner1-1/+6
Level type interrupts are resent by the interrupt hardware when they are still active at irq_enable(). Suppress the resend mechanism for interrupts marked as level. Signed-off-by: Thomas Gleixner <> Signed-off-by: Linus Torvalds <>
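The rule in this commit message can be modelled in a few lines of standalone C. This is only a sketch under stated assumptions, not the kernel's generic irq code: `struct irq_model`, its fields, and `should_resend()` are hypothetical names invented for illustration.

```c
#include <stdbool.h>

/* Hypothetical, simplified model of an interrupt descriptor. */
struct irq_model {
    bool level_triggered; /* interrupt is marked as level type */
    bool pending;         /* an event arrived while the line was disabled */
};

/*
 * Decide whether software must replay an interrupt when it is re-enabled.
 * A level interrupt is re-asserted by the hardware for as long as the
 * line stays active, so only non-level (edge) interrupts need a resend.
 */
static bool should_resend(const struct irq_model *irq)
{
    if (irq->level_triggered)
        return false;
    return irq->pending;
}
```

In this model a pending edge interrupt is replayed, while a pending level interrupt is left to the hardware.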
2007-08-12genirq: cleanup mismerge artifactThomas Gleixner1-4/+1
Commit 5a43a066b11ac2fe84cf67307f20b83bea390f83: "genirq: Allow fasteoi handler to retrigger disabled interrupts" was erroneously applied to handle_level_irq(). This added the irq retrigger / resend functionality to the level irq handler. Revert the offending bits. Signed-off-by: Thomas Gleixner <> Signed-off-by: Linus Torvalds <>
2007-08-12sched: run_rebalance_domains: s/SCHED_IDLE/CPU_IDLE/Oleg Nesterov1-1/+1
rebalance_domains(SCHED_IDLE) looks strange (typo), change it to CPU_IDLE. The effect of this bug was slightly more aggressive idle-balancing on SMP than intended. Signed-off-by: Oleg Nesterov <> Signed-off-by: Ingo Molnar <>
2007-08-12sched: fix sleeper bonusIngo Molnar1-6/+6
Peter Zijlstra noticed that the sleeper bonus deduction code was not properly rate-limited: a task that scheduled more frequently would get a disproportionately large deduction. So limit the deduction to delta_exec. Signed-off-by: Ingo Molnar <>
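The rate limit amounts to capping each deduction at the time the task actually ran. A minimal sketch in standalone C, assuming both quantities are unsigned nanosecond counts; `limited_deduction()` and its parameter names are hypothetical and are not taken from the scheduler source.

```c
#include <stdint.h>

/*
 * Cap the sleeper bonus deducted in one accounting step at delta_exec,
 * the time executed since the previous step. A task that is accounted
 * more often has smaller delta_exec values, so its total deduction no
 * longer grows with the number of accounting steps.
 */
static uint64_t limited_deduction(uint64_t sleeper_bonus, uint64_t delta_exec)
{
    return sleeper_bonus < delta_exec ? sleeper_bonus : delta_exec;
}
```

With an outstanding bonus of 100 and a step of 30, only 30 is deducted; with a bonus of 10 and the same step, the whole 10 is deducted.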
2007-08-12sched: make global code staticAdrian Bunk2-25/+23
This patch makes the following needlessly global code static: - arch_reinit_sched_domains() - struct attr_sched_mc_power_savings - struct attr_sched_smt_power_savings Signed-off-by: Adrian Bunk <> Signed-off-by: Andrew Morton <> Signed-off-by: Ingo Molnar <>
2007-08-12i386: Fix broken mmiocfg accessesLinus Torvalds1-3/+3
Commit 3320ad994afb2c44ad34b3b34c3c5cf0da297331 broke mmio config space accesses totally on i386 - it dropped the "reg" offset to the address. Cc: dean gaudet <> Cc: Andi Kleen <> Signed-off-by: Linus Torvalds <>
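To see why dropping "reg" is fatal, it may help to spell out how a memory-mapped config address is formed. The sketch below uses the standard PCI Express layout (bus in bits 20 and up, devfn in bits 12 to 19, register offset in the low 12 bits); `mmcfg_address()` and the base value in the usage note are assumptions for illustration, not the kernel's mmconfig code.

```c
#include <stdint.h>

/*
 * Compute the address of one config register in a memory-mapped
 * config window. Each function owns a 4 KiB page; "reg" selects the
 * register inside that page. Without the final "+ reg", every access
 * would hit offset 0 of the page.
 */
static uint64_t mmcfg_address(uint64_t base, unsigned int bus,
                              unsigned int devfn, unsigned int reg)
{
    uint64_t page = base + ((uint64_t)bus << 20) + ((uint64_t)devfn << 12);

    return page + (reg & 0xfff);
}
```

For a hypothetical base of 0xe0000000, bus 1, devfn 0x08 and register 0x10, this yields 0xe0108010.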
2007-08-12Do not replace whole memcpy in apply alternativesPetr Vandrovec1-1/+3
apply_alternatives uses memcpy() to apply alternatives. This has the unfortunate effect that, while applying the memcpy alternative to memcpy itself, it tries to overwrite itself with nops, which causes a #UD fault as it overwrites half of an instruction in the copy loop. From this point on the only possible outcome is a triple fault and reboot. So let's overwrite only the first two instructions of memcpy - as long as the main memcpy loop is not in the first two bytes it will work fine. Signed-off-by: Petr Vandrovec <> Signed-off-by: Linus Torvalds <>
2007-08-12ACPI: thermal: add DMI hooks to handle AOpen's broken Award BIOSLen Brown1-0/+65
Use DMI to: 1. enable polling (BIOS thermal events are broken) 2. disable active trip points (BIOS fan control is broken) 3. disable passive trip point (BIOS hard-codes it too low) The actual temperature reading does work, and with the aid of polling, the critical trip point should work too. Signed-off-by: Len Brown <>
2007-08-12ACPI: thermal: create "thermal.act=" to disable or override active trip pointLen Brown2-4/+34
thermal.act=-1 disables all active trip points in all ACPI thermal zones. thermal.act=C, where C > 0, overrides all lowest temperature active trip points in all thermal zones to C degrees Celsius. Raising this trip-point may allow you to keep your system silent up to a higher temperature. However, it will not allow you to raise the lowest temperature trip point above the next higher trip point (if there is one). Lowering this trip point may kick in the fan sooner. Note that overriding this trip-point will disable any BIOS attempts to implement hysteresis around the lowest temperature trip point. This may result in the fan starting and stopping frequently if temperature frequently crosses C. WARNING: raising trip points above the manufacturer's defaults may cause the system to run at higher temperature and shorten its life. Signed-off-by: Len Brown <>
2007-08-12ACPI: thermal: create "thermal.nocrt" to disable critical actionsLen Brown2-6/+16
thermal.nocrt=1 disables actions on _CRT and _HOT ACPI thermal zone trip-points. They will be marked as <disabled> in /proc/acpi/thermal_zone/*/trip_points. There are two cases where this option is used: 1. Debugging a hot system crossing a valid trip point. If your system fan is spinning at full speed, be sure that the vent is not clogged with dust. Many laptops have very fine thermal fins that are easily blocked. Check that the processor fan-sink is properly seated, has the proper thermal grease, and is really spinning. Check for fan related options in BIOS SETUP. Sometimes there is a performance vs quiet option. Defaults are generally the most conservative. If your fan is not spinning, yet /proc/acpi/fan/ has files in it, please file a Linux/ACPI bug. WARNING: you risk shortening the lifetime of your hardware if you use this parameter on a hot system. Note that this refers to all system components, including the disk drive. 2. Working around a cool system crossing a critical trip point due to an erroneous temperature reading. Try again with CONFIG_HWMON=n. There is known potential for conflict between the hwmon sub-system and the ACPI BIOS. If this fixes it, notify and Otherwise, file a Linux/ACPI bug, or notify just Signed-off-by: Len Brown <>
2007-08-12ACPI: thermal: create "thermal.psv=" to override passive trip pointsLen Brown2-3/+18
"thermal.psv=-1" disables passive trip points for all ACPI thermal zones. "thermal.psv=C", where 'C' is degrees Celsius, overrides all existing passive trip points for all ACPI thermal zones. thermal.psv is checked at module load time, and in response to trip-point change events. Note that if the system does not deliver thermal zone temperature change events near the new trip-point, then it will not be noticed. To force your custom trip point to be noticed, you may need to enable polling: eg. thermal.tzp=3000 invokes polling every 5 minutes. Note that once passive thermal throttling is invoked, it has its own internal Thermal Sampling Period (_TSP), that is unrelated to _TZP. WARNING: disabling or raising a thermal trip point may result in increased running temperature and shorter hardware lifetime on some systems. Signed-off-by: Len Brown <>
2007-08-12ACPI: thermal: expose "thermal.tzp=" to set global polling frequencyLen Brown2-1/+6
Thermal Zone Polling frequency (_TZP) is an optional ACPI object recommending the rate that the OS should poll the associated thermal zone. If _TZP is 0, no polling should be used. If _TZP is non-zero, then the platform recommends that the OS poll the thermal zone at the specified rate. The minimum period is 30 seconds. The maximum period is 5 minutes. (note _TZP and thermal.tzp units are in deci-seconds, so _TZP = 300 corresponds to 30 seconds) If _TZP is not present, ACPI 3.0b recommends that the thermal zone be polled at an "OS provided default frequency". However, common industry practice is: 1. The BIOS never specifies any _TZP 2. High volume OS's from this century never poll any thermal zones Ie. The OS depends on the platform's ability to provoke thermal events when necessary, and the "OS provided default frequency" is "never":-) There is a proposal that ACPI 4.0 be updated to reflect common industry practice -- ie. no _TZP, no polling. The Linux kernel already follows this practice -- thermal zones are not polled unless _TZP is present and non-zero. But thermal zone polling is useful as a workaround for systems which have ACPI thermal control, but have an issue preventing thermal events. Indeed, some Linux distributions still set a non-zero thermal polling frequency for this reason. But rather than ask the user to write a polling frequency into all the /proc/acpi/thermal_zone/*/polling_frequency files, here we simply document and expose the already existing module parameter to do the same at system level, to simplify debugging those broken platforms. Note that thermal.tzp is a module-load time parameter only. Signed-off-by: Len Brown <>
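Because _TZP and thermal.tzp are expressed in deci-seconds, the unit conversion is easy to get wrong. The helpers below are a small illustrative sketch in standalone C; the function names are hypothetical and do not appear in the ACPI thermal driver.

```c
#include <stdbool.h>
#include <stdint.h>

/* A value of 0 means "do not poll". */
static bool tzp_polling_enabled(uint32_t tzp)
{
    return tzp != 0;
}

/* One deci-second is a tenth of a second. */
static uint32_t tzp_to_seconds(uint32_t tzp)
{
    return tzp / 10;
}

/* One deci-second is 100 milliseconds. */
static uint32_t tzp_to_msecs(uint32_t tzp)
{
    return tzp * 100;
}
```

So thermal.tzp=300 corresponds to a 30 second period, and thermal.tzp=3000 to 300 seconds, that is 5 minutes.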
2007-08-12ACPI: thermal: create "" to disable ACPI thermal supportLen Brown2-1/+11
"" disables all ACPI thermal support at boot time. CONFIG_ACPI_THERMAL=n can do this at build time. "# rmmod thermal" can do this at run time, as long as thermal is built as a module. WARNING: On some systems, disabling ACPI thermal support will cause the system to run hotter and reduce the lifetime of the hardware. Signed-off-by: Len Brown <>
2007-08-11ACPI: thinkpad-acpi: fix sysfs paths in documentationHenrique de Moraes Holschuh1-2/+2
The documentation used "thinkpad-acpi" to refer to the directories in sysfs, while it should have been using "thinkpad_acpi". Thanks to Hugh Dickins for the error report. I wish I could just call the module and everything else by the proper name with the "-", instead of using these ugly translations to "_". Signed-off-by: Henrique de Moraes Holschuh <> Cc: Hugh Dickins <> Signed-off-by: Len Brown <>
2007-08-11ACPI: staticAdrian Bunk1-1/+1
Make the needlessly global "acpi_event_seqnum" static. Signed-off-by: Adrian Bunk <> Signed-off-by: Andrew Morton <> Signed-off-by: Len Brown <>
2007-08-11ACPI EC: remove potential deadlock from ECAlexey Starikovskiy1-2/+0
Signed-off-by: Alexey Starikovskiy <> Signed-off-by: Andrew Morton <> Signed-off-by: Len Brown <>
2007-08-11ACPI: dock: Send key=value pair instead of plain valueHolger Macht1-3/+3
Send key=value pair along with the uevent instead of a plain value so that userspace (udev) can handle it like common environment variables. Signed-off-by: Holger Macht <> Acked-by: Kristen Carlson Accardi <> Cc: Stephan Berberig <> Signed-off-by: Andrew Morton <> Acked-by: Greg Kroah-Hartman <> Signed-off-by: Len Brown <>
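The difference between a plain value and a key=value pair can be sketched in standalone C as follows. The key name "EVENT" and the helper `format_dock_event()` are assumptions made for illustration; the commit message does not state which key the dock driver uses.

```c
#include <stdio.h>

/*
 * Format an event number as a KEY=value string, the same shape as an
 * environment variable, so that userspace can parse it generically.
 * No trailing newline is emitted. Returns the snprintf() result.
 */
static int format_dock_event(char *buf, size_t len, unsigned int event)
{
    return snprintf(buf, len, "EVENT=%u", event);
}
```

Calling `format_dock_event(buf, sizeof(buf), 1)` fills `buf` with `EVENT=1` instead of a bare `1`.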
2007-08-11ACPI: bay: send envp with uevent - fixStephan Berberig1-1/+1
There must not be a new-line character in the uevent. Otherwise, udev gets confused. Thanks to Kay Sievers for pointing it out. Signed-off-by: Stephan Berberig <> Cc: Kristen Carlson Accardi <> Signed-off-by: Andrew Morton <> Acked-by: Greg Kroah-Hartman <> Signed-off-by: Len Brown <>
2007-08-11i386: Fix double fault handlerChuck Ebbert1-6/+7
The new percpu code has apparently broken the doublefault handler when CONFIG_DEBUG_SPINLOCK is set. Doublefault is handled by a hardware task, making the check SPIN_BUG_ON(lock->owner == current, lock, "recursion"); fault because it uses the FS register to access the percpu data for current, and that register is zero in the new TSS. (The trace I saw was on 2.6.20 where it was GS, but it looks like this will still happen with FS on 2.6.22.) Initializing FS in the doublefault_tss should fix it. AK: Also fix broken ptr_ok() and turn printks into KERN_EMERG AK: And add a PANIC prefix to make clear the system will hang AK: (e.g. x86-64 will recover) Signed-off-by: Chuck Ebbert <> Signed-off-by: Andi Kleen <> Signed-off-by: Linus Torvalds <>
2007-08-11i386: Fix start_kernel warningAndi Kleen1-3/+1
Fix WARNING: vmlinux.o(.text+0x183): Section mismatch: reference to .init.text:start_kernel (between 'is386' and 'check_x87') Signed-off-by: Andi Kleen <> Signed-off-by: Linus Torvalds <>
2007-08-11x86_64: in arch/x86_64/vdso/.gitignorePete Zaitcev1-0/+1
Create arch/x86_64/vdso/.gitignore and put into it. Signed-off-by: Pete Zaitcev <> Signed-off-by: Andi Kleen <> Signed-off-by: Linus Torvalds <>
2007-08-11i386: Add warning in Documentation that zero-page is not a stable ABIAndi Kleen1-0/+10
Some people writing boot loaders seem to falsely believe the 32bit zero page is a stable interface for out of tree code like the real mode boot protocol. Add a comment clarifying that is not true. Signed-off-by: Andi Kleen <> Signed-off-by: Linus Torvalds <>
2007-08-11i386: Use global flag to disable broken local apic timer on AMD CPUs.Andi Kleen4-8/+13
The Averatec 2370 and some other Turion laptop BIOSes seem to program the ENABLE_C1E MSR inconsistently between cores. This confuses the lapic use heuristics because when C1E is enabled anywhere it seems to affect the complete chip. Use a global flag instead of a per cpu flag to handle this. If any CPU has C1E enabled, disable lapic use. Thanks to Cal Peake for debugging. Cc: Signed-off-by: Andi Kleen <> Signed-off-by: Linus Torvalds <>
2007-08-11i386: really stop MCEs during code patchingAdrian Bunk1-2/+2
It's CONFIG_X86_MCE, not CONFIG_MCE. Signed-off-by: Adrian Bunk <> Signed-off-by: Andi Kleen <> Signed-off-by: Linus Torvalds <>
2007-08-11x86_64: Early segment setup for VTZachary Amsden1-0/+7
VT is very picky about when it can enter execution. Get all segments setup and get LDT and TR into valid state to allow VT execution under VMware and KVM (untested). This makes the boot decompression run under VT, which makes it several orders of magnitude faster on 64-bit Intel hardware. Before, I was seeing times up to a minute or more to decompress a 1.3MB kernel on a very fast box. Signed-off-by: Zachary Amsden <> Signed-off-by: Andi Kleen <> Signed-off-by: Linus Torvalds <>
2007-08-11i386: Make patching more robust, fix paravirt issueAndi Kleen6-67/+90
Commit 19d36ccdc34f5ed444f8a6af0cbfdb6790eb1177 "x86: Fix alternatives and kprobes to remap write-protected kernel text" uses code which is being patched for patching. In particular, paravirt_ops does patching in two stages: first it calls paravirt_ops.patch, then it fills any remaining instructions with nop_out(). nop_out calls text_poke() which calls lookup_address() which calls pgd_val() (aka paravirt_ops.pgd_val): that call site is one of the places we patch. If we always do patching as one single call to text_poke(), we only need make sure we're not patching the memcpy in text_poke itself. This means the prototype to paravirt_ops.patch needs to change, to marshal the new code into a buffer rather than patching in place as it does now. It also means all patching goes through text_poke(), which is known to be safe (apply_alternatives is also changed to make a single patch). AK: fix compilation on x86-64 (bad rusty!) AK: fix boot on x86-64 (sigh) AK: merged with other patches Signed-off-by: Rusty Russell <> Signed-off-by: Andi Kleen <> Signed-off-by: Linus Torvalds <>
2007-08-11x86: Disable CLFLUSH support againAndi Kleen2-2/+3
It turns out CLFLUSH support is still not complete; we flush the wrong pages. Again disable it for the release. Noticed by Jan Beulich who then also noticed a stupid typo later. Signed-off-by: Andi Kleen <> Signed-off-by: Linus Torvalds <>
2007-08-11x86_64: Don't mark __exitcall as __coldAndi Kleen1-1/+1
gcc currently doesn't support attributes on types, so we can't use it on function pointers. This avoids some warnings on a gcc 4.3 build. Signed-off-by: Andi Kleen <> Signed-off-by: Linus Torvalds <>
2007-08-11x86_64: Calgary - Fix mis-handled PCI topologyMurillo Fernandes Bernardes1-7/+6
Current code assumed that devices were directly connected to a Calgary bridge, as it tried to get the iommu table directly from the parent bus controller. When we have another bridge between the Calgary/CalIOC2 bridge and the device we should look upwards until we get to the top (Calgary/CalIOC2 bridge), where the iommu table resides. Signed-off-by: Murillo Fernandes Bernardes <> Signed-off-by: Muli Ben-Yehuda <> Signed-off-by: Andi Kleen <> Signed-off-by: Linus Torvalds <>
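The lookup described here is a walk up the bus hierarchy to the topmost bus. A minimal sketch in standalone C, assuming a simplified bus structure: `struct bus_model`, its fields, and `find_iommu_table()` are hypothetical stand-ins and not the kernel's PCI or Calgary data structures.

```c
#include <stddef.h>

/* Hypothetical, simplified model of a PCI bus in a hierarchy. */
struct bus_model {
    struct bus_model *parent; /* NULL for the top-level (root) bus */
    void *iommu_table;        /* set only on the top-level bus */
};

/*
 * Find the iommu table for a device on "bus". Instead of assuming the
 * device sits directly below the top-level bridge, climb through any
 * intermediate bridges until the bus with no parent is reached.
 */
static void *find_iommu_table(const struct bus_model *bus)
{
    if (!bus)
        return NULL;
    while (bus->parent)
        bus = bus->parent;
    return bus->iommu_table;
}
```

A device two bridges below the root resolves to the same table as a device attached directly to the root bus.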
2007-08-11x86: Work around mmio config space quirk on AMD Fam10hdean gaudet3-14/+55
Some broken devices have been discovered to require %al/%ax/%eax registers for MMIO config space accesses. Modify mmconfig.c to use these registers explicitly (rather than modify the global readb/writeb/etc inlines). AK: also changed i386 to always use eax AK: moved change to extended space probing to different patch AK: reworked with inlines according to Linus' requirements. AK: improve comments. Signed-off-by: dean gaudet <> Signed-off-by: Andi Kleen <> Signed-off-by: Linus Torvalds <>
2007-08-11changing include/asm-generic/pgtable.h for non-mmuGreg Ungerer1-35/+38
There are some parts of include/asm-generic/pgtable.h that are relevant to the non-mmu architectures. To make it easier to include this from them I would like to ifdef the relevant parts. Without this there is a handful of functions that are referenced in here that are not defined on many non-mmu architectures. They could be defined out of course, as an alternative approach. Cc: David Howells <> Signed-off-by: Andrew Morton <> Signed-off-by: Linus Torvalds <>