Age | Commit message (Collapse) | Author | Files | Lines |
|
Pull microblaze update from Michal Simek:
"Microblaze changes.
After my discussion with Arnd I have also added there asm-generic io
patch which is Acked by him and Geert."
* 'next' of git://git.monstr.eu/linux-2.6-microblaze:
asm-generic: io: Fix ioread16/32be and iowrite16/32be
microblaze: Do not use module.h in files which are not modules
microblaze: Fix coding style issues
microblaze: Add missing return from debugfs_tlb
microblaze: Makefile clean
microblaze: Add .gitignore entries for auto-generated files
microblaze: Fix strncpy_from_user macro
|
|
Pull OpenRISC updates from Jonas Bonn:
"An equal number of bug fixes and trivial cleanups; no new features.
- Two patches to fix errors thrown by the updated toolchain.
- Three other bug fixes.
- Four trivial cleanups."
* 'for-upstream' of git://openrisc.net/jonas/linux:
openrisc: add missing header inclusion
openrisc: really pass correct arg to schedule_tail
Add bitops include needed for ext2 filesystem
openrisc: update DTLB-miss handler last
openrisc: fix up vmalloc page table loading
openrisc idle: delete pm_idle
openrisc: remove CONFIG_SYMBOL_PREFIX
openrisc: avoid using function parameter regs in reset vector
openrisc: remove unused current_regs
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 fixes from Ingo Molnar.
* 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/mm/pageattr: Prevent PSE and GLOABL leftovers to confuse pmd/pte_present and pmd_huge
Revert "x86, mm: Make spurious_fault check explicitly check explicitly check the PRESENT bit"
x86/mm/numa: Don't check if node is NUMA_NO_NODE
x86, efi: Make "noefi" really disable EFI runtime serivces
x86/apic: Fix parsing of the 'lapic' cmdline option
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull perf fixes from Ingo Molnar.
* 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
perf/x86: Add Intel IvyBridge event scheduling constraints
ftrace: Call ftrace cleanup module notifier after all other notifiers
tracing/syscalls: Allow archs to ignore tracing compat syscalls
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu
Pull IOMMU Updates from Joerg Roedel:
"Besides some fixes and cleanups in the code there are three more
important changes to point out this time:
* New IOMMU driver for the ARM SHMOBILE platform
* An IOMMU-API extension for non-paging IOMMUs (required for
upcoming PAMU driver)
* Rework of the way the Tegra IOMMU driver accesses its
registetrs - register windows are easier to extend now.
There are also a few changes to non-iommu code, but that is acked by
the respective maintainers."
* tag 'iommu-updates-v3.9' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu: (23 commits)
iommu/tegra: assume CONFIG_OF in SMMU driver
iommu/tegra: assume CONFIG_OF in gart driver
iommu/amd: Remove redundant NULL check before dma_ops_domain_free().
iommu/amd: Initialize device table after dma_ops
iommu/vt-d: Zero out allocated memory in dmar_enable_qi
iommu/tegra: smmu: Fix incorrect mask for regbase
iommu/exynos: Make exynos_sysmmu_disable static
ARM: mach-shmobile: r8a7740: Add IPMMU device
ARM: mach-shmobile: sh73a0: Add IPMMU device
ARM: mach-shmobile: sh7372: Add IPMMU device
iommu/shmobile: Add iommu driver for Renesas IPMMU modules
iommu: Add DOMAIN_ATTR_WINDOWS domain attribute
iommu: Add domain window handling functions
iommu: Implement DOMAIN_ATTR_PAGING attribute
iommu: Check for valid pgsize_bitmap in iommu_map/unmap
iommu: Make sure DOMAIN_ATTR_MAX is really the maximum
iommu/tegra: smmu: Change SMMU's dependency on ARCH_TEGRA
iommu/tegra: smmu: Use helper function to check for valid register offset
iommu/tegra: smmu: Support variable MMIO ranges/blocks
iommu/tegra: Add missing spinlock initialization
...
|
|
git://git.linaro.org/people/mszyprowski/linux-dma-mapping
Pull DMA-mapping updates from Marek Szyprowski:
"This time all patches are related only to ARM DMA-mapping subsystem.
The main extension provided by this pull request is highmem support.
Besides that it contains a bunch of small bugfixes and cleanups."
* 'for-v3.9' of git://git.linaro.org/people/mszyprowski/linux-dma-mapping:
ARM: DMA-mapping: fix memory leak in IOMMU dma-mapping implementation
ARM: dma-mapping: Add maximum alignment order for dma iommu buffers
ARM: dma-mapping: use himem for DMA buffers for IOMMU-mapped devices
ARM: dma-mapping: add support for CMA regions placed in highmem zone
arm: dma mapping: export arm iommu functions
ARM: dma-mapping: Add arm_iommu_detach_device()
ARM: dma-mapping: Add macro to_dma_iommu_mapping()
ARM: dma-mapping: Set arm_dma_set_mask() for iommu->set_dma_mask()
ARM: iommu: Include linux/kref.h in asm/dma-iommu.h
|
|
Pull GPIO changes from Grant Likely:
"This branch contains the usual set of individual driver improvements
and bug fixes, as well as updates to the core code. The more notable
changes include:
- Internally add new API for referencing GPIOs by gpio_desc instead
of number. Eventually this will become a public API
- ACPI GPIO binding support"
* tag 'gpio-for-linus' of git://git.secretlab.ca/git/linux: (33 commits)
arm64: select ARCH_WANT_OPTIONAL_GPIOLIB
gpio: em: Use irq_domain_add_simple() to fix runtime error
gpio: using common order: let 'static const' instead of 'const static'
gpio/vt8500: memory cleanup missing
gpiolib: Fix locking on gpio debugfs files
gpiolib: let gpio_chip reference its descriptors
gpiolib: use descriptors internally
gpiolib: use gpio_chips list in gpiochip_find_base
gpiolib: use gpio_chips list in sysfs ops
gpiolib: use gpio_chips list in gpiochip_find
gpiolib: use gpio_chips list in gpiolib_sysfs_init
gpiolib: link all gpio_chips using a list
gpio/langwell: cleanup driver
gpio/langwell: Add Cloverview ids to pci device table
gpio/lynxpoint: add chipset gpio driver.
gpiolib: add missing braces in gpio_direction_show
gpiolib-acpi: Fix error checks in interrupt requesting
gpio: mpc8xxx: don't set IRQ_TYPE_NONE when creating irq mapping
gpiolib: remove gpiochip_reserve()
arm: pxa: tosa: do not use gpiochip_reserve()
...
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/cjb/mmc
Pull MMC update from Chris Ball:
"MMC highlights for 3.9:
Core:
- Support for packed commands in eMMC 4.5. (This requires a host
capability to be turned on. It increases write throughput by 20%+,
but may also increase average write latency; more testing needed.)
- Add DT bindings for capability flags.
- Add mmc_of_parse() for shared DT parsing between drivers.
Drivers:
- android-goldfish: New MMC driver for the Android Goldfish emulator.
- mvsdio: Add DT bindings, pinctrl, use slot-gpio for card detection.
- omap_hsmmc: Fix boot hangs with RPMB partitions.
- sdhci-bcm2835: New driver for controller used by Raspberry Pi.
- sdhci-esdhc-imx: Add 8-bit data, auto CMD23 support, use slot-gpio.
- sh_mmcif: Add support for eMMC DDR, bundled MMCIF IRQs.
- tmio_mmc: Add DT bindings, support for vccq regulator"
* tag 'mmc-updates-for-3.9-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/cjb/mmc: (92 commits)
mmc: tegra: assume CONFIG_OF, remove platform data
mmc: add DT bindings for more MMC capability flags
mmc: tmio: add support for the VccQ regulator
mmc: tmio: remove unused and deprecated symbols
mmc: sh_mobile_sdhi: use managed resource allocations
mmc: sh_mobile_sdhi: remove unused .pdata field
mmc: tmio-mmc: parse device-tree bindings
mmc: tmio-mmc: define device-tree bindings
mmc: sh_mmcif: use mmc_of_parse() to parse standard MMC DT bindings
mmc: (cosmetic) remove "extern" from function declarations
mmc: provide a standard MMC device-tree binding parser centrally
mmc: detailed definition of CD and WP MMC line polarities in DT
mmc: sdhi, tmio: only check flags in tmio-mmc driver proper
mmc: sdhci: Fix parameter of sdhci_do_start_signal_voltage_switch()
mmc: sdhci: check voltage range only on regulators aware of voltage value
mmc: bcm2835: set SDHCI_QUIRK_DATA_TIMEOUT_USES_SDCLK
mmc: support packed write command for eMMC4.5 devices
mmc: add packed command feature of eMMC4.5
mmc: rtsx: remove driving adjustment
mmc: use regulator_can_change_voltage() instead of regulator_count_voltages
...
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/cooloney/linux-leds
Pull LED subsystem update from Bryan Wu.
* 'for-next' of git://git.kernel.org/pub/scm/linux/kernel/git/cooloney/linux-leds: (61 commits)
leds: leds-sunfire: use dev_err()/pr_err() instead of printk()
leds: 88pm860x: Add missing of_node_put()
leds: tca6507: Use of_get_child_count()
leds: leds-pwm: make it depend on PWM and not HAVE_PWM
Documentation: leds: update LP55xx family devices
leds-lp55xx: fix problem on removing LED attributes
leds-lp5521/5523: add author and copyright description
leds-lp5521/5523: use new lp55xx common header
leds-lp55xx: clean up headers
leds-lp55xx: clean up definitions
leds-lp55xx: clean up unused data and functions
leds-lp55xx: clean up _remove()
leds-lp55xx: add new function for removing device attribtues
leds-lp55xx: code refactoring on selftest function
leds-lp55xx: use common device attribute driver function
leds-lp55xx: support device specific attributes
leds-lp5523: use generic firmware interface
leds-lp5521: use generic firmware interface
leds-lp55xx: support firmware interface
leds-lp55xx: add new lp55xx_register_sysfs() for the firmware interface
...
|
|
Pull slave-dmaengine updates from Vinod Koul:
"This is fairly big pull by my standards as I had missed last merge
window. So we have the support for device tree for slave-dmaengine,
large updates to dw_dmac driver from Andy for reusing on different
architectures. Along with this we have fixes on bunch of the drivers"
Fix up trivial conflicts, usually due to #include line movement next to
each other.
* 'next' of git://git.infradead.org/users/vkoul/slave-dma: (111 commits)
Revert "ARM: SPEAr13xx: Pass DW DMAC platform data from DT"
ARM: dts: pl330: Add #dma-cells for generic dma binding support
DMA: PL330: Register the DMA controller with the generic DMA helpers
DMA: PL330: Add xlate function
DMA: PL330: Add new pl330 filter for DT case.
dma: tegra20-apb-dma: remove unnecessary assignment
edma: do not waste memory for dma_mask
dma: coh901318: set residue only if dma is in progress
dma: coh901318: avoid unbalanced locking
dmaengine.h: remove redundant else keyword
dma: of-dma: protect list write operation by spin_lock
dmaengine: ste_dma40: do not remove descriptors for cyclic transfers
dma: of-dma.c: fix memory leakage
dw_dmac: apply default dma_mask if needed
dmaengine: ioat - fix spare sparse complain
dmaengine: move drivers/of/dma.c -> drivers/dma/of-dma.c
ioatdma: fix race between updating ioat->head and IOAT_COMPLETION_PENDING
dw_dmac: add support for Lynxpoint DMA controllers
dw_dmac: return proper residue value
dw_dmac: fill individual length of descriptor
...
|
|
Prevents build issue with updated toolchain
Reported-by: Jack Thomasson <jkt@moonlitsw.com>
Tested-by: Christian Svensson <blue@cmd.nu>
Signed-off-by: Stefan Kristiansson <stefan.kristiansson@saunalahti.fi>
Signed-off-by: Jonas Bonn <jonas@southpole.se>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm
Pull ACPI and power management fixes from Rafael Wysocki:
- Fixes for blackfin and microblaze build problems introduced by the
removal of global pm_idle. From Lars-Peter Clausen.
- OPP core build fix from Shawn Guo.
- Error condition check fix for the new imx6q-cpufreq driver from Wei
Yongjun.
- Fix for an AER driver crash related to the lack of APEI
initialization for acpi=off. From Rafael J Wysocki.
- Fix for a USB breakage on Thinkpad T430 related to ACPI power
resources and PCI wakeup from Rafael J. Wysocki.
* tag 'pm+acpi-fixes-3.9-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
ACPI / PM: Take unusual configurations of power resources into account
imx6q-cpufreq: fix return value check in imx6q_cpufreq_probe()
PM / OPP: fix condition for empty of_init_opp_table()
ACPI / APEI: Fix crash in apei_hest_parse() for acpi=off
microblaze idle: Fix compile error
blackfin idle: Fix compile error
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci
Pull PCI changes from Bjorn Helgaas:
"Host bridge hotplug
- Major overhaul of ACPI host bridge add/start (Rafael Wysocki, Yinghai Lu)
- Major overhaul of PCI/ACPI binding (Rafael Wysocki, Yinghai Lu)
- Split out ACPI host bridge and ACPI PCI device hotplug (Yinghai Lu)
- Stop caching _PRT and make independent of bus numbers (Yinghai Lu)
PCI device hotplug
- Clean up cpqphp dead code (Sasha Levin)
- Disable ARI unless device and upstream bridge support it (Yijing Wang)
- Initialize all hot-added devices (not functions 0-7) (Yijing Wang)
Power management
- Don't touch ASPM if disabled (Joe Lawrence)
- Fix ASPM link state management (Myron Stowe)
Miscellaneous
- Fix PCI_EXP_FLAGS accessor (Alex Williamson)
- Disable Bus Master in pci_device_shutdown (Konstantin Khlebnikov)
- Document hotplug resource and MPS parameters (Yijing Wang)
- Add accessor for PCIe capabilities (Myron Stowe)
- Drop pciehp suspend/resume messages (Paul Bolle)
- Make pci_slot built-in only (not a module) (Jiang Liu)
- Remove unused PCI/ACPI bind ops (Jiang Liu)
- Removed used pci_root_bus (Bjorn Helgaas)"
* tag 'pci-v3.9-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci: (51 commits)
PCI/ACPI: Don't cache _PRT, and don't associate them with bus numbers
PCI: Fix PCI Express Capability accessors for PCI_EXP_FLAGS
ACPI / PCI: Make pci_slot built-in only, not a module
PCI/PM: Clear state_saved during suspend
PCI: Use atomic_inc_return() rather than atomic_add_return()
PCI: Catch attempts to disable already-disabled devices
PCI: Disable Bus Master unconditionally in pci_device_shutdown()
PCI: acpiphp: Remove dead code for PCI host bridge hotplug
PCI: acpiphp: Create companion ACPI devices before creating PCI devices
PCI: Remove unused "rc" in virtfn_add_bus()
PCI: pciehp: Drop suspend/resume ENTRY messages
PCI/ASPM: Don't touch ASPM if forcibly disabled
PCI/ASPM: Deallocate upstream link state even if device is not PCIe
PCI: Document MPS parameters pci=pcie_bus_safe, pci=pcie_bus_perf, etc
PCI: Document hpiosize= and hpmemsize= resource reservation parameters
PCI: Use PCI Express Capability accessor
PCI: Introduce accessor to retrieve PCIe Capabilities Register
PCI: Put pci_dev in device tree as early as possible
PCI: Skip attaching driver in device_add()
PCI: acpiphp: Keep driver loaded even if no slots found
...
|
|
Pull KVM ARM compile fixes from Gleb Natapov:
"Fix ARM KVM compilation breakage due to changes from kvm.git"
* git://git.kernel.org/pub/scm/virt/kvm/kvm:
ARM: KVM: fix compilation after removal of user_alloc from struct kvm_memory_slot
ARM: KVM: Rename KVM_MEMORY_SLOTS -> KVM_USER_MEM_SLOTS
ARM: KVM: fix kvm_arch_{prepare,commit}_memory_region
|
|
Pull crypto update from Herbert Xu:
"Here is the crypto update for 3.9:
- Added accelerated implementation of crc32 using pclmulqdq.
- Added test vector for fcrypt.
- Added support for OMAP4/AM33XX cipher and hash.
- Fixed loose crypto_user input checks.
- Misc fixes"
* git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6: (43 commits)
crypto: user - ensure user supplied strings are nul-terminated
crypto: user - fix empty string test in report API
crypto: user - fix info leaks in report API
crypto: caam - Added property fsl,sec-era in SEC4.0 device tree binding.
crypto: use ERR_CAST
crypto: atmel-aes - adjust duplicate test
crypto: crc32-pclmul - Kill warning on x86-32
crypto: x86/twofish - assembler clean-ups: use ENTRY/ENDPROC, localize jump labels
crypto: x86/sha1 - assembler clean-ups: use ENTRY/ENDPROC
crypto: x86/serpent - use ENTRY/ENDPROC for assember functions and localize jump targets
crypto: x86/salsa20 - assembler cleanup, use ENTRY/ENDPROC for assember functions and rename ECRYPT_* to salsa20_*
crypto: x86/ghash - assembler clean-up: use ENDPROC at end of assember functions
crypto: x86/crc32c - assembler clean-up: use ENTRY/ENDPROC
crypto: cast6-avx: use ENTRY()/ENDPROC() for assembler functions
crypto: cast5-avx: use ENTRY()/ENDPROC() for assembler functions and localize jump targets
crypto: camellia-x86_64/aes-ni: use ENTRY()/ENDPROC() for assembler functions and localize jump targets
crypto: blowfish-x86_64: use ENTRY()/ENDPROC() for assembler functions and localize jump targets
crypto: aesni-intel - add ENDPROC statements for assembler functions
crypto: x86/aes - assembler clean-ups: use ENTRY/ENDPROC, localize jump targets
crypto: testmgr - add test vector for fcrypt
...
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/aegl/linux
Pull ia64 update from Tony Luck:
"ia64 vm patch series that was cooking in -mm tree"
* tag 'please-pull-vm_unwrapped' of git://git.kernel.org/pub/scm/linux/kernel/git/aegl/linux:
mm: use vm_unmapped_area() in hugetlbfs on ia64 architecture
mm: use vm_unmapped_area() on ia64 architecture
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux
Pull module update from Rusty Russell:
"The sweeping change is to make add_taint() explicitly indicate whether
to disable lockdep, but it's a mechanical change."
* tag 'modules-next-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux:
MODSIGN: Add option to not sign modules during modules_install
MODSIGN: Add -s <signature> option to sign-file
MODSIGN: Specify the hash algorithm on sign-file command line
MODSIGN: Simplify Makefile with a Kconfig helper
module: clean up load_module a little more.
modpost: Ignore ARC specific non-alloc sections
module: constify within_module_*
taint: add explicit flag to show whether lock dep is still OK.
module: printk message when module signature fail taints kernel.
|
|
This patch removes page_address() usage in IOMMU-aware dma-mapping
implementation and replaced it with direct use of the cpu virtual address
provided by the caller. page_address() returned incorrect address for
pages remapped in atomic pool, what caused memory leak.
Reported-by: Hiroshi Doyu <hdoyu@nvidia.com>
Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com>
Tested-by: Hiroshi Doyu <hdoyu@nvidia.com>
|
|
Alignment order for a dma iommu buffer is set by buffer size. For
large buffer, it is a waste of iommu address space. So configurable
parameter to limit maximum alignment order can reduce the waste.
Signed-off-by: Seung-Woo Kim <sw0312.kim@samsung.com>
Signed-off-by: Kyungmin.park <kyungmin.park@samsung.com>
Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com>
|
|
IOMMU can provide access to any memory page, so there is no point in
limiting the allocated pages only to lowmem, once other parts of
dma-mapping subsystem correctly supports himem pages.
Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com>
|
|
This patch adds missing pieces to correctly support memory pages served
from CMA regions placed in high memory zones. Please note that the default
global CMA area is still put into lowmem and is limited by optional
architecture specific DMA zone. One can however put device specific CMA
regions in high memory zone to reduce lowmem usage.
Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com>
Signed-off-by: Kyungmin Park <kyungmin.park@samsung.com>
Acked-by: Michal Nazarewicz <mina86@mina86.com>
|
|
This patch adds EXPORT_SYMBOL_GPL calls to the three arm iommu
functions - arm_iommu_create_mapping, arm_iommu_free_mapping
and arm_iommu_attach_device. These three functions are arm specific
wrapper functions for creating/freeing/using an iommu mapping and
they are called by various drivers. If any of these drivers need
to be built as dynamic modules, these functions need to be exported.
Changelog v2: using EXPORT_SYMBOL_GPL as suggested by Marek.
Signed-off-by: Prathyush K <prathyush.k@samsung.com>
[m.szyprowski: extended with recently introduced
EXPORT_SYMBOL_GPL(arm_iommu_detach_device)]
Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com>
|
|
A counter part of arm_iommu_attach_device().
Signed-off-by: Hiroshi Doyu <hdoyu@nvidia.com>
Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com>
|
|
This can be built without CONFIG_ARM_DMA_USE_IOMMU.
Signed-off-by: Hiroshi Doyu <hdoyu@nvidia.com>
Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com>
|
|
struct dma_map_ops iommu_ops doesn't have ->set_dma_mask, which causes
crash when dma_set_mask() is called from some driver.
Signed-off-by: Hiroshi Doyu <hdoyu@nvidia.com>
Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com>
|
|
The dma_iommu_mapping structure defined in asm/dma-iommu.h embeds a
struct kref, include the appropriate header file.
Signed-off-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com>
|
|
kvm_memory_slot
Commit 7a905b1 (KVM: Remove user_alloc from struct kvm_memory_slot)
broke KVM/ARM by removing the user_alloc field from a public structure.
As we only used this field to alert the user that we didn't support
this operation mode, there is no harm in discarding this bit of code
without any remorse.
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Gleb Natapov <gleb@redhat.com>
|
|
Commit bbacc0c (KVM: Rename KVM_MEMORY_SLOTS -> KVM_USER_MEM_SLOTS)
broke KVM/ARM by changing a global #define.
Apply the same change to fix the compilation breakage.
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Gleb Natapov <gleb@redhat.com>
|
|
Commit f82a8cfe9 (KVM: struct kvm_memory_slot.user_alloc -> bool)
broke the ARM KVM port by changing the prototype of two global
functions.
Apply the same change to fix the compilation breakage.
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Gleb Natapov <gleb@redhat.com>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/sameo/mfd-2.6
Pull MFS updates from Samuel Ortiz:
"This is the MFD pull request for the 3.9 merge window.
No new drivers this time, but a bunch of fairly big cleanups:
- Roger Quadros worked on a OMAP USBHS and TLL platform data
consolidation, OMAP5 support and clock management code cleanup.
- The first step of a major sync for the ab8500 driver from Lee
Jones. In particular, the debugfs and the sysct interfaces got
extended and improved.
- Peter Ujfalusi sent a nice patchset for cleaning and fixing the
twl-core driver, with a much needed module id lookup code
improvement.
- The regular wm5102 and arizona cleanups and fixes from Mark Brown.
- Laxman Dewangan extended the palmas APIs in order to implement the
palmas GPIO and rt drivers.
- Laxman also added DT support for the tps65090 driver.
- The Intel SCH and ICH drivers got a couple fixes from Aaron Sierra
and Darren Hart.
- Linus Walleij patchset for the ab8500 driver allowed ab8500 and
ab9540 based devices to switch to the new abx500 pin-ctrl driver.
- The max8925 now has device tree and irqdomain support thanks to
Qing Xu.
- The recently added rtsx driver got a few cleanups and fixes for a
better card detection code path and now also supports the RTS5227
chipset, thanks to Wei Wang and Roger Tseng."
* tag 'mfd-3.9-1' of git://git.kernel.org/pub/scm/linux/kernel/git/sameo/mfd-2.6: (109 commits)
mfd: lpc_ich: Use devres API to allocate private data
mfd: lpc_ich: Add Device IDs for Intel Wellsburg PCH
mfd: lpc_sch: Accomodate partial population of the MFD devices
mfd: da9052-i2c: Staticize da9052_i2c_fix()
mfd: syscon: Fix sparse warning
mfd: twl-core: Fix kernel panic on boot
mfd: rtsx: Fix issue that booting OS with SD card inserted
mfd: ab8500: Fix compile error
mfd: Add missing GENERIC_HARDIRQS dependecies
Documentation: Add docs for max8925 dt
mfd: max8925: Add dts
mfd: max8925: Support dt for backlight
mfd: max8925: Fix onkey driver irq base
mfd: max8925: Fix mfd device register failure
mfd: max8925: Add irqdomain for dt
mfd: vexpress: Allow vexpress-sysreg to self-initialise
mfd: rtsx: Support RTS5227
mfd: rtsx: Implement driving adjustment to device-dependent callbacks
mfd: vexpress: Add pseudo-GPIO based LEDs
mfd: ab8500: Rename ab8500 to abx500 for hwmon driver
...
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media
Pull media updates from Mauro Carvalho Chehab:
- Some cleanups at V4L2 documentation
- new drivers: ts2020 frontend, ov9650 sensor, s5c73m3 sensor,
sh-mobile veu mem2mem driver, radio-ma901, davinci_vpfe staging
driver
- Lots of missing MAINTAINERS entries added
- several em28xx driver improvements, including its conversion to
videobuf2
- several fixups on drivers to make them to better comply with the API
- DVB core: add support for DVBv5 stats, allowing the implementation of
statistics for new standards like ISDB
- mb86a20s: add statistics to the driver
- lots of new board additions, cleanups, and driver improvements.
* 'v4l_for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media: (596 commits)
[media] media: Add 0x3009 USB PID to ttusb2 driver (fixed diff)
[media] rtl28xxu: Add USB IDs for Compro VideoMate U620F
[media] em28xx: add usb id for terratec h5 rev. 3
[media] media: rc: gpio-ir-recv: add support for device tree parsing
[media] mceusb: move check earlier to make smatch happy
[media] radio-si470x doc: add info about v4l2-ctl and sox+alsa
[media] staging: media: Remove unnecessary OOM messages
[media] sh_vou: Use vou_dev instead of vou_file wherever possible
[media] sh_vou: Use video_drvdata()
[media] drivers/media/platform/soc_camera/pxa_camera.c: use devm_ functions
[media] mt9t112: mt9t111 format set up differs from mt9t112
[media] sh-mobile-ceu-camera: fix SHARPNESS control default
Revert "[media] fc0011: Return early, if the frequency is already tuned"
[media] cx18/ivtv: fix regression: remove __init from a non-init function
[media] em28xx: fix analog streaming with USB bulk transfers
[media] stv0900: remove unnecessary null pointer check
[media] fc0011: Return early, if the frequency is already tuned
[media] fc0011: Add some sanity checks and cleanups
[media] fc0011: Fix xin value clamping
Revert "[media] [PATH,1/2] mxl5007 move reset to attach"
...
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen
Pull Xen update from Konrad Rzeszutek Wilk:
"This has two new ACPI drivers for Xen - a physical CPU offline/online
and a memory hotplug. The way this works is that ACPI kicks the
drivers and they make the appropiate hypercall to the hypervisor to
tell it that there is a new CPU or memory. There also some changes to
the Xen ARM ABIs and couple of fixes. One particularly nasty bug in
the Xen PV spinlock code was fixed by Stefan Bader - and has been
there since the 2.6.32!
Features:
- Xen ACPI memory and CPU hotplug drivers - allowing Xen hypervisor
to be aware of new CPU and new DIMMs
- Cleanups
Bug-fixes:
- Fixes a long-standing bug in the PV spinlock wherein we did not
kick VCPUs that were in a tight loop.
- Fixes in the error paths for the event channel machinery"
Fix up a few semantic conflicts with the ACPI interface changes in
drivers/xen/xen-acpi-{cpu,mem}hotplug.c.
* tag 'stable/for-linus-3.9-rc0-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen:
xen: event channel arrays are xen_ulong_t and not unsigned long
xen: Send spinlock IPI to all waiters
xen: introduce xen_remap, use it instead of ioremap
xen: close evtchn port if binding to irq fails
xen-evtchn: correct comment and error output
xen/tmem: Add missing %s in the printk statement.
xen/acpi: move xen_acpi_get_pxm under CONFIG_XEN_DOM0
xen/acpi: ACPI cpu hotplug
xen/acpi: Move xen_acpi_get_pxm to Xen's acpi.h
xen/stub: driver for CPU hotplug
xen/acpi: ACPI memory hotplug
xen/stub: driver for memory hotplug
xen: implement updated XENMEM_add_to_physmap_range ABI
xen/smp: Move the common CPU init code a bit to prep for PVH patch.
|
|
Pull KVM updates from Marcelo Tosatti:
"KVM updates for the 3.9 merge window, including x86 real mode
emulation fixes, stronger memory slot interface restrictions, mmu_lock
spinlock hold time reduction, improved handling of large page faults
on shadow, initial APICv HW acceleration support, s390 channel IO
based virtio, amongst others"
* tag 'kvm-3.9-1' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (143 commits)
Revert "KVM: MMU: lazily drop large spte"
x86: pvclock kvm: align allocation size to page size
KVM: nVMX: Remove redundant get_vmcs12 from nested_vmx_exit_handled_msr
x86 emulator: fix parity calculation for AAD instruction
KVM: PPC: BookE: Handle alignment interrupts
booke: Added DBCR4 SPR number
KVM: PPC: booke: Allow multiple exception types
KVM: PPC: booke: use vcpu reference from thread_struct
KVM: Remove user_alloc from struct kvm_memory_slot
KVM: VMX: disable apicv by default
KVM: s390: Fix handling of iscs.
KVM: MMU: cleanup __direct_map
KVM: MMU: remove pt_access in mmu_set_spte
KVM: MMU: cleanup mapping-level
KVM: MMU: lazily drop large spte
KVM: VMX: cleanup vmx_set_cr0().
KVM: VMX: add missing exit names to VMX_EXIT_REASONS array
KVM: VMX: disable SMEP feature when guest is in non-paging mode
KVM: Remove duplicate text in api.txt
Revert "KVM: MMU: split kvm_mmu_free_page"
...
|
|
The next change will remove the code from the dw_mmc-exynos that added
the DW_MCI_QUIRK_NO_WRITE_PROTECT. Keep existing functionality of
having no write protect pin on smdk5250 by adding the disable-wp
property.
Signed-off-by: Doug Anderson <dianders@chromium.org>
Acked-by: Seungwon Jeon <tgih.jun@samsung.com>
Acked-by: Olof Johansson <olof@lixom.net>
Signed-off-by: Chris Ball <cjb@laptop.org>
|
|
and pmd_huge
Without this patch any kernel code that reads kernel memory in
non present kernel pte/pmds (as set by pageattr.c) will crash.
With this kernel code:
static struct page *crash_page;
static unsigned long *crash_address;
[..]
crash_page = alloc_pages(GFP_KERNEL, 9);
crash_address = page_address(crash_page);
if (set_memory_np((unsigned long)crash_address, 1))
printk("set_memory_np failure\n");
[..]
The kernel will crash if inside the "crash tool" one would try
to read the memory at the not present address.
crash> p crash_address
crash_address = $8 = (long unsigned int *) 0xffff88023c000000
crash> rd 0xffff88023c000000
[ *lockup* ]
The lockup happens because _PAGE_GLOBAL and _PAGE_PROTNONE
shares the same bit, and pageattr leaves _PAGE_GLOBAL set on a
kernel pte which is then mistaken as _PAGE_PROTNONE (so
pte_present returns true by mistake and the kernel fault then
gets confused and loops).
With THP the same can happen after we taught pmd_present to
check _PAGE_PROTNONE and _PAGE_PSE in commit
027ef6c87853b0a9df5317 ("mm: thp: fix pmd_present for
split_huge_page and PROT_NONE with THP"). THP has the same
problem with _PAGE_GLOBAL as the 4k pages, but it also has a
problem with _PAGE_PSE, which must be cleared too.
After the patch is applied copy_user correctly returns -EFAULT
and doesn't lockup anymore.
crash> p crash_address
crash_address = $9 = (long unsigned int *) 0xffff88023c000000
crash> rd 0xffff88023c000000
rd: read error: kernel virtual address: ffff88023c000000 type:
"64-bit KVADDR"
Signed-off-by: Andrea Arcangeli <aarcange@redhat.com>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: Shaohua Li <shaohua.li@intel.com>
Cc: "H. Peter Anvin" <hpa@linux.intel.com>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Hugh Dickins <hughd@google.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
|
the PRESENT bit"
I got a report for a minor regression introduced by commit
027ef6c87853b ("mm: thp: fix pmd_present for split_huge_page and
PROT_NONE with THP").
So the problem is, pageattr creates kernel pagetables (pte and
pmds) that breaks pte_present/pmd_present and the patch above
exposed this invariant breakage for pmd_present.
The same problem already existed for the pte and pte_present and
it was fixed by commit 660a293ea9be709 ("x86, mm: Make
spurious_fault check explicitly check the PRESENT bit") (if it
wasn't for that commit, it wouldn't even be a regression). That
fix avoids the pagefault to use pte_present. I could follow
through by stopping using pmd_present/pmd_huge too.
However I think it's more robust to fix pageattr and to clear
the PSE/GLOBAL bitflags too in addition to the present bitflag.
So the kernel page fault can keep using the regular
pte_present/pmd_present/pmd_huge.
The confusion arises because _PAGE_GLOBAL and _PAGE_PROTNONE are
sharing the same bit, and in the pmd case we pretend _PAGE_PSE
to be set only in present pmds (to facilitate split_huge_page
final tlb flush).
Signed-off-by: Andrea Arcangeli <aarcange@redhat.com>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: Shaohua Li <shaohua.li@intel.com>
Cc: "H. Peter Anvin" <hpa@linux.intel.com>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Hugh Dickins <hughd@google.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
|
If we aren't debugging per_cpu maps, the cpu's node is stored in
per_cpu variable numa_node. If `node' is NUMA_NO_NODE, it means
the caller wants to clear the cpu's node. So we should also
call set_cpu_numa_node() in this case.
Signed-off-by: Wen Congyang <wency@cn.fujitsu.com>
Cc: Len Brown <len.brown@intel.com>
Cc: Pavel Machek <pavel@ucw.cz>
Cc: "Rafael J. Wysocki" <rjw@sisk.pl>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/viro/signal
Pull signal handling cleanups from Al Viro:
"This is the first pile; another one will come a bit later and will
contain SYSCALL_DEFINE-related patches.
- a bunch of signal-related syscalls (both native and compat)
unified.
- a bunch of compat syscalls switched to COMPAT_SYSCALL_DEFINE
(fixing several potential problems with missing argument
validation, while we are at it)
- a lot of now-pointless wrappers killed
- a couple of architectures (cris and hexagon) forgot to save
altstack settings into sigframe, even though they used the
(uninitialized) values in sigreturn; fixed.
- microblaze fixes for delivery of multiple signals arriving at once
- saner set of helpers for signal delivery introduced, several
architectures switched to using those."
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/signal: (143 commits)
x86: convert to ksignal
sparc: convert to ksignal
arm: switch to struct ksignal * passing
alpha: pass k_sigaction and siginfo_t using ksignal pointer
burying unused conditionals
make do_sigaltstack() static
arm64: switch to generic old sigaction() (compat-only)
arm64: switch to generic compat rt_sigaction()
arm64: switch compat to generic old sigsuspend
arm64: switch to generic compat rt_sigqueueinfo()
arm64: switch to generic compat rt_sigpending()
arm64: switch to generic compat rt_sigprocmask()
arm64: switch to generic sigaltstack
sparc: switch to generic old sigsuspend
sparc: COMPAT_SYSCALL_DEFINE does all sign-extension as well as SYSCALL_DEFINE
sparc: kill sign-extending wrappers for native syscalls
kill sparc32_open()
sparc: switch to use of generic old sigaction
sparc: switch sys_compat_rt_sigaction() to COMPAT_SYSCALL_DEFINE
mips: switch to generic sys_fork() and sys_clone()
...
|
|
Merge second patch-bomb from Andrew Morton:
- A little DM fix
- the MM queue
* emailed patches from Andrew Morton <akpm@linux-foundation.org>: (154 commits)
ksm: allocate roots when needed
mm: cleanup "swapcache" in do_swap_page
mm,ksm: swapoff might need to copy
mm,ksm: FOLL_MIGRATION do migration_entry_wait
ksm: shrink 32-bit rmap_item back to 32 bytes
ksm: treat unstable nid like in stable tree
ksm: add some comments
tmpfs: fix mempolicy object leaks
tmpfs: fix use-after-free of mempolicy object
mm/fadvise.c: drain all pagevecs if POSIX_FADV_DONTNEED fails to discard all pages
mm: export mmu notifier invalidates
mm: accelerate mm_populate() treatment of THP pages
mm: use long type for page counts in mm_populate() and get_user_pages()
mm: accurately document nr_free_*_pages functions with code comments
HWPOISON: change order of error_states[]'s elements
HWPOISON: fix misjudgement of page_action() for errors on mlocked pages
memcg: stop warning on memcg_propagate_kmem
net: change type of virtio_chan->p9_max_pages
vmscan: change type of vm_total_pages to unsigned long
fs/nfsd: change type of max_delegations, nfsd_drc_max_mem and nfsd_drc_mem_used
...
|
|
Now the function nr_free_buffer_pages returns unsigned long, so use %ld
to print its return value.
Signed-off-by: Zhang Yanfei <zhangyanfei@cn.fujitsu.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
swap_lock is heavily contended when I test swap to 3 fast SSD (even
slightly slower than swap to 2 such SSD). The main contention comes
from swap_info_get(). This patch tries to fix the gap with adding a new
per-partition lock.
Global data like nr_swapfiles, total_swap_pages, least_priority and
swap_list are still protected by swap_lock.
nr_swap_pages is an atomic now, it can be changed without swap_lock. In
theory, it's possible get_swap_page() finds no swap pages but actually
there are free swap pages. But sounds not a big problem.
Accessing partition specific data (like scan_swap_map and so on) is only
protected by swap_info_struct.lock.
Changing swap_info_struct.flags need hold swap_lock and
swap_info_struct.lock, because scan_scan_map() will check it. read the
flags is ok with either the locks hold.
If both swap_lock and swap_info_struct.lock must be hold, we always hold
the former first to avoid deadlock.
swap_entry_free() can change swap_list. To delete that code, we add a
new highest_priority_index. Whenever get_swap_page() is called, we
check it. If it's valid, we use it.
It's a pity get_swap_page() still holds swap_lock(). But in practice,
swap_lock() isn't heavily contended in my test with this patch (or I can
say there are other much more heavier bottlenecks like TLB flush). And
BTW, looks get_swap_page() doesn't really need the lock. We never free
swap_info[] and we check SWAP_WRITEOK flag. The only risk without the
lock is we could swapout to some low priority swap, but we can quickly
recover after several rounds of swap, so sounds not a big deal to me.
But I'd prefer to fix this if it's a real problem.
"swap: make each swap partition have one address_space" improved the
swapout speed from 1.7G/s to 2G/s. This patch further improves the
speed to 2.3G/s, so around 15% improvement. It's a multi-process test,
so TLB flush isn't the biggest bottleneck before the patches.
[arnd@arndb.de: fix it for nommu]
[hughd@google.com: add missing unlock]
[minchan@kernel.org: get rid of lockdep whinge on sys_swapon]
Signed-off-by: Shaohua Li <shli@fusionio.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: Rik van Riel <riel@redhat.com>
Cc: Minchan Kim <minchan.kim@gmail.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Seth Jennings <sjenning@linux.vnet.ibm.com>
Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Cc: Xiao Guangrong <xiaoguangrong@linux.vnet.ibm.com>
Cc: Dan Magenheimer <dan.magenheimer@oracle.com>
Cc: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Hugh Dickins <hughd@google.com>
Signed-off-by: Minchan Kim <minchan@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
We now provide an option for users who don't want to specify physical
memory address in kernel commandline.
/*
* For movablemem_map=acpi:
*
* SRAT: |_____| |_____| |_________| |_________| ......
* node id: 0 1 1 2
* hotpluggable: n y y n
* movablemem_map: |_____| |_________|
*
* Using movablemem_map, we can prevent memblock from allocating memory
* on ZONE_MOVABLE at boot time.
*/
So user just specify movablemem_map=acpi, and the kernel will use
hotpluggable info in SRAT to determine which memory ranges should be set
as ZONE_MOVABLE.
If all the memory ranges in SRAT is hotpluggable, then no memory can be
used by kernel. But before parsing SRAT, memblock has already reserve
some memory ranges for other purposes, such as for kernel image, and so
on. We cannot prevent kernel from using these memory. So we need to
exclude these ranges even if these memory is hotpluggable.
Furthermore, there could be several memory ranges in the single node
which the kernel resides in. We may skip one range that have memory
reserved by memblock, but if the rest of memory is too small, then the
kernel will fail to boot. So, make the whole node which the kernel
resides in un-hotpluggable. Then the kernel has enough memory to use.
NOTE: Using this way will cause NUMA performance down because the
whole node will be set as ZONE_MOVABLE, and kernel cannot use memory
on it. If users don't want to lose NUMA performance, just don't use
it.
[akpm@linux-foundation.org: fix warning]
[akpm@linux-foundation.org: use strcmp()]
Signed-off-by: Tang Chen <tangchen@cn.fujitsu.com>
Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Cc: Jiang Liu <jiang.liu@huawei.com>
Cc: Jianguo Wu <wujianguo@huawei.com>
Cc: Kamezawa Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Lai Jiangshan <laijs@cn.fujitsu.com>
Cc: Wu Jianguo <wujianguo@huawei.com>
Cc: Yasuaki Ishimatsu <isimatu.yasuaki@jp.fujitsu.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Len Brown <lenb@kernel.org>
Cc: "Brown, Len" <len.brown@intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
When implementing movablemem_map boot option, we introduced an array
movablemem_map.map[] to store the memory ranges to be set as
ZONE_MOVABLE.
Since ZONE_MOVABLE is the latst zone of a node, if user didn't specify
the whole node memory range, we need to extend it to the node end so
that we can use it to prevent memblock from allocating memory in the
ranges user didn't specify.
We now implement movablemem_map boot option like this:
/*
* For movablemem_map=nn[KMG]@ss[KMG]:
*
* SRAT: |_____| |_____| |_________| |_________| ......
* node id: 0 1 1 2
* user specified: |__| |___|
* movablemem_map: |___| |_________| |______| ......
*
* Using movablemem_map, we can prevent memblock from allocating memory
* on ZONE_MOVABLE at boot time.
*
* NOTE: In this case, SRAT info will be ingored.
*/
[akpm@linux-foundation.org: clean up code, fix build warning]
Signed-off-by: Tang Chen <tangchen@cn.fujitsu.com>
Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Cc: Jiang Liu <jiang.liu@huawei.com>
Cc: Jianguo Wu <wujianguo@huawei.com>
Cc: Kamezawa Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Lai Jiangshan <laijs@cn.fujitsu.com>
Cc: Wu Jianguo <wujianguo@huawei.com>
Cc: Yasuaki Ishimatsu <isimatu.yasuaki@jp.fujitsu.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Len Brown <lenb@kernel.org>
Cc: "Brown, Len" <len.brown@intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
On linux, the pages used by kernel could not be migrated. As a result,
if a memory range is used by kernel, it cannot be hot-removed. So if we
want to hot-remove memory, we should prevent kernel from using it.
The way now used to prevent this is specify a memory range by
movablemem_map boot option and set it as ZONE_MOVABLE.
But when the system is booting, memblock will allocate memory, and
reserve the memory for kernel. And before we parse SRAT, and know the
node memory ranges, memblock is working. And it may allocate memory in
ranges to be set as ZONE_MOVABLE. This memory can be used by kernel,
and never be freed.
So, let's parse SRAT before memblock is called first. And it is early
enough.
The first call of memblock_find_in_range_node() is in:
setup_arch()
|-->setup_real_mode()
so, this patch add a function early_parse_srat() to parse SRAT, and call
it before setup_real_mode() is called.
NOTE:
1) early_parse_srat() is called before numa_init(), and has initialized
numa_meminfo. So DO NOT clear numa_nodes_parsed in numa_init() and DO
NOT zero numa_meminfo in numa_init(), otherwise we will lose memory
numa info.
2) I don't know why using count of memory affinities parsed from SRAT
as a return value in original acpi_numa_init(). So I add a static
variable srat_mem_cnt to remember this count and use it as the return
value of the new acpi_numa_init()
[mhocko@suse.cz: parse SRAT before memblock is ready fix]
Signed-off-by: Tang Chen <tangchen@cn.fujitsu.com>
Reviewed-by: Wen Congyang <wency@cn.fujitsu.com>
Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Cc: Jiang Liu <jiang.liu@huawei.com>
Cc: Jianguo Wu <wujianguo@huawei.com>
Cc: Kamezawa Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Lai Jiangshan <laijs@cn.fujitsu.com>
Cc: Wu Jianguo <wujianguo@huawei.com>
Cc: Yasuaki Ishimatsu <isimatu.yasuaki@jp.fujitsu.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Len Brown <lenb@kernel.org>
Cc: "Brown, Len" <len.brown@intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
During the implementation of SRAT support, we met a problem. In
setup_arch(), we have the following call series:
1) memblock is ready;
2) some functions use memblock to allocate memory;
3) parse ACPI tables, such as SRAT.
Before 3), we don't know which memory is hotpluggable, and as a result,
we cannot prevent memblock from allocating hotpluggable memory. So, in
2), there could be some hotpluggable memory allocated by memblock.
Now, we are trying to parse SRAT earlier, before memblock is ready. But
I think we need more investigation on this topic. So in this v5, I
dropped all the SRAT support, and v5 is just the same as v3, and it is
based on 3.8-rc3.
As we planned, we will support getting info from SRAT without users'
participation at last. And we will post another patch-set to do so.
And also, I think for now, we can add this boot option as the first step
of supporting movable node. Since Linux cannot migrate the direct
mapped pages, the only way for now is to limit the whole node containing
only movable memory.
Using SRAT is one way. But even if we can use SRAT, users still need an
interface to enable/disable this functionality if they don't want to
loose their NUMA performance. So I think, a user interface is always
needed.
For now, users can disable this functionality by not specifying the boot
option. Later, we will post SRAT support, and add another option value
"movablecore_map=acpi" to using SRAT.
This patch:
If system can create movable node which all memory of the node is
allocated as ZONE_MOVABLE, setup_node_data() cannot allocate memory for
the node's pg_data_t. So, use memblock_alloc_try_nid() instead of
memblock_alloc_nid() to retry when the first allocation fails.
Signed-off-by: Yasuaki Ishimatsu <isimatu.yasuaki@jp.fujitsu.com>
Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
Signed-off-by: Tang Chen <tangchen@cn.fujitsu.com>
Signed-off-by: Jiang Liu <jiang.liu@huawei.com>
Cc: Wu Jianguo <wujianguo@huawei.com>
Cc: Wen Congyang <wency@cn.fujitsu.com>
Cc: Mel Gorman <mel@csn.ul.ie>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
When the node is offlined, there is no memory/cpu on the node. If a
sleep task runs on a cpu of this node, it will be migrated to the cpu on
the other node. So we can clear cpu-to-node mapping.
[akpm@linux-foundation.org: numa_clear_node() and numa_set_node() can no longer be __cpuinit]
Signed-off-by: Wen Congyang <wency@cn.fujitsu.com>
Signed-off-by: Tang Chen <tangchen@cn.fujitsu.com>
Cc: Yasuaki Ishimatsu <isimatu.yasuaki@jp.fujitsu.com>
Cc: David Rientjes <rientjes@google.com>
Cc: Jiang Liu <liuj97@gmail.com>
Cc: Minchan Kim <minchan.kim@gmail.com>
Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Cc: Mel Gorman <mel@csn.ul.ie>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
When a cpu is hotpluged, we call acpi_map_cpu2node() in
_acpi_map_lsapic() to store the cpu's node and apicid's node. But we
don't clear the cpu's node in acpi_unmap_lsapic() when this cpu is
hotremoved. If the node is also hotremoved, we will get the following
messages:
kernel BUG at include/linux/gfp.h:329!
invalid opcode: 0000 [#1] SMP
Modules linked in: ebtable_nat ebtables ipt_MASQUERADE iptable_nat nf_nat xt_CHECKSUM iptable_mangle bridge stp llc sunrpc ipt_REJECT nf_conntrack_ipv4 nf_defrag_ipv4 iptable_filter ip_tables ip6t_REJECT nf_conntrack_ipv6 nf_defrag_ipv6 xt_state nf_conntrack ip6table_filter ip6_tables binfmt_misc dm_mirror dm_region_hash dm_log dm_mod vhost_net macvtap macvlan tun uinput iTCO_wdt iTCO_vendor_support coretemp kvm_intel kvm crc32c_intel microcode pcspkr i2c_i801 i2c_core lpc_ich mfd_core ioatdma e1000e i7core_edac edac_core sg acpi_memhotplug igb dca sd_mod crc_t10dif megaraid_sas mptsas mptscsih mptbase scsi_transport_sas scsi_mod
Pid: 3126, comm: init Not tainted 3.6.0-rc3-tangchen-hostbridge+ #13 FUJITSU-SV PRIMEQUEST 1800E/SB
RIP: 0010:[<ffffffff811bc3fd>] [<ffffffff811bc3fd>] allocate_slab+0x28d/0x300
RSP: 0018:ffff88078a049cf8 EFLAGS: 00010246
RAX: 0000000000000000 RBX: 0000000000000001 RCX: 0000000000000000
RDX: 0000000000000001 RSI: 0000000000000001 RDI: 0000000000000246
RBP: ffff88078a049d38 R08: 00000000000040d0 R09: 0000000000000001
R10: 0000000000000000 R11: 0000000000000b5f R12: 00000000000052d0
R13: ffff8807c1417300 R14: 0000000000030038 R15: 0000000000000003
FS: 00007fa9b1b44700(0000) GS:ffff8807c3800000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
CR2: 00007fa9b09acca0 CR3: 000000078b855000 CR4: 00000000000007e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Process init (pid: 3126, threadinfo ffff88078a048000, task ffff8807bb6f2650)
Call Trace:
new_slab+0x30/0x1b0
__slab_alloc+0x358/0x4c0
kmem_cache_alloc_node_trace+0xb4/0x1e0
alloc_fair_sched_group+0xd0/0x1b0
sched_create_group+0x3e/0x110
sched_autogroup_create_attach+0x4d/0x180
sys_setsid+0xd4/0xf0
system_call_fastpath+0x16/0x1b
Code: 89 c4 e9 73 fe ff ff 31 c0 89 de 48 c7 c7 45 de 9e 81 44 89 45 c8 e8 22 05 4b 00 85 db 44 8b 45 c8 0f 89 4f ff ff ff 0f 0b eb fe <0f> 0b 90 eb fd 0f 0b eb fe 89 de 48 c7 c7 45 de 9e 81 31 c0 44
RIP [<ffffffff811bc3fd>] allocate_slab+0x28d/0x300
RSP <ffff88078a049cf8>
---[ end trace adf84c90f3fea3e5 ]---
The reason is that the cpu's node is not NUMA_NO_NODE, we will call
alloc_pages_exact_node() to alloc memory on the node, but the node is
offlined.
If the node is onlined, we still need cpu's node. For example: a task
on the cpu is sleeped when the cpu is hotremoved. We will choose
another cpu to run this task when it is waked up. If we know the cpu's
node, we will choose the cpu on the same node first. So we should clear
cpu-to-node mapping when the node is offlined.
This patch only clears apicid-to-node mapping when the cpu is
hotremoved.
[akpm@linux-foundation.org: fix section error]
Signed-off-by: Wen Congyang <wency@cn.fujitsu.com>
Signed-off-by: Tang Chen <tangchen@cn.fujitsu.com>
Cc: Yasuaki Ishimatsu <isimatu.yasuaki@jp.fujitsu.com>
Cc: David Rientjes <rientjes@google.com>
Cc: Jiang Liu <liuj97@gmail.com>
Cc: Minchan Kim <minchan.kim@gmail.com>
Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Cc: Mel Gorman <mel@csn.ul.ie>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
Introduce a new API vmemmap_free() to free and remove vmemmap
pagetables. Since pagetable implements are different, each architecture
has to provide its own version of vmemmap_free(), just like
vmemmap_populate().
Note: vmemmap_free() is not implemented for ia64, ppc, s390, and sparc.
[mhocko@suse.cz: fix implicit declaration of remove_pagetable]
Signed-off-by: Yasuaki Ishimatsu <isimatu.yasuaki@jp.fujitsu.com>
Signed-off-by: Jianguo Wu <wujianguo@huawei.com>
Signed-off-by: Wen Congyang <wency@cn.fujitsu.com>
Signed-off-by: Tang Chen <tangchen@cn.fujitsu.com>
Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Cc: Jiang Liu <jiang.liu@huawei.com>
Cc: Kamezawa Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Lai Jiangshan <laijs@cn.fujitsu.com>
Cc: Wu Jianguo <wujianguo@huawei.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Signed-off-by: Michal Hocko <mhocko@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
Search a page table about the removed memory, and clear page table for
x86_64 architecture.
[akpm@linux-foundation.org: make kernel_physical_mapping_remove() static]
Signed-off-by: Wen Congyang <wency@cn.fujitsu.com>
Signed-off-by: Jianguo Wu <wujianguo@huawei.com>
Signed-off-by: Jiang Liu <jiang.liu@huawei.com>
Signed-off-by: Tang Chen <tangchen@cn.fujitsu.com>
Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Cc: Kamezawa Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Lai Jiangshan <laijs@cn.fujitsu.com>
Cc: Wu Jianguo <wujianguo@huawei.com>
Cc: Yasuaki Ishimatsu <isimatu.yasuaki@jp.fujitsu.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
When memory is removed, the corresponding pagetables should alse be
removed. This patch introduces some common APIs to support vmemmap
pagetable and x86_64 architecture direct mapping pagetable removing.
All pages of virtual mapping in removed memory cannot be freed if some
pages used as PGD/PUD include not only removed memory but also other
memory. So this patch uses the following way to check whether a page
can be freed or not.
1) When removing memory, the page structs of the removed memory are
filled with 0FD.
2) All page structs are filled with 0xFD on PT/PMD, PT/PMD can be
cleared. In this case, the page used as PT/PMD can be freed.
For direct mapping pages, update direct_pages_count[level] when we freed
their pagetables. And do not free the pages again because they were
freed when offlining.
For vmemmap pages, free the pages and their pagetables.
For larger pages, do not split them into smaller ones because there is
no way to know if the larger page has been split. As a result, there is
no way to decide when to split. We deal the larger pages in the
following way:
1) For direct mapped pages, all the pages were freed when they were
offlined. And since menmory offline is done section by section, all
the memory ranges being removed are aligned to PAGE_SIZE. So only need
to deal with unaligned pages when freeing vmemmap pages.
2) For vmemmap pages being used to store page_struct, if part of the
larger page is still in use, just fill the unused part with 0xFD. And
when the whole page is fulfilled with 0xFD, then free the larger page.
[akpm@linux-foundation.org: fix typo in comment]
[tangchen@cn.fujitsu.com: do not calculate direct mapping pages when freeing vmemmap pagetables]
[tangchen@cn.fujitsu.com: do not free direct mapping pages twice]
[tangchen@cn.fujitsu.com: do not free page split from hugepage one by one]
[tangchen@cn.fujitsu.com: do not split pages when freeing pagetable pages]
[akpm@linux-foundation.org: use pmd_page_vaddr()]
[akpm@linux-foundation.org: fix used-uninitialised bug]
Signed-off-by: Yasuaki Ishimatsu <isimatu.yasuaki@jp.fujitsu.com>
Signed-off-by: Jianguo Wu <wujianguo@huawei.com>
Signed-off-by: Wen Congyang <wency@cn.fujitsu.com>
Signed-off-by: Tang Chen <tangchen@cn.fujitsu.com>
Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Cc: Jiang Liu <jiang.liu@huawei.com>
Cc: Kamezawa Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Lai Jiangshan <laijs@cn.fujitsu.com>
Cc: Wu Jianguo <wujianguo@huawei.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|