path: root/drivers/xen
Age | Commit message | Author | Files | Lines
2012-01-25 | xen/xenbus: Reject replies with payload > XENSTORE_PAYLOAD_MAX. | Ian Campbell | 1 | -0/+6
commit 9e7860cee18241633eddb36a4c34c7b61d8cecbc upstream.

Haogang Chen found out that:

There is a potential integer overflow in process_msg() that could result in cross-domain attack.

    body = kmalloc(msg->hdr.len + 1, GFP_NOIO | __GFP_HIGH);

When a malicious guest passes 0xffffffff in msg->hdr.len, the subsequent call to xb_read() would write to a zero-length buffer.

The other end of this connection is always the xenstore backend daemon so there is no guest (malicious or otherwise) which can do this. The xenstore daemon is a trusted component in the system.

However this seems like a reasonable robustness improvement so we should have it.

And Ian, when reading the API docs, found that:

The payload length (len field of the header) is limited to 4096 (XENSTORE_PAYLOAD_MAX) in both directions. If a client exceeds the limit, its xenstored connection will be immediately killed by xenstored, which is usually catastrophic from the client's point of view. Clients (particularly domains, which cannot just reconnect) should avoid this.

so this patch checks against that instead. This also avoids a potential integer overflow pointed out by Haogang Chen.

Signed-off-by: Ian Campbell <ian.campbell@citrix.com>
Cc: Haogang Chen <haogangchen@gmail.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
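The length check described in this commit can be sketched in user-space C. This is a minimal model and not the kernel patch itself: `struct msg_hdr`, `alloc_reply_body()` and the use of `malloc` in place of `kmalloc` are assumptions made for illustration. Only the 4096-byte `XENSTORE_PAYLOAD_MAX` limit and the `len + 1` allocation come from the commit message.

```c
#include <stdint.h>
#include <stdlib.h>

/* Limit quoted in the commit message. */
#define XENSTORE_PAYLOAD_MAX 4096

/* Hypothetical stand-in for the xenstore message header. */
struct msg_hdr {
    uint32_t len; /* payload length claimed by the sender */
};

/*
 * Allocate a buffer for the reply body plus a terminating NUL.
 * Rejecting len > XENSTORE_PAYLOAD_MAX first means that len + 1
 * cannot wrap around to 0 when len is 0xffffffff.
 * Returns NULL if the length is out of range or allocation fails.
 */
char *alloc_reply_body(const struct msg_hdr *hdr)
{
    if (hdr->len > XENSTORE_PAYLOAD_MAX)
        return NULL;

    char *body = malloc((size_t)hdr->len + 1);
    if (body)
        body[hdr->len] = '\0';
    return body;
}
```

Checking against the protocol limit rather than against the integer maximum covers both concerns at once: oversized replies are rejected, and the addition is only ever performed on small values.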
2011-12-19 | Revert "xen/pv-on-hvm kexec: add xs_reset_watches to shutdown watches from old kernel" | Konrad Rzeszutek Wilk | 1 | -13/+0
This reverts commit ddacf5ef684a655abe2bb50c4b2a5b72ae0d5e05, because booting the kernel under Amazon EC2 as an HVM guest ends up hanging during startup. By reverting this we lose the fix for kexec booting to the crash kernels. Fixes Canonical BZ #901305 (http://bugs.launchpad.net/bugs/901305) Tested-by: Alessandro Salvatori <sandr8@gmail.com> Reported-by: Stefan Bader <stefan.bader@canonical.com> Acked-by: Ian Campbell <Ian.Campbell@citrix.com> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
2011-12-15 | xen/swiotlb: Use page alignment for early buffer allocation. | Konrad Rzeszutek Wilk | 1 | -2/+2
This fixes an odd bug found on a Dell PowerEdge 1850/0RC130 (BIOS A05 01/09/2006) where all of the modules doing pci_set_dma_mask would fail with:

ata_piix 0000:00:1f.1: enabling device (0005 -> 0007)
ata_piix 0000:00:1f.1: can't derive routing for PCI INT A
ata_piix 0000:00:1f.1: BMDMA: failed to set dma mask, falling back to PIO

The issue was that the Xen-SWIOTLB was allocated such that the end of the buffer was straddling a page (and also above 4GB). The fix, spotted by Kalev Leonid, was to piggyback on git commit e79f86b2ef9c0a8c47225217c1018b7d3d90101c "swiotlb: Use page alignment for early buffer allocation", which says:

We could call free_bootmem_late() if swiotlb is not used, and it will shrink to page alignment. So alloc them with page alignment at first, to avoid lose two pages

And doing that fixes the outstanding issue.

CC: stable@kernel.org
Suggested-by: "Kalev, Leonid" <Leonid.Kalev@ca.com>
Reported-and-Tested-by: "Taylor, Neal E" <Neal.Taylor@ca.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
2011-11-16 | xen-gntalloc: signedness bug in add_grefs() | Dan Carpenter | 1 | -1/+1
gref->gref_id is unsigned so the error handling didn't work. gnttab_grant_foreign_access() returns an int type, so we can add a cast here, and it doesn't cause any problems. gnttab_grant_foreign_access() can return a variety of errors including -ENOSPC, -ENOSYS and -ENOMEM. CC: stable@kernel.org Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
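The signedness problem can be illustrated with a small user-space model. Everything here is a stand-in: `fake_grant_access()`, `struct gref_model`, `add_gref()` and the error value 28 are hypothetical names and numbers chosen for the sketch, not the kernel's. The point it shows is the one from the commit message: a negative return value stored in an unsigned field only looks like an error again after a cast back to `int`.

```c
#include <stdint.h>

/* Hypothetical error number used only in this sketch. */
#define FAKE_ENOSPC 28

/* Stand-in for a call that returns a reference (>= 0) or a negative error. */
static int fake_grant_access(int fail)
{
    return fail ? -FAKE_ENOSPC : 7;
}

struct gref_model {
    uint32_t gref_id; /* unsigned, as described in the commit message */
};

/*
 * Store the result in the unsigned field, then test it through a cast.
 * Without the cast, "g->gref_id < 0" is always false for an unsigned
 * type, so the error path would never run.
 * Note: converting a large unsigned value to int is implementation-defined
 * in ISO C; mainstream compilers wrap it back to the negative value.
 */
int add_gref(struct gref_model *g, int fail)
{
    g->gref_id = (uint32_t)fake_grant_access(fail);
    if ((int)g->gref_id < 0)
        return (int)g->gref_id;
    return 0;
}
```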
2011-11-16 | xen-gntalloc: integer overflow in gntalloc_ioctl_alloc() | Dan Carpenter | 1 | -1/+1
On 32 bit systems a high value of op.count could lead to an integer overflow in the kzalloc() and gref_ids would be smaller than expected. If you then triggered another integer overflow in "if (gref_size + op.count > limit)" then you'd probably get memory corruption inside add_grefs(). CC: stable@kernel.org Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
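The second overflow mentioned here, in a check of the form `in_use + count > limit`, can be sketched as follows. This is a generic illustration of the technique and not necessarily what the one-line patch changed; the function names and the fixed 32-bit types are assumptions.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Returns true if adding `count` to `in_use` would exceed `limit`.
 * The subtraction is only done once in_use <= limit is known, so no
 * intermediate value can wrap around.
 */
bool would_exceed_limit(uint32_t in_use, uint32_t count, uint32_t limit)
{
    if (in_use > limit)
        return true;
    return count > limit - in_use;
}

/*
 * Naive form for comparison: the 32-bit sum wraps modulo 2^32, so a
 * huge count can produce a small sum that passes the check.
 */
bool would_exceed_limit_naive(uint32_t in_use, uint32_t count, uint32_t limit)
{
    return (uint32_t)(in_use + count) > limit;
}
```

With `in_use = 10`, `count = 0xfffffffa` and `limit = 1024`, the naive sum wraps to 4 and the request is wrongly accepted, while the rearranged comparison rejects it.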
2011-11-16 | xen-gntdev: integer overflow in gntdev_alloc_map() | Dan Carpenter | 1 | -5/+5
The multiplications here can overflow resulting in smaller buffer sizes than expected. "count" comes from a copy_from_user(). Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
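A multiplication that sizes a buffer from a user-supplied count can be guarded as in the sketch below. It is a user-space illustration under assumptions: `alloc_array()` is a hypothetical helper and `malloc` stands in for the kernel allocator; the commit message does not say which form the actual fix took.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/*
 * Allocate room for `count` elements of `elem_size` bytes each.
 * The division-based test rejects any pair whose product would not
 * fit in size_t, so the buffer is never smaller than the caller expects.
 */
void *alloc_array(size_t count, size_t elem_size)
{
    if (elem_size != 0 && count > SIZE_MAX / elem_size)
        return NULL;
    return malloc(count * elem_size);
}
```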
2011-11-16 | xen/balloon: Avoid OOM when requesting highmem | Daniel De Graaf | 1 | -2/+2
If highmem pages are requested from the balloon on a system without highmem, the implementation of alloc_xenballooned_pages will allocate all available memory trying to find highmem pages to return. Allow low memory to be returned when highmem pages are requested to avoid this loop. Signed-off-by: Daniel De Graaf <dgdegra@tycho.nsa.gov> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
2011-11-16 | xen: map foreign pages for shared rings by updating the PTEs directly | David Vrabel | 1 | -3/+8
When mapping a foreign page with xenbus_map_ring_valloc() with the GNTTABOP_map_grant_ref hypercall, set the GNTMAP_contains_pte flag and pass a pointer to the PTE (in init_mm). After the page is mapped, the usual fault mechanism can be used to update additional MMs. This allows the vmalloc_sync_all() to be removed from alloc_vm_area(). Signed-off-by: David Vrabel <david.vrabel@citrix.com> Acked-by: Andrew Morton <akpm@linux-foundation.org> [v1: Squashed fix by Michal for no-mmu case] Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Signed-off-by: Michal Simek <monstr@monstr.eu>
2011-11-06 | Merge branch 'stable/cleanups-3.2' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen | Linus Torvalds | 8 | -31/+25
* 'stable/cleanups-3.2' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen:
  xen: use static initializers in xen-balloon.c
  Xen: fix braces and tabs coding style issue in xenbus_probe.c
  Xen: fix braces coding style issue in xenbus_probe.h
  Xen: fix whitespaces,tabs coding style issue in drivers/xen/pci.c
  Xen: fix braces coding style issue in gntdev.c and grant-table.c
  Xen: fix whitespaces,tabs coding style issue in drivers/xen/events.c
  Xen: fix whitespaces,tabs coding style issue in drivers/xen/balloon.c

Fix up trivial whitespace-conflicts in drivers/xen/{balloon.c,pci.c,xenbus/xenbus_probe.c}
2011-11-06 | Merge branch 'modsplit-Oct31_2011' of git://git.kernel.org/pub/scm/linux/kernel/git/paulg/linux | Linus Torvalds | 8 | -0/+8
* 'modsplit-Oct31_2011' of git://git.kernel.org/pub/scm/linux/kernel/git/paulg/linux: (230 commits)
  Revert "tracing: Include module.h in define_trace.h"
  irq: don't put module.h into irq.h for tracking irqgen modules.
  bluetooth: macroize two small inlines to avoid module.h
  ip_vs.h: fix implicit use of module_get/module_put from module.h
  nf_conntrack.h: fix up fallout from implicit moduleparam.h presence
  include: replace linux/module.h with "struct module" wherever possible
  include: convert various register fcns to macros to avoid include chaining
  crypto.h: remove unused crypto_tfm_alg_modname() inline
  uwb.h: fix implicit use of asm/page.h for PAGE_SIZE
  pm_runtime.h: explicitly requires notifier.h
  linux/dmaengine.h: fix implicit use of bitmap.h and asm/page.h
  miscdevice.h: fix up implicit use of lists and types
  stop_machine.h: fix implicit use of smp.h for smp_processor_id
  of: fix implicit use of errno.h in include/linux/of.h
  of_platform.h: delete needless include <linux/module.h>
  acpi: remove module.h include from platform/aclinux.h
  miscdevice.h: delete unnecessary inclusion of module.h
  device_cgroup.h: delete needless include <linux/module.h>
  net: sch_generic remove redundant use of <linux/module.h>
  net: inet_timewait_sock doesnt need <linux/module.h>
  ...

Fix up trivial conflicts (other header files, and removal of the ab3550 mfd driver) in
 - drivers/media/dvb/frontends/dibx000_common.c
 - drivers/media/video/{mt9m111.c,ov6650.c}
 - drivers/mfd/ab3550-core.c
 - include/linux/dmaengine.h
2011-11-06 | Merge branch 'stable/vmalloc-3.2' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen | Linus Torvalds | 1 | -3/+3
* 'stable/vmalloc-3.2' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen:
  net: xen-netback: use API provided by xenbus module to map rings
  block: xen-blkback: use API provided by xenbus module to map rings
  xen: use generic functions instead of xen_{alloc, free}_vm_area()
2011-10-31 | xen: Add export.h for THIS_MODULE/EXPORT_SYMBOL to various xen users. | Paul Gortmaker | 4 | -0/+4
Things like THIS_MODULE and EXPORT_SYMBOL were simply everywhere because module.h was also everywhere. But we are fixing the latter. So we need to call out the real users in advance. Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
2011-10-31 | xen: Add module.h to modular drivers/xen users. | Paul Gortmaker | 4 | -0/+4
Previously these drivers just got module.h implicitly, but we are cleaning that up and it will be no longer. Call out the real users of it. Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
2011-10-26 | Merge branch 'irq-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip | Linus Torvalds | 1 | -1/+1
* 'irq-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  genirq: Add IRQF_RESUME_EARLY and resume such IRQs earlier
  genirq: Fix fatfinered fixup really
  genirq: percpu: allow interrupt type to be set at enable time
  genirq: Add support for per-cpu dev_id interrupts
  genirq: Add IRQCHIP_SKIP_SET_WAKE flag
2011-10-25 | Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next | Linus Torvalds | 1 | -0/+2
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next: (1745 commits)
  dp83640: free packet queues on remove
  dp83640: use proper function to free transmit time stamping packets
  ipv6: Do not use routes from locally generated RAs
  |PATCH net-next] tg3: add tx_dropped counter
  be2net: don't create multiple RX/TX rings in multi channel mode
  be2net: don't create multiple TXQs in BE2
  be2net: refactor VF setup/teardown code into be_vf_setup/clear()
  be2net: add vlan/rx-mode/flow-control config to be_setup()
  net_sched: cls_flow: use skb_header_pointer()
  ipv4: avoid useless call of the function check_peer_pmtu
  TCP: remove TCP_DEBUG
  net: Fix driver name for mdio-gpio.c
  ipv4: tcp: fix TOS value in ACK messages sent from TIME_WAIT
  rtnetlink: Add missing manual netlink notification in dev_change_net_namespaces
  ipv4: fix ipsec forward performance regression
  jme: fix irq storm after suspend/resume
  route: fix ICMP redirect validation
  net: hold sock reference while processing tx timestamps
  tcp: md5: add more const attributes
  Add ethtool -g support to virtio_net
  ...

Fix up conflicts in:
 - drivers/net/Kconfig: The split-up generated a trivial conflict with removal of a stale reference to Documentation/networking/net-modules.txt. Remove it from the new location instead.
 - fs/sysfs/dir.c: Fairly nasty conflicts with the sysfs rb-tree usage, conflicting with Eric Biederman's changes for tagged directories.
2011-10-25 | Merge branches 'stable/drivers-3.2', 'stable/drivers.bugfixes-3.2' and 'stable/pci.fixes-3.2' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen | Linus Torvalds | 18 | -195/+447
* 'stable/drivers-3.2' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen:
  xenbus: don't rely on xen_initial_domain to detect local xenstore
  xenbus: Fix loopback event channel assuming domain 0
  xen/pv-on-hvm:kexec: Fix implicit declaration of function 'xen_hvm_domain'
  xen/pv-on-hvm kexec: add xs_reset_watches to shutdown watches from old kernel
  xen/pv-on-hvm kexec: update xs_wire.h:xsd_sockmsg_type from xen-unstable
  xen/pv-on-hvm kexec+kdump: reset PV devices in kexec or crash kernel
  xen/pv-on-hvm kexec: rebind virqs to existing eventchannel ports
  xen/pv-on-hvm kexec: prevent crash in xenwatch_thread() when stale watch events arrive

* 'stable/drivers.bugfixes-3.2' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen:
  xen/pciback: Check if the device is found instead of blindly assuming so.
  xen/pciback: Do not dereference psdev during printk when it is NULL.
  xen: remove XEN_PLATFORM_PCI config option
  xen: XEN_PVHVM depends on PCI
  xen/pciback: double lock typo
  xen/pciback: use mutex rather than spinlock in vpci backend
  xen/pciback: Use mutexes when working with Xenbus state transitions.
  xen/pciback: miscellaneous adjustments
  xen/pciback: use mutex rather than spinlock in passthrough backend
  xen/pciback: use resource_size()

* 'stable/pci.fixes-3.2' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen:
  xen/pci: support multi-segment systems
  xen-swiotlb: When doing coherent alloc/dealloc check before swizzling the MFNs.
  xen/pci: make bus notifier handler return sane values
  xen-swiotlb: fix printk and panic args
  xen-swiotlb: Fix wrong panic.
  xen-swiotlb: Retry up three times to allocate Xen-SWIOTLB
  xen-pcifront: Update warning comment to use 'e820_host' option.
2011-10-25 | Merge branches 'stable/bug.fixes-3.2' and 'stable/mmu.fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen | Linus Torvalds | 6 | -22/+114
* 'stable/bug.fixes-3.2' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen:
  xen/p2m/debugfs: Make type_name more obvious.
  xen/p2m/debugfs: Fix potential pointer exception.
  xen/enlighten: Fix compile warnings and set cx to known value.
  xen/xenbus: Remove the unnecessary check.
  xen/irq: If we fail during msi_capability_init return proper error code.
  xen/events: Don't check the info for NULL as it is already done.
  xen/events: BUG() when we can't allocate our event->irq array.

* 'stable/mmu.fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen:
  xen: Fix selfballooning and ensure it doesn't go too far
  xen/gntdev: Fix sleep-inside-spinlock
  xen: modify kernel mappings corresponding to granted pages
  xen: add an "highmem" parameter to alloc_xenballooned_pages
  xen/p2m: Use SetPagePrivate and its friends for M2P overrides.
  xen/p2m: Make debug/xen/mmu/p2m visible again.
  Revert "xen/debug: WARN_ON when identity PFN has no _PAGE_IOMAP flag set."
2011-10-19 | xen/xenbus: Remove the unnecessary check. | Konrad Rzeszutek Wilk | 1 | -2/+0
.. we check whether 'xdev' is NULL - but there is no need for it as the 'dev' check is done before. The 'dev' is embedded in the 'xdev' so having xdev != NULL with dev being checked is not going to happen. Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
2011-10-19 | xen/irq: If we fail during msi_capability_init return proper error code. | Konrad Rzeszutek Wilk | 1 | -3/+4
There are three different modes: PV, HVM, and initial domain 0. In all the cases we would return -1 for failure instead of a proper error code. Fix this by propagating the error code from the generic IRQ code. Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
2011-10-19 | xen/events: Don't check the info for NULL as it is already done. | Konrad Rzeszutek Wilk | 1 | -1/+1
The list operation checks whether the 'info' structure that is retrieved from the list is NULL (otherwise it would not have been able to retrieve it). This check is not necessary. Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
2011-10-19 | xen/events: BUG() when we can't allocate our event->irq array. | Konrad Rzeszutek Wilk | 1 | -0/+1
In case we can't allocate we are doomed. We should BUG_ON instead of trying to dereference it later on. Acked-by: Ian Campbell <ian.campbell@citrix.com> [v1: Use BUG_ON instead of BUG] Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
2011-10-19 | xen/pciback: Check if the device is found instead of blindly assuming so. | Konrad Rzeszutek Wilk | 1 | -0/+2
Just in case it is not found, don't try to dereference it. [v1: Added WARN_ON, suggested by Jan Beulich <JBeulich@suse.com>] Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
2011-10-19 | xen/pciback: Do not dereference psdev during printk when it is NULL. | Konrad Rzeszutek Wilk | 1 | -6/+1
.. instead use BUG_ON() as all the callers of the kill_domain_by_device check for psdev. Suggested-by: Jan Beulich <JBeulich@suse.com> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
2011-10-17 | genirq: Add IRQF_RESUME_EARLY and resume such IRQs earlier | Ian Campbell | 1 | -1/+1
This adds a mechanism to resume selected IRQs during syscore_resume instead of dpm_resume_noirq. Under Xen we need to resume IRQs associated with IPIs early enough that the resched IPI is unmasked and we can therefore schedule ourselves out of the stop_machine where the suspend/resume takes place. This issue was introduced by 676dc3cf5bc3 "xen: Use IRQF_FORCE_RESUME". Signed-off-by: Ian Campbell <ian.campbell@citrix.com> Cc: Rafael J. Wysocki <rjw@sisk.pl> Cc: Jeremy Fitzhardinge <Jeremy.Fitzhardinge@citrix.com> Cc: xen-devel <xen-devel@lists.xensource.com> Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Link: http://lkml.kernel.org/r/1318713254.11016.52.camel@dagon.hellion.org.uk Cc: stable@kernel.org (at least to 2.6.32.y) Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2011-10-14 | xen: Fix selfballooning and ensure it doesn't go too far | Dan Magenheimer | 1 | -4/+63
The balloon driver's "current_pages" is very different from totalram_pages. Self-ballooning needs to be driven by the latter. Also, Committed_AS doesn't account for pages used by the kernel so:
1) Add totalreserve_pages to Committed_AS for the normal target.
2) Enforce a floor for when there are little or no user-space threads using memory (e.g. single-user mode) to avoid OOMs.
The floor function includes a "min_usable_mb" tuneable in case we discover later that the floor function is still too aggressive in some workloads, though likely it will not be needed.

Changes since version 4:
- change floor calculation so that it is not as aggressive; this version uses a piecewise linear function similar to minimum_target in the 2.6.18 balloon driver, but modified to add to totalreserve_pages instead of subtract from max_pfn, the 2.6.18 version causes OOMs on recent kernels because the kernel has expanded over time
- change safety_margin to min_usable_mb and comment on its use
- since committed_as does NOT include kernel space (and other reserved pages), totalreserve_pages is now added to committed_as. The result is less aggressive self-ballooning, but theoretically more appropriate.

Changes since version 3:
- missing include causes compile problem when CONFIG_FRONTSWAP is disabled
- add comments after includes

Changes since version 2:
- missing include causes compile problem only on 32-bit

Changes since version 1:
- tuneable safety margin added

[v5: avi.miller@oracle.com: still too aggressive, seeing some OOMs]
[v4: konrad.wilk@oracle.com: fix compile when CONFIG_FRONTSWAP is disabled]
[v3: guru.anbalagane@oracle.com: fix 32-bit compile]
[v2: konrad.wilk@oracle.com: make safety margin tuneable]
Signed-off-by: Dan Magenheimer <dan.magenheimer@oracle.com>
[v1: Altered description and added an extra include]
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
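The two rules in this description (add totalreserve_pages to Committed_AS, then enforce a floor) combine into a simple target calculation. The sketch below only shows that combination. `selfballoon_goal()` is a hypothetical name, all quantities are assumed to be in pages, and the floor is passed in as a parameter because the piecewise linear floor function itself is not spelled out in the commit message.

```c
/*
 * Compute a self-ballooning goal in pages.
 *   committed_as       - pages committed by user space
 *   totalreserve_pages - pages reserved for the kernel
 *   floor_pages        - lower bound computed elsewhere (assumed input)
 * The normal target is committed_as + totalreserve_pages; the floor
 * wins whenever the normal target would drop below it.
 */
unsigned long selfballoon_goal(unsigned long committed_as,
                               unsigned long totalreserve_pages,
                               unsigned long floor_pages)
{
    unsigned long goal = committed_as + totalreserve_pages;

    return goal < floor_pages ? floor_pages : goal;
}
```

In a nearly idle system (small `committed_as`) the floor dominates, which is the single-user-mode case the message calls out.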
2011-10-14 | xen/gntdev: Fix sleep-inside-spinlock | Daniel De Graaf | 1 | -3/+2
BUG: sleeping function called from invalid context at /local/scratch/dariof/linux/kernel/mutex.c:271
in_atomic(): 1, irqs_disabled(): 0, pid: 3256, name: qemu-dm
1 lock held by qemu-dm/3256:
 #0: (&(&priv->lock)->rlock){......}, at: [<ffffffff813223da>] gntdev_ioctl+0x2bd/0x4d5
Pid: 3256, comm: qemu-dm Tainted: G W 3.1.0-rc8+ #5
Call Trace:
 [<ffffffff81054594>] __might_sleep+0x131/0x135
 [<ffffffff816bd64f>] mutex_lock_nested+0x25/0x45
 [<ffffffff8131c7c8>] free_xenballooned_pages+0x20/0xb1
 [<ffffffff8132194d>] gntdev_put_map+0xa8/0xdb
 [<ffffffff816be546>] ? _raw_spin_lock+0x71/0x7a
 [<ffffffff813223da>] ? gntdev_ioctl+0x2bd/0x4d5
 [<ffffffff8132243c>] gntdev_ioctl+0x31f/0x4d5
 [<ffffffff81007d62>] ? check_events+0x12/0x20
 [<ffffffff811433bc>] do_vfs_ioctl+0x488/0x4d7
 [<ffffffff81007d4f>] ? xen_restore_fl_direct_reloc+0x4/0x4
 [<ffffffff8109168b>] ? lock_release+0x21c/0x229
 [<ffffffff81135cdd>] ? rcu_read_unlock+0x21/0x32
 [<ffffffff81143452>] sys_ioctl+0x47/0x6a
 [<ffffffff816bfd82>] system_call_fastpath+0x16/0x1b

gntdev_put_map tries to acquire a mutex when freeing pages back to the xenballoon pool, so it cannot be called with a spinlock held. In gntdev_release, the spinlock is not needed as we are freeing the structure later; in the ioctl, only the list manipulation needs to be under the lock.

Reported-and-Tested-By: Dario Faggioli <dario.faggioli@citrix.com>
Signed-off-by: Daniel De Graaf <dgdegra@tycho.nsa.gov>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
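The locking rule stated here (only the list manipulation is done under the lock; the call that may sleep happens afterwards) can be modelled in plain C. This is a single-threaded sketch under assumptions: the "lock" is just a flag, and `struct map_model`, `struct priv_model`, `put_map()` and `remove_first_map()` are hypothetical names, not the gntdev code.

```c
#include <stdbool.h>
#include <stddef.h>

struct map_model {
    struct map_model *next;
    bool freed;
};

struct priv_model {
    bool lock_held;         /* stands in for the spinlock */
    struct map_model *maps; /* head of a singly linked list */
    int sleep_under_lock;   /* counts "sleeping while atomic" violations */
};

static void model_lock(struct priv_model *p)   { p->lock_held = true; }
static void model_unlock(struct priv_model *p) { p->lock_held = false; }

/* Stands in for a function that may sleep (it takes a mutex internally). */
static void put_map(struct priv_model *p, struct map_model *m)
{
    if (p->lock_held)
        p->sleep_under_lock++;
    m->freed = true;
}

/* Unlink the first map under the lock, then release it after unlocking. */
void remove_first_map(struct priv_model *p)
{
    struct map_model *m;

    model_lock(p);
    m = p->maps;
    if (m)
        p->maps = m->next;
    model_unlock(p);

    if (m)
        put_map(p, m); /* safe: the lock is no longer held */
}
```

Once the entry is unlinked, no other list user can reach it, which is why the potentially sleeping release does not need the lock.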
2011-10-14 | xenbus: don't rely on xen_initial_domain to detect local xenstore | Daniel De Graaf | 1 | -48/+53
The xenstore daemon does not have to run in the xen initial domain; however, Linux currently uses xen_initial_domain to test if a loopback event channel should be used instead of the event channel provided in Xen's start_info structure. Instead, if the event channel passed in the start_info structure is not valid, assume that this domain will run xenstored locally and set up the event channel. Signed-off-by: Daniel De Graaf <dgdegra@tycho.nsa.gov> Reviewed-by: Ian Campbell <Ian.Campbell@citrix.com> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
2011-10-14 | xenbus: Fix loopback event channel assuming domain 0 | Daniel De Graaf | 1 | -1/+1
The xenbus event channel established in xenbus_init is intended to be a loopback channel, but the remote domain was hardcoded to 0; this will cause the channel to be unusable when xenstore is not being run in domain 0. Signed-off-by: Daniel De Graaf <dgdegra@tycho.nsa.gov> Reviewed-by: Ian Campbell <Ian.Campbell@citrix.com> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
2011-09-29 | xen: use generic functions instead of xen_{alloc, free}_vm_area() | David Vrabel | 1 | -3/+3
Replace calls to the Xen-specific xen_alloc_vm_area() and xen_free_vm_area() functions with the generic equivalent (alloc_vm_area() and free_vm_area()). On x86, these were identical already. Signed-off-by: David Vrabel <david.vrabel@citrix.com> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
2011-09-29 | xen: allow balloon driver to use more than one memory region | David Vrabel | 1 | -17/+27
Allow the xen balloon driver to populate its list of extra pages from more than one region of memory. This will allow platforms to provide (for example) a region of low memory and a region of high memory. The maximum possible number of extra regions is 128 (== E820MAX) which is quite large so xen_extra_mem is placed in __initdata. This is safe as both xen_memory_setup() and balloon_init() are in __init. The balloon regions themselves are not altered (i.e., there is still only the one region). Signed-off-by: David Vrabel <david.vrabel@citrix.com> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
2011-09-29 | xen/balloon: simplify test for the end of usable RAM | David Vrabel | 1 | -9/+9
When initializing the balloon only max_pfn needs to be checked (max_pfn will always be <= e820_end_of_ram_pfn()) and improve the confusing comment. Signed-off-by: David Vrabel <david.vrabel@citrix.com> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
2011-09-29 | xen/balloon: account for pages released during memory setup | David Vrabel | 1 | -1/+3
In xen_memory_setup() pages that occur in gaps in the memory map are released back to Xen. This reduces the domain's current page count in the hypervisor. The Xen balloon driver does not correctly decrease its initial current_pages count to reflect this. If 'delta' pages are released and the target is adjusted, the resulting reservation is always 'delta' less than the requested target. This affects dom0 if the initial allocation of pages overlaps the PCI memory region but won't affect most domU guests that have been set up with pseudo-physical memory maps that don't have gaps. Fix this by accounting for the released pages when starting the balloon driver. If the domain's targets are managed by xapi, the domain may eventually run out of memory and die because xapi currently gets its target calculations wrong and whenever it is restarted it always reduces the target by 'delta'. Signed-off-by: David Vrabel <david.vrabel@citrix.com> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
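The "always 'delta' less than the requested target" effect follows from simple arithmetic, sketched below. The model is an assumption made for illustration: it supposes the driver changes the real reservation by the difference between the target and its own tracked page count. `reservation_after_adjust()` is a hypothetical name.

```c
/*
 * Model of one balloon adjustment, all values in pages.
 *   actual_pages  - what the hypervisor really has reserved for the domain
 *   tracked_pages - what the driver believes (its current_pages)
 *   target        - requested target
 * The driver balloons by (target - tracked_pages), so any error in
 * tracked_pages carries straight through to the final reservation.
 */
long reservation_after_adjust(long actual_pages, long tracked_pages,
                              long target)
{
    return actual_pages + (target - tracked_pages);
}
```

For example, with 1000 initial pages and 50 released during setup, the real reservation is 950. If the driver still tracks 1000, a target of 800 ends at 750, which is 50 short. If it starts from 950, the same target ends at exactly 800.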
2011-09-29 | xen: remove XEN_PLATFORM_PCI config option | Stefano Stabellini | 2 | -13/+1
Xen PVHVM needs xen-platform-pci, on the other hand xen-platform-pci is useless in any other cases. Therefore remove the XEN_PLATFORM_PCI config option and compile xen-platform-pci built-in if XEN_PVHVM is selected. Changes to v1: - remove xen-platform-pci.o and just use platform-pci.o since it is not externally visible anymore. Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
2011-09-29 | xen/pciback: double lock typo | Dan Carpenter | 1 | -1/+1
We called mutex_lock() twice instead of unlocking. Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
2011-09-29 | xen: modify kernel mappings corresponding to granted pages | Stefano Stabellini | 2 | -4/+34
If we want to use granted pages for AIO, changing the mappings of a user vma and the corresponding p2m is not enough; we also need to update the kernel mappings accordingly. Currently this is only needed for pages that are created for user usages through /dev/xen/gntdev. As in, pages that have been in use by the kernel and use the P2M will not need this special mapping. However there are no guarantees that in the future the kernel won't start accessing pages through the 1:1 even for internal usage. In order to avoid the complexity of dealing with highmem, we allocate the pages in lowmem. We issue a HYPERVISOR_grant_table_op right away in m2p_add_override and we remove the mappings using another HYPERVISOR_grant_table_op in m2p_remove_override. Considering that m2p_add_override and m2p_remove_override are called once per page we use multicalls and hypercall batching. Use the kmap_op pointer directly as argument to do the mapping as it is guaranteed to be present up until the unmapping is done. Before issuing any unmapping multicalls, we need to make sure that the mapping has already been done, because we need the kmap->handle to be set correctly. Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com> [v1: Removed GRANT_FRAME_BIT usage] Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
2011-09-29 | xen: add an "highmem" parameter to alloc_xenballooned_pages | Stefano Stabellini | 2 | -5/+9
Add an highmem parameter to alloc_xenballooned_pages, to allow callers to request lowmem or highmem pages. Fix the code style of free_xenballooned_pages' prototype. Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
2011-09-27 | xen/pciback: Add flag indicating device has been assigned by Xen | Konrad Rzeszutek Wilk | 1 | -0/+2
Device drivers that create and destroy SR-IOV virtual functions via calls to pci_enable_sriov() and pci_disable_sriov() can cause catastrophic failures if they attempt to destroy VFs while they are assigned to guest virtual machines. By adding a flag for use by the Xen PCI back to indicate that a device is assigned, a device driver can check that flag and avoid destroying VFs while they are assigned, and so avoid system failures. Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2011-09-26 | xen/pv-on-hvm:kexec: Fix implicit declaration of function 'xen_hvm_domain' | Konrad Rzeszutek Wilk | 1 | -0/+1
Randy found a compile error when using make randconfig:

drivers/xen/xenbus/xenbus_xs.c:909:2: error: implicit declaration of function 'xen_hvm_domain'

It is unclear which of the CONFIG options triggered this. This patch fixes the error. Reported-by: Randy Dunlap <rdunlap@xenotime.net> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
2011-09-22 | xen/pv-on-hvm kexec: add xs_reset_watches to shutdown watches from old kernel | Olaf Hering | 1 | -0/+13
Add new xs_reset_watches function to shutdown watches from old kernel after kexec boot. The old kernel does not unregister all watches in the shutdown path. They are still active, the double registration can not be detected by the new kernel. When the watches fire, unexpected events will arrive and the xenwatch thread will crash (jumps to NULL). An orderly reboot of a hvm guest will destroy the entire guest with all its resources (including the watches) before it is rebuilt from scratch, so the missing unregister is not an issue in that case. With this change the xenstored is instructed to wipe all active watches for the guest. However, a patch for xenstored is required so that it accepts the XS_RESET_WATCHES request from a client (see changeset 23839:42a45baf037d in xen-unstable.hg). Without the patch for xenstored the registration of watches will fail and some features of a PVonHVM guest are not available. The guest is still able to boot, but repeated kexec boots will fail. [v5: use xs_single instead of passing a dummy string to xs_talkv] [v4: ignore -EEXIST in xs_reset_watches] [v3: use XS_RESET_WATCHES instead of XS_INTRODUCE] [v2: move all code which deals with XS_INTRODUCE into xs_introduce() (based on feedback from Ian Campbell); remove casts from kvec assignment] Signed-off-by: Olaf Hering <olaf@aepfle.de> [v1: Redid the git description a bit] Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
2011-09-22 | xen/pci: support multi-segment systems | Jan Beulich | 1 | -10/+84
Now that the hypercall interface changes are in -unstable, make the kernel side code not ignore the segment (aka domain) number anymore (which results in pretty odd behavior on such systems). Rather, if only the old interfaces are available, don't call them for devices on non-zero segments at all. Signed-off-by: Jan Beulich <jbeulich@suse.com> [v1: Edited git description] Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
2011-09-21 | xen/pciback: use mutex rather than spinlock in vpci backend | Konrad Rzeszutek Wilk | 1 | -15/+11
Similar to the "xen/pciback: use mutex rather than spinlock in passthrough backend" this patch converts the vpci backend to use a mutex instead of a spinlock. Note that the code taking the lock won't ever get called from non-sleepable context. Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
2011-09-21 | xen/pciback: Use mutexes when working with Xenbus state transitions. | Konrad Rzeszutek Wilk | 2 | -14/+10
The caller that orchestrates the state changes is xenwatch_thread and it takes a mutex. In our processing of Xenbus states we can take the luxury of going to sleep on a mutex, so let's do that and also fix this bug:

BUG: sleeping function called from invalid context at /linux/kernel/mutex.c:271
in_atomic(): 1, irqs_disabled(): 0, pid: 32, name: xenwatch
2 locks held by xenwatch/32:
 #0: (xenwatch_mutex){......}, at: [<ffffffff813856ab>] xenwatch_thread+0x4b/0x180
 #1: (&(&pdev->dev_lock)->rlock){......}, at: [<ffffffff8138f05b>] xen_pcibk_disconnect+0x1b/0x80
Pid: 32, comm: xenwatch Not tainted 3.1.0-rc6-00015-g3ce340d #2
Call Trace:
 [<ffffffff810892b2>] __might_sleep+0x102/0x130
 [<ffffffff8163b90f>] mutex_lock_nested+0x2f/0x50
 [<ffffffff81382c1c>] unbind_from_irq+0x2c/0x1b0
 [<ffffffff8110da66>] ? free_irq+0x56/0xb0
 [<ffffffff81382dbc>] unbind_from_irqhandler+0x1c/0x30
 [<ffffffff8138f06b>] xen_pcibk_disconnect+0x2b/0x80
 [<ffffffff81390348>] xen_pcibk_frontend_changed+0xe8/0x140
 [<ffffffff81387ac2>] xenbus_otherend_changed+0xd2/0x150
 [<ffffffff810895c1>] ? get_parent_ip+0x11/0x50
 [<ffffffff81387de0>] frontend_changed+0x10/0x20
 [<ffffffff81385712>] xenwatch_thread+0xb2/0x180

Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
2011-09-21 | xen/pciback: miscellaneous adjustments | Jan Beulich | 9 | -44/+44
This is a minor bugfix and a set of small cleanups; as it is not clear whether this needs splitting into pieces (and if so, at what granularity), it is a single combined patch.
- add a missing return statement to an error path in kill_domain_by_device()
- use pci_is_enabled() rather than raw atomic_read()
- remove a bogus attempt to zero-terminate an already zero-terminated string
- #define DRV_NAME once uniformly in the shared local header
- make DRIVER_ATTR() variables static
- eliminate a pointless use of list_for_each_entry_safe()
- add MODULE_ALIAS()
- a little bit of constification
- adjust a few messages
- remove stray semicolons from inline function definitions

Signed-off-by: Jan Beulich <jbeulich@suse.com>
[v1: Dropped the resource_size fix, altered the description]
[v2: Fixed cleanpatch.pl comments]
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
2011-09-21  xen/pciback: use mutex rather than spinlock in passthrough backend  Jan Beulich  1  -19/+13
To accommodate the call to the callback function from __xen_pcibk_publish_pci_roots(), which so far dropped and then re-acquired the lock without checking that the list didn't actually change, convert the code to use a mutex instead (observing that the code taking the lock won't ever get called from non-sleepable context). As a result, drop the bogus use of list_for_each_entry_safe() and remove the inappropriate dropping of the lock.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
2011-09-21  xen/pciback: use resource_size()  Thomas Meyer  1  -1/+1
Use the resource_size function on the resource object instead of explicit computation. The semantic patch that makes this output is available in scripts/coccinelle/api/resource_size.cocci. More information about semantic patching is available at http://coccinelle.lip6.fr/

Signed-off-by: Thomas Meyer <thomas@m3y3r.de>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
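As a rough illustration of what this cleanup replaces, the sketch below shows the explicit computation that resource_size() wraps: the length of an inclusive [start, end] range. The struct and function names (demo_resource, demo_resource_size) are hypothetical user-space stand-ins, not the kernel's definitions.

```c
#include <stdint.h>

/* Hypothetical, simplified stand-in for the kernel's struct resource. */
struct demo_resource {
    uint64_t start; /* first address of the range */
    uint64_t end;   /* last address of the range, inclusive */
};

/*
 * Length of the inclusive range. The "+ 1" is the part that is easy to
 * forget when the computation is open-coded at each call site.
 */
static uint64_t demo_resource_size(const struct demo_resource *res)
{
    return res->end - res->start + 1;
}
```

For example, a range from 0x1000 to 0x1fff has a size of 0x1000 bytes, not 0xfff.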
2011-09-15  xen/irq: Alter the locking to use a mutex instead of a spinlock.  Konrad Rzeszutek Wilk  1  -20/+20
When we allocate or change the IRQ information, we do not need to use spinlocks. We can use a mutex (which is what the generic IRQ code does for allocations/changes). Fixes a slew of:

BUG: sleeping function called from invalid context at /linux/kernel/mutex.c:271
in_atomic(): 1, irqs_disabled(): 0, pid: 3216, name: xenstored
2 locks held by xenstored/3216:
 #0: (&u->bind_mutex){......}, at: [<ffffffffa02e0920>] evtchn_ioctl+0x30/0x3a0 [xen_evtchn]
 #1: (irq_mapping_update_lock){......}, at: [<ffffffff8138b274>] bind_evtchn_to_irq+0x24/0x90
Pid: 3216, comm: xenstored Not tainted 3.1.0-rc6-00021-g437a3d1 #2
Call Trace:
 [<ffffffff81088d10>] __might_sleep+0x100/0x130
 [<ffffffff81645c2f>] mutex_lock_nested+0x2f/0x50
 [<ffffffff81627529>] __irq_alloc_descs+0x49/0x200
 [<ffffffffa02e0920>] ? evtchn_ioctl+0x30/0x3a0 [xen_evtchn]
 [<ffffffff8138b214>] xen_allocate_irq_dynamic+0x34/0x70
 [<ffffffff8138b2ad>] bind_evtchn_to_irq+0x5d/0x90
 [<ffffffffa02e03c0>] ? evtchn_bind_to_user+0x60/0x60 [xen_evtchn]
 [<ffffffff8138c282>] bind_evtchn_to_irqhandler+0x32/0x80
 [<ffffffffa02e03a9>] evtchn_bind_to_user+0x49/0x60 [xen_evtchn]
 [<ffffffffa02e0a34>] evtchn_ioctl+0x144/0x3a0 [xen_evtchn]
 [<ffffffff811b4070>] ? vfsmount_lock_local_unlock+0x50/0x80
 [<ffffffff811a6a1a>] do_vfs_ioctl+0x9a/0x5e0
 [<ffffffff811b476f>] ? mntput+0x1f/0x30
 [<ffffffff81196259>] ? fput+0x199/0x240
 [<ffffffff811a7001>] sys_ioctl+0xa1/0xb0
 [<ffffffff8164ea82>] system_call_fastpath+0x16/0x1b

Reported-by: Jim Burns <jim_burn@bellsouth.net>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
2011-09-01  xen/pv-on-hvm kexec+kdump: reset PV devices in kexec or crash kernel  Olaf Hering  2  -1/+124
After triggering a crash dump in an HVM guest, the PV backend drivers will remain in Connected state. When the kdump kernel starts, the PV drivers will skip such devices. As a result, no root device is found and the vmcore can't be saved. A similar situation happens after a kexec boot; here the devices will be in the Closed state.

With this change all frontend devices with state XenbusStateConnected or XenbusStateClosed will be reset by changing the state file to Closing -> Closed -> Initializing. This will trigger a disconnect in the backend drivers. Now the frontend drivers will find the backend drivers in state InitWait and can connect.

Signed-off-by: Olaf Hering <olaf@aepfle.de>
[v2:
 - add timeout when waiting for backend state change (based on feedback from Ian Campbell)
 - extend printk message to include backend string
 - add comment to fall-through case in xenbus_reset_frontend]
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
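The reset sequence described above can be sketched as a small helper that, given the state a frontend was left in, lists the states to write in order. This is only an illustration of the message text, not the patch itself: the enum, its numeric values, and demo_reset_sequence() are hypothetical names, and the sketch takes the description literally by using the same three-step sequence for both Connected and Closed.

```c
#include <stddef.h>

/* Hypothetical state names for illustration; values are assumptions. */
enum demo_xenbus_state {
    DEMO_STATE_INITIALISING = 1,
    DEMO_STATE_INIT_WAIT    = 2,
    DEMO_STATE_CONNECTED    = 4,
    DEMO_STATE_CLOSING      = 5,
    DEMO_STATE_CLOSED       = 6,
};

/*
 * Fill steps[] with the states a stale frontend would be walked through
 * (Closing -> Closed -> Initialising) and return how many were written.
 * Devices in any other state are left alone and 0 is returned.
 */
static size_t demo_reset_sequence(enum demo_xenbus_state current,
                                  enum demo_xenbus_state steps[3])
{
    if (current != DEMO_STATE_CONNECTED && current != DEMO_STATE_CLOSED)
        return 0;

    steps[0] = DEMO_STATE_CLOSING;
    steps[1] = DEMO_STATE_CLOSED;
    steps[2] = DEMO_STATE_INITIALISING;
    return 3;
}
```

In the real driver each write would be followed by a bounded wait for the backend to react, as the v2 note about the timeout indicates; the sketch leaves that out.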
2011-09-01  xen/pv-on-hvm kexec: rebind virqs to existing eventchannel ports  Olaf Hering  1  -5/+32
During a kexec boot some virqs such as timer and debugirq were already registered by the old kernel. The hypervisor will return -EEXIST from the new EVTCHNOP_bind_virq request and the BUG in bind_virq_to_irq() triggers. Catch the -EEXIST error and loop through all possible ports to find which port belongs to the virq/cpu combo.

Signed-off-by: Olaf Hering <olaf@aepfle.de>
[v2:
 - use NR_EVENT_CHANNELS instead of private MAX_EVTCHNS]
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
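The recovery path described above (treat -EEXIST as "already bound" and scan for the matching port) can be sketched in user space against a mock port table. Everything here is a hypothetical stand-in: demo_ports, demo_bind_virq() and demo_find_virq() imitate the hypervisor's behaviour for illustration and are not the hypercall interface.

```c
#include <errno.h>

#define DEMO_NR_PORTS 8 /* stand-in for NR_EVENT_CHANNELS */

struct demo_port {
    int in_use;
    int virq;
    int cpu;
};

/* Mock "hypervisor" state that survives a simulated kexec. */
static struct demo_port demo_ports[DEMO_NR_PORTS];

/* Scan every port for the one already bound to this virq/cpu pair. */
static int demo_find_virq(int virq, int cpu)
{
    for (int port = 1; port < DEMO_NR_PORTS; port++)
        if (demo_ports[port].in_use &&
            demo_ports[port].virq == virq && demo_ports[port].cpu == cpu)
            return port;
    return -ENOENT;
}

/* Mock bind: fails with -EEXIST if the pair is already bound. */
static int demo_bind_virq(int virq, int cpu)
{
    if (demo_find_virq(virq, cpu) >= 0)
        return -EEXIST;
    for (int port = 1; port < DEMO_NR_PORTS; port++) {
        if (!demo_ports[port].in_use) {
            demo_ports[port] = (struct demo_port){ 1, virq, cpu };
            return port;
        }
    }
    return -ENOSPC;
}

/* Instead of treating -EEXIST as fatal, reuse the existing port. */
static int demo_bind_or_rebind(int virq, int cpu)
{
    int port = demo_bind_virq(virq, cpu);

    if (port == -EEXIST)
        port = demo_find_virq(virq, cpu);
    return port;
}
```

The linear scan is acceptable in this setting because it only runs on the rare path where a previous kernel left the binding behind.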
2011-09-01  xen/pv-on-hvm kexec: prevent crash in xenwatch_thread() when stale watch events arrive  Olaf Hering  1  -2/+1
During repeated kexec boots xenwatch_thread() can crash because xenbus_watch->callback is cleared by xenbus_watch_path() if a node/token combo for a new watch happens to match an already registered watch from an old kernel. In this case xs_watch returns -EEXIST, then register_xenbus_watch() does not remove the to-be-registered watch from the list of active watches but returns the -EEXIST to the caller anyway.

Because the watch is still active in xenstored it will cause an event which will arrive in the new kernel. process_msg() will find the encapsulated struct xenbus_watch in its list of registered watches and put the "empty" watch handle in the queue for xenwatch_thread(). xenwatch_thread() then calls ->callback, which was cleared earlier by xenbus_watch_path().

To prevent that crash in a guest running on an old xen toolstack, remove the special -EEXIST handling.

v2:
 - remove the EEXIST handling in register_xenbus_watch() instead of checking for ->callback in process_msg()

Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Signed-off-by: Olaf Hering <olaf@aepfle.de>
2011-08-26  xen-swiotlb: When doing coherent alloc/dealloc check before swizzling the MFNs.  Konrad Rzeszutek Wilk  1  -4/+24
Swizzling a Machine Frame Number (MFN) is not always necessary, especially when we know that we do not have to do it. In this patch we check the MFN against the device's coherent DMA mask and whether the requested page(s) are contiguous. If it all checks out we just return the bus addr without doing the memory swizzle.

Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
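The two conditions named above can be sketched as a single predicate: the pages must be machine-contiguous, and the last byte of the buffer must be addressable under the device's coherent DMA mask. This is a user-space illustration only; the function names, the flat array of MFNs, and the 4 KiB page size are assumptions, not the patch's actual code.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define DEMO_PAGE_SHIFT 12 /* assumes 4 KiB pages */

/* True if each frame number directly follows the previous one. */
static bool demo_range_is_contiguous(const uint64_t *mfns, size_t nr_pages)
{
    for (size_t i = 1; i < nr_pages; i++)
        if (mfns[i] != mfns[0] + i)
            return false;
    return true;
}

/*
 * True if the buffer can be handed to the device as-is: it is contiguous
 * in machine memory and its last byte lies within the DMA mask.
 */
static bool demo_can_skip_swizzle(const uint64_t *mfns, size_t nr_pages,
                                  uint64_t dma_mask)
{
    if (nr_pages == 0 || !demo_range_is_contiguous(mfns, nr_pages))
        return false;

    uint64_t first = mfns[0] << DEMO_PAGE_SHIFT;
    uint64_t last = first + ((uint64_t)nr_pages << DEMO_PAGE_SHIFT) - 1;

    return last <= dma_mask;
}
```

Contiguity is checked first so that the last-byte address can be derived from the first frame alone.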