summaryrefslogtreecommitdiff
path: root/posix-aio-compat.c
AgeCommit message (Collapse)AuthorFilesLines
2011-11-11posix-aio-compat: Plug memory leak on paio_init() error pathMarkus Armbruster1-0/+1
Spotted by Coverity. Signed-off-by: Markus Armbruster <armbru@redhat.com> Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
2011-09-20block: avoid SIGUSR2Frediano Ziglio1-24/+9
Now that iothread is always compiled sending a signal seems only an additional step. This patch also avoid writing to two pipe (one from signal and one in qemu_service_io). Work with kvm enabled or disabled. strace output is more readable (less syscalls). [ kwolf: Merged build fix by Paolo Bonzini ] Signed-off-by: Frediano Ziglio <freddy77@gmail.com> Reviewed-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2011-09-20posix-aio-compat: Removed unused offset variableFrediano Ziglio1-3/+2
Signed-off-by: Frediano Ziglio <freddy77@gmail.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2011-08-23posix-aio-compat: fix latency issuesAvi Kivity1-2/+42
In certain circumstances, posix-aio-compat can incur a lot of latency: - threads are created by vcpu threads, so if vcpu affinity is set, aio threads inherit vcpu affinity. This can cause many aio threads to compete for one cpu. - we can create up to max_threads (64) aio threads in one go; since a pthread_create can take around 30μs, we have up to 2ms of cpu time under a global lock. Fix by: - moving thread creation to the main thread, so we inherit the main thread's affinity instead of the vcpu thread's affinity. - if a thread is currently being created, and we need to create yet another thread, let thread being born create the new thread, reducing the amount of time we spend under the main thread. - drop the local lock while creating a thread (we may still hold the global mutex, though) Note this doesn't eliminate latency completely; scheduler artifacts or lack of host cpu resources can still cause it. We may want pre-allocated threads when this cannot be tolerated. Thanks to Uli Obergfell of Red Hat for his excellent analysis and suggestions. Signed-off-by: Avi Kivity <avi@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2011-08-20Use glib memory allocation and free functionsAnthony Liguori1-1/+1
qemu_malloc/qemu_free no longer exist after this commit. Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
2011-08-02posix-aio-compat: Allow read after EOFKevin Wolf1-0/+19
In order to be able to transparently replace bdrv_read calls by bdrv_co_read, reading beyond EOF must produce zeros instead of short reads for AIO, too. Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2011-08-02async: Remove AsyncContextKevin Wolf1-11/+0
The purpose of AsyncContexts was to protect qcow and qcow2 against reentrancy during an emulated bdrv_read/write (which includes a qemu_aio_wait() call and can run AIO callbacks of different requests if it weren't for AsyncContexts). Now both qcow and qcow2 are protected by CoMutexes and AsyncContexts can be removed. Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2011-06-08Fix compilation warning due to missing header for sigaction (followup)Alexandre Raymond1-1/+0
This patch removes all references to signal.h when qemu-common.h is included as they become redundant. Signed-off-by: Alexandre Raymond <cerbere@gmail.com> Signed-off-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com>
2011-05-18posix-aio-compat: Fix idle_threads counterKevin Wolf1-4/+2
A thread should only be counted as idle when it really is waiting for new requests. Without this patch, sometimes too few threads are started as busy threads are counted as idle. Not sure if it makes a difference in practice outside some artificial qemu-io/qemu-img tests, but I think the change makes sense in any case. Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2011-03-07trace: Trace posix-aio-compat.c completion and cancellationStefan Hajnoczi1-0/+5
This patch adds paio_complete() and paio_cancel() trace events to complement the paio_submit() event. Signed-off-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com>
2010-10-30Move qemu_gettimeofday() to OS specific filesJes Sorensen1-0/+1
Signed-off-by: Jes Sorensen <Jes.Sorensen@redhat.com> Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
2010-10-05Fix spelling in commentsStefan Weil1-1/+1
multifuction -> multifunction successfull -> successful. Signed-off-by: Stefan Weil <weil@mail.berlios.de>
2010-09-21use qemu_blockalign consistentlyChristoph Hellwig1-1/+1
Use qemu_blockalign for all allocations in the block layer. This allows increasing the required alignment, which is need to support O_DIRECT on devices with large block sizes. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2010-09-09trace: Trace virtio-blk, multiwrite, and paio_submitStefan Hajnoczi1-0/+2
This patch adds trace events that make it possible to observe virtio-blk. Signed-off-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com>
2010-08-30posix-aio-compat: Fix async_conmtext for ioctlAndrew de Quincey1-0/+1
Set the async_context_id field when queuing an async ioctl call Signed-off-by: Andrew de Quincey <adq@lidskialf.net> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2010-05-28posix-aio-compat: Expand tabs that have crept inStefan Hajnoczi1-29/+29
This patch expands tabs on a few lines so the code formats nicely and follows the QEMU coding style. Signed-off-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2010-01-26posix-aio-compat.c: fix warning with _FORTIFY_SOURCEKirill A. Shutemov1-1/+4
CC posix-aio-compat.o cc1: warnings being treated as errors posix-aio-compat.c: In function 'aio_signal_handler': posix-aio-compat.c:505: error: ignoring return value of 'write', declared with attribute warn_unused_result make: *** [posix-aio-compat.o] Error 1 Signed-off-by: Kirill A. Shutemov <kirill@shutemov.name> Signed-off-by: Juan Quintela <quintela@redhat.com> Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
2009-12-03posix-aio-compat: Fix error checkKevin Wolf1-9/+9
Checking for nbytes < 0 is pointless as long as it's a size_t. If we want to use negative numbers for error codes, we should use signed types. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
2009-12-03Don't leak file descriptorsKevin Wolf1-1/+1
We're leaking file descriptors to child processes. Set FD_CLOEXEC on file descriptors that don't need to be passed to children to stop this misbehaviour. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
2009-10-30Remove aio_ctx from paio_* interfaceKevin Wolf1-6/+5
The context parameter in paio_submit isn't used anyway, so there is no reason why block drivers should need to remember it. This also avoids passing a Linux AIO context to paio_submit (which doesn't do any harm as long as the parameter is unused, but it is highly confusing). Signed-off-by: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
2009-10-27posix-aio-compat: Honour AsyncContextKevin Wolf1-0/+12
Don't call callbacks that don't belong to the active AsyncContext. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
2009-10-27Add qemu_aio_process_queue()Kevin Wolf1-1/+2
We'll leave some AIO completions unhandled when we can't call the callback. qemu_aio_process_queue() is used later to run any callbacks that are left and can be run then. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
2009-10-27posix-aio-compat: Split out posix_aio_process_queueKevin Wolf1-16/+27
We need to process the request queue and run callbacks separately from reading out the queue in a later patch, so split it out. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
2009-09-27posix-aio-compat: avoid signal race when spawning a threadmalc1-5/+9
Signed-off-by: malc <av1474@comtv.ru>
2009-09-12Fix sys-queue.h conflict for goodBlue Swirl1-10/+10
Problem: Our file sys-queue.h is a copy of the BSD file, but there are some additions and it's not entirely compatible. Because of that, there have been conflicts with system headers on BSD systems. Some hacks have been introduced in the commits 15cc9235840a22c289edbe064a9b3c19c5f49896, f40d753718c72693c5f520f0d9899f6e50395e94, 96555a96d724016e13190b28cffa3bc929ac60dc and 3990d09adf4463eca200ad964cc55643c33feb50 but the fixes were fragile. Solution: Avoid the conflict entirely by renaming the functions and the file. Revert the previous hacks. Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
2009-09-12Unbreak BSD: use qemu_fdatasync instead of fdatasyncBlue Swirl1-1/+1
Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
2009-09-11block: add aio_flush operationChristoph Hellwig1-2/+17
Instead stalling the VCPU while serving a cache flush try to do it asynchronously. Use our good old helper thread pool to issue an asynchronous fdatasync for raw-posix. Note that while Linux AIO implements a fdatasync operation it is not useful for us because it isn't actually implement in asynchronous fashion. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
2009-08-27raw-posix: refactor AIO supportChristoph Hellwig1-73/+252
Currently the raw-posix.c code contains a lot of knowledge about the asynchronous I/O scheme that is mostly implemented in posix-aio-compat.c. All this code does not really belong here and is getting a bit in the way of implementing native AIO on Linux. So instead move all the guts of the AIO implementation into posix-aio-compat.c (which might need a better name, btw). There's now a very small interface between the AIO providers and raw-posix.c: - an init routine is called from raw_open_common to return an AIO context for this drive. An AIO implementation may either re-use one context for all drives, or use a different one for each as the Linux native AIO support will do. - an submit routine is called from the aio_reav/writev methods to submit an AIO request There are no indirect calls involved in this interface as we need to decide which one to call manually. We will only call the Linux AIO native init function if we were requested to by vl.c, and we will only call the native submit function if we are asked to and the request is properly aligned. That's also the reason why the alignment check actually does the inverse move and now goes into raw-posix.c. The old posix-aio-compat.h headers is removed now that most of it's content is private to posix-aio-compat.c, and instead we add a new block/raw-posix-aio.h headers is created containing only the tiny interface between raw-posix.c and the AIO implementation. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
2009-07-27rename HAVE_PREADV to CONFIG_PREADVJuan Quintela1-2/+2
Signed-off-by: Juan Quintela <quintela@redhat.com> Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
2009-05-08fix asynchronous ioctlsChristoph Hellwig1-1/+10
posix_aio_read expect aio requests to return the number of bytes requests to be successfull, so we need to fake this up for ioctls. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
2009-04-07native preadv/pwritev support (Christoph Hellwig)aliguori1-2/+83
This ties up the preadv/pwritev syscalls to qemu if they are declared in unistd.h. This is the case currently on at least NetBSD and OpenBSD and will hopefully soon be the case on Linux. Thanks to Blue Swirl and Gerd Hoffmann for the configure autodetection of preadv/pwritev. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Anthony Liguori <aliguori@us.ibm.com> git-svn-id: svn://svn.savannah.nongnu.org/qemu/trunk@7021 c046a42c-6fe2-441c-8c8c-71466251a162
2009-04-07push down vector linearization to posix-aio-compat.c (Christoph Hellwig)aliguori1-26/+92
Make all AIO requests vectored and defer linearization until the actual I/O thread. This prepares for using native preadv/pwritev. Also enables asynchronous direct I/O by handling that case in the I/O thread. Qcow and qcow2 propably want to be adopted to directly deal with multi-segment requests, but that can be implemented later. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Anthony Liguori <aliguori@us.ibm.com> git-svn-id: svn://svn.savannah.nongnu.org/qemu/trunk@7020 c046a42c-6fe2-441c-8c8c-71466251a162
2009-03-28new scsi-generic abstraction, use SG_IO (Christoph Hellwig)aliguori1-34/+66
Okay, I started looking into how to handle scsi-generic I/O in the new world order. I think the best is to use the SG_IO ioctl instead of the read/write interface as that allows us to support scsi passthrough on disk/cdrom devices, too. See Hannes patch on the kvm list from August for an example. Now that we always do ioctls we don't need another abstraction than bdrv_ioctl for the synchronous requests for now, and for asynchronous requests I've added a aio_ioctl abstraction keeping it simple. Long-term we might want to move the ops to a higher-level abstraction and let the low-level code fill out the request header, but I'm lazy enough to leave that to the people trying to support scsi-passthrough on a non-Linux OS. Tested lightly by issuing various sg_ commands from sg3-utils in a guest to a host CDROM device. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Anthony Liguori <aliguori@us.ibm.com> git-svn-id: svn://svn.savannah.nongnu.org/qemu/trunk@6895 c046a42c-6fe2-441c-8c8c-71466251a162
2009-02-21Properly handle pthread_cond_timedwait timing outmalc1-1/+1
pthread_cond_timedwait is allowed to both consume the signal and return with the value indicating the timeout, hence predicate should always be (re)checked before taking an action git-svn-id: svn://svn.savannah.nongnu.org/qemu/trunk@6634 c046a42c-6fe2-441c-8c8c-71466251a162
2009-02-21Cosmeticsmalc1-11/+13
Avoid repeated creation/initalization/destruction of attr and calls to getpid git-svn-id: svn://svn.savannah.nongnu.org/qemu/trunk@6633 c046a42c-6fe2-441c-8c8c-71466251a162
2009-02-21Avoid thundering herd problemmalc1-4/+4
Broadcast was used so that the I/O threads would wakeup, reset their ts values and all but one go to sleep, in other words an optimization to prevent threads from exiting in presence of continuing I/O activity. Spurious wakeups make the looping around cond_timedwait with ever reinitialized ts potentially unsafe and as such ts in no longer reinitilized inside the loop, hence switch to signal is warranted and this benefits of this particlaur optimization are lost. (It's worth noting that timed variants of pthread calls use realtime clock by default, and therefore can hang "forever" should the host time be changed. Unfortunatelly not all host systems QEMU runs on support CLOCK_MONOTONIC and/or pthread_condattr_setclock with this value) git-svn-id: svn://svn.savannah.nongnu.org/qemu/trunk@6632 c046a42c-6fe2-441c-8c8c-71466251a162
2009-02-21Avoid infinite loop around timed condition variablemalc1-6/+7
This can happen due to spurious wakeups git-svn-id: svn://svn.savannah.nongnu.org/qemu/trunk@6631 c046a42c-6fe2-441c-8c8c-71466251a162
2009-02-21Error checkingmalc1-24/+72
git-svn-id: svn://svn.savannah.nongnu.org/qemu/trunk@6630 c046a42c-6fe2-441c-8c8c-71466251a162
2009-01-24Rename sigev_signo to avoid FreeBSD problems (Juergen Lock)blueswir11-1/+1
git-svn-id: svn://svn.savannah.nongnu.org/qemu/trunk@6414 c046a42c-6fe2-441c-8c8c-71466251a162
2009-01-17Use kill instead of sigqueue: re-enables AIO on OpenBSDblueswir11-3/+1
git-svn-id: svn://svn.savannah.nongnu.org/qemu/trunk@6360 c046a42c-6fe2-441c-8c8c-71466251a162
2009-01-13Fix race in POSIX AIO emulation (Jan Kiszka)aliguori1-7/+2
When we cancel an AIO request that is already being processed by aio_thread, qemu_paio_cancel should return QEMU_PAIO_NOTCANCELED as long as aio_thread isn't done with this request. But as the latter currently updates aiocb->ret after every block of the request, we may report QEMU_PAIO_ALLDONE too early. Futhermore, in case some zero-length request should have been queued, aiocb->ret is never set to != -EINPROGRESS and callers like raw_aio_cancel could get stuck in an endless loop. Fix those issues by updating aiocb->ret _after_ the request has been fully processed. This also simplifies the locking. Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com> Signed-off-by: Anthony Liguori <aliguori@us.ibm.com> git-svn-id: svn://svn.savannah.nongnu.org/qemu/trunk@6278 c046a42c-6fe2-441c-8c8c-71466251a162
2008-12-13Remove unnecessary trailing newlinesblueswir11-1/+0
git-svn-id: svn://svn.savannah.nongnu.org/qemu/trunk@6000 c046a42c-6fe2-441c-8c8c-71466251a162
2008-12-12Replace posix-aio with custom thread poolaliguori1-0/+202
glibc implements posix-aio as a thread pool and imposes a number of limitations. 1) it limits one request per-file descriptor. we hack around this by dup()'ing file descriptors which is hideously ugly 2) it's impossible to add new interfaces and we need a vectored read/write operation to properly support a zero-copy API. What has been suggested to me by glibc folks, is to implement whatever new interfaces we want and then it can eventually be proposed for standardization. This requires that we implement our own posix-aio implementation though. This patch implements posix-aio using pthreads. It immediately eliminates the need for fd pooling. It performs at least as well as the current posix-aio code (in some circumstances, even better). Signed-off-by: Anthony Liguori <aliguori@us.ibm.com> git-svn-id: svn://svn.savannah.nongnu.org/qemu/trunk@5996 c046a42c-6fe2-441c-8c8c-71466251a162