summaryrefslogtreecommitdiff
path: root/block/qcow2.c
AgeCommit message (Collapse)AuthorFilesLines
2013-05-14qcow2: Catch some L1 table index overflowsKevin Wolf1-2/+11
This catches the situation that is described in the bug report at https://bugs.launchpad.net/qemu/+bug/865518 and goes like this: $ qemu-img create -f qcow2 huge.qcow2 $((1024*1024))T Formatting 'huge.qcow2', fmt=qcow2 size=1152921504606846976 encryption=off cluster_size=65536 lazy_refcounts=off $ qemu-io /tmp/huge.qcow2 -c "write $((1024*1024*1024*1024*1024*1024 - 1024)) 512" Segmentation fault With this patch applied the segfault will be avoided, however the case will still fail, though gracefully: $ qemu-img create -f qcow2 /tmp/huge.qcow2 $((1024*1024))T Formatting 'huge.qcow2', fmt=qcow2 size=1152921504606846976 encryption=off cluster_size=65536 lazy_refcounts=off qemu-img: The image size is too large for file format 'qcow2' Note that even long before these overflow checks kick in, you get insanely high memory usage (up to INT_MAX * sizeof(uint64_t) = 16 GB for the L1 table), so with somewhat smaller image sizes you'll probably see qemu aborting for a failed g_malloc(). If you need huge image sizes, you should increase the cluster size to the maximum of 2 MB in order to get higher limits. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2013-04-22qcow2: allow sub-cluster compressed write to last clusterStefan Hajnoczi1-2/+15
Compression in qcow2 requires image length to be a multiple of the cluster size. Lift this requirement by zero-padding the final cluster when necessary. The virtual disk size is still not cluster-aligned, so the guest cannot access the zero sectors. Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2013-04-15block: Introduce bdrv_pwritev() for qcow2_save_vmstateKevin Wolf1-7/+1
Directly pass the QEMUIOVector on instead of linearising it. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2013-04-15block: Introduce bdrv_writev_vmstateKevin Wolf1-3/+9
Signed-off-by: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2013-04-13aes: move aes.h from include/block to include/qemuAurelien Jarno1-1/+1
Move aes.h from include/block to include/qemu to show it can be reused by other subsystems. Cc: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Reviewed-by: Edgar E. Iglesias <edgar.iglesias@gmail.com> Reviewed-by: Richard Henderson <rth@twiddle.net> Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
2013-03-28qcow2: Allow requests with multiple l2metasKevin Wolf1-3/+11
Instead of expecting a single l2meta, have a list of them. This allows to still have a single I/O request for the guest data, even though multiple l2meta may be needed in order to describe both a COW overwrite and a new cluster allocation (typical sequential write case). Signed-off-by: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2013-03-28qcow2: Remove bogus unlock of s->lockKevin Wolf1-2/+0
The unlock wakes up the next coroutine, but the currently running coroutine will lock it again before it yields, so this doesn't make a lot of sense. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2013-03-22block: Add options QDict to bdrv_file_open() prototypesKevin Wolf1-1/+1
The new parameter is unused yet. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com>
2013-03-19qcow2: Fix segfault in qcow2_invalidate_cacheKevin Wolf1-2/+10
Need to pass an options QDict to qcow2_open() now. This fixes a segfault on the migration target with qcow2. Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2013-03-15qcow2: make is_allocated return true for zero clustersPaolo Bonzini1-5/+1
Otherwise, live migration of the top layer will miss zero clusters and let the backing file show through. This also matches what is done in qed. QCOW2_CLUSTER_ZERO clusters are invalid in v2 image files. Check this directly in qcow2_get_cluster_offset instead of replicating the test everywhere. Cc: qemu-stable@nongnu.org Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2013-03-15qcow2: Allow lazy refcounts to be enabled on the command lineKevin Wolf1-0/+37
qcow2 images now accept a boolean lazy_refcounts options. Use it like this: -drive file=test.qcow2,lazy_refcounts=on If the option is specified on the command line, it overrides the default specified by the qcow2 header flags that were set when creating the image. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2013-03-15block: Add options QDict to bdrv_open() prototypeKevin Wolf1-1/+1
It doesn't do anything yet except storing the options QDict in the BlockDriverState. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2013-03-15block: Add options QDict to .bdrv_open()Kevin Wolf1-2/+2
Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2013-01-25block: Use error code EMEDIUMTYPE for wrong format in some block driversStefan Weil1-1/+1
This improves error reports for bochs, cow, qcow, qcow2, qed and vmdk when a file with the wrong format is selected. Signed-off-by: Stefan Weil <sw@weilnetz.de> Reviewed-by: Eric Blake <eblake@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2013-01-15qcow2: Fix segfault on zero-length writeKevin Wolf1-1/+1
One of the recent refactoring patches (commit f50f88b9) didn't take care to initialise l2meta properly, so with zero-length writes, which don't even enter the write loop, qemu just segfaulted. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2012-12-19misc: move include files to include/qemu/Paolo Bonzini1-2/+2
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2012-12-19block: move include files to include/block/Paolo Bonzini1-2/+2
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2012-12-19qapi: move include files to include/qobject/Paolo Bonzini1-1/+1
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2012-12-13qcow2: Execute run_dependent_requests() without lockKevin Wolf1-20/+16
There's no reason for run_dependent_requests() to hold s->lock, and a later patch will require that in fact the lock is not held. Also, before this patch, run_dependent_requests() not only does what its name suggests, but also removes the l2meta from the list of in-flight requests. When changing this, it becomes an one-liner, so just inline it completely. Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2012-12-13qcow2: Enable dirty flag in qcow2_alloc_cluster_link_l2Kevin Wolf1-6/+1
This is closer to where the dirty flag is really needed, and it avoids having checks for special cases related to cluster allocation directly in the writev loop. Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2012-12-13qcow2: Allocate l2meta only for cluster allocationsKevin Wolf1-15/+17
Even for writes to already allocated clusters, an l2meta is allocated, though it stays effectively unused. After this patch, only allocating requests still have one. Each l2meta now describes an in-flight request that writes to clusters that are not yet hooked up in the L2 table. Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2012-12-13qcow2: Drop l2meta.cluster_offsetKevin Wolf1-7/+7
There's no real reason to have an l2meta for normal requests that don't allocate anything. Before we can get rid of it, we must return the host cluster offset in a different way. Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2012-12-13qcow2: Allocate l2meta dynamicallyKevin Wolf1-11/+15
As soon as delayed COW is introduced, the l2meta struct is needed even after completion of the request, so it can't live on the stack. Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2012-12-12qcow2: Move BLKDBG_EVENT out of the lockKevin Wolf1-1/+1
We want to use these events to suspend requests for testing concurrent AIO requests. Suspending requests while they are holding the CoMutex is rather boring for this purpose. Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2012-10-05qcow2: mark this file's sole strncpy use as justifiedJim Meyering1-0/+1
Acked-by: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Jim Meyering <meyering@redhat.com> Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
2012-09-24block: qcow2 image file reopenJeff Cody1-0/+10
These are the stubs for the file reopen drivers for the qcow2 format. There is currently nothing that needs to be done by the qcow2 driver in reopen. Signed-off-by: Jeff Cody <jcody@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2012-08-10block: add BLOCK_O_CHECK for qemu-img checkStefan Hajnoczi1-2/+2
Image formats with a dirty bit, like qed and qcow2, repair dirty image files upon open with BDRV_O_RDWR. Performing automatic repair when qemu-img check runs is not ideal because the bdrv_open() call repairs the image before the actual bdrv_check() call from qemu-img.c. Fix this "double repair" since it leads to confusing output from qemu-img check. Tell the block driver that this image is being opened just for bdrv_check(). This skips automatic repair and qemu-img.c can invoke it manually with bdrv_check(). Update the golden output for qemu-iotests 039 to reflect the new qemu-img check output. Signed-off-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2012-08-10qcow2: mark image clean after repair succeedsStefan Hajnoczi1-13/+15
The dirty bit is cleared after image repair succeeds in qcow2_open(). Move this into qcow2_check() so that all callers benefit from this behavior when fix mode is enabled. This is necessary so qemu-img check can call .bdrv_check() and mark the image clean. Signed-off-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2012-08-06qcow2: implement lazy refcountsStefan Hajnoczi1-4/+69
Lazy refcounts is a performance optimization for qcow2 that postpones refcount metadata updates and instead marks the image dirty. In the case of crash or power failure the image will be left in a dirty state and repaired next time it is opened. Reducing metadata I/O is important for cache=writethrough and cache=directsync because these modes guarantee that data is on disk after each write (hence we cannot take advantage of caching updates in RAM). Refcount metadata is not needed for guest->file block address translation and therefore does not need to be on-disk at the time of write completion - this is the motivation behind the lazy refcount optimization. The lazy refcount optimization must be enabled at image creation time: qemu-img create -f qcow2 -o compat=1.1,lazy_refcounts=on a.qcow2 10G qemu-system-x86_64 -drive if=virtio,file=a.qcow2,cache=writethrough Update qemu-iotests 031 and 036 since the extension header size changes when we add feature bit table entries. Signed-off-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2012-08-06qcow2: introduce dirty bitStefan Hajnoczi1-3/+47
This patch adds an incompatible feature bit to mark images that have not been closed cleanly. When a dirty image file is opened a consistency check and repair is performed. Update qemu-iotests 031 and 036 since the extension header size changes when we add feature bit table entries. Signed-off-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2012-07-09Merge remote-tracking branch 'mjt/mjt-iov2' into stagingAnthony Liguori1-12/+9
* mjt/mjt-iov2: rewrite iov_send_recv() and move it to iov.c cleanup qemu_co_sendv(), qemu_co_recvv() and friends export iov_send_recv() and use it in iov_send() and iov_recv() rename qemu_sendv to iov_send, change proto and move declarations to iov.h change qemu_iovec_to_buf() to match other to,from_buf functions consolidate qemu_iovec_copy() and qemu_iovec_concat() and make them consistent allow qemu_iovec_from_buffer() to specify offset from which to start copying consolidate qemu_iovec_memset{,_skip}() into single function and use existing iov_memset() rewrite iov_* functions change iov_* function prototypes to be more appropriate virtio-serial-bus: use correct lengths in control_out() message Conflicts: tests/Makefile Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
2012-07-09qcow2: fix #ifdef'd qcow2_check_refcounts() callersStefan Hajnoczi1-1/+1
The DEBUG_ALLOC qcow2.h macro enables additional consistency checks throughout the code. This makes it easier to spot corruptions that are introduced during development. Since consistency check is an expensive operation the DEBUG_ALLOC macro is used to compile checks out in normal builds and qcow2_check_refcounts() calls missed the addition of a new function argument. Signed-off-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2012-06-15qcow2: fix autoclear image header updateStefan Hajnoczi1-8/+9
The autoclear feature bits can be used for qcow2 file format features that are safe to "drop" by old programs that do not understand the feature. Upon opening the image file unknown autoclear feature bits are cleared and the image file header is rewritten, but this was happening too early in the code when critical header fields were not yet loaded. Process autoclear feature bits after all necessary header information has been loaded. Signed-off-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2012-06-15qcow2: always operate caches in writeback modePaolo Bonzini1-5/+2
Writethrough does not need special-casing anymore in the qcow2 caches. The block layer adds flushes after every guest-initiated data write, and these will also flush the qcow2 caches to the OS. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2012-06-15qcow2: Support for fixing refcount inconsistenciesKevin Wolf1-5/+1
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2012-06-15qemu-img check -r for repairing imagesKevin Wolf1-1/+6
The QED block driver already provides the functionality to not only detect inconsistencies in images, but also fix them. However, this functionality cannot be manually invoked with qemu-img, but the check happens only automatically during bdrv_open(). This adds a -r switch to qemu-img check that allows manual invocation of an image repair. Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2012-06-11change qemu_iovec_to_buf() to match other to,from_buf functionsMichael Tokarev1-1/+1
It now allows specifying offset within qiov to start from and amount of bytes to copy. Actual implementation is just a call to iov_to_buf(). Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2012-06-11consolidate qemu_iovec_copy() and qemu_iovec_concat() and make them consistentMichael Tokarev1-2/+2
qemu_iovec_concat() is currently a wrapper for qemu_iovec_copy(), use the former (with extra "0" arg) in a few places where it is used. Change skip argument of qemu_iovec_copy() from uint64_t to size_t, since size of qiov itself is size_t, so there's no way to skip larger sizes. Rename it to soffset, to make it clear that the offset is applied to src. Also change the only usage of uint64_t in hw/9pfs/virtio-9p.c, in v9fs_init_qiov_from_pdu() - all callers of it actually uses size_t too, not uint64_t. One added restriction: as for all other iovec-related functions, soffset must point inside src. Order of argumens is already good: qemu_iovec_memset(QEMUIOVector *qiov, size_t offset, int c, size_t bytes) vs: qemu_iovec_concat(QEMUIOVector *dst, QEMUIOVector *src, size_t soffset, size_t sbytes) (note soffset is after _src_ not dst, since it applies to src; for memset it applies to qiov). Note that in many places where this function is used, the previous call is qemu_iovec_reset(), which means many callers actually want copy (replacing dst content), not concat. So we may want to add a wrapper like qemu_iovec_copy() with the same arguments but which calls qemu_iovec_reset() before _concat(). Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2012-06-11allow qemu_iovec_from_buffer() to specify offset from which to start copyingMichael Tokarev1-6/+3
Similar to qemu_iovec_memset(QEMUIOVector *qiov, size_t offset, int c, size_t bytes); the new prototype is: qemu_iovec_from_buf(QEMUIOVector *qiov, size_t offset, const void *buf, size_t bytes); The processing starts at offset bytes within qiov. This way, we may copy a bounce buffer directly to a middle of qiov. This is exactly the same function as iov_from_buf() from iov.c, so use the existing implementation and rename it to qemu_iovec_from_buf() to be shorter and to match the utility function. As with utility implementation, we now assert that the offset is inside actual iovec. Nothing changed for current callers, because `offset' parameter is new. While at it, stop using "bounce-qiov" in block/qcow2.c and copy decrypted data directly from cluster_data instead of recreating a temp qiov for doing that. Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2012-06-11consolidate qemu_iovec_memset{,_skip}() into single function and use ↵Michael Tokarev1-3/+3
existing iov_memset() This patch combines two functions into one, and replaces the implementation with already existing iov_memset() from iov.c. The new prototype of qemu_iovec_memset(): size_t qemu_iovec_memset(qiov, size_t offset, int fillc, size_t bytes) It is different from former qemu_iovec_memset_skip(), and I want to make other functions to be consistent with it too: first how much to skip, second what, and 3rd how many of it. It also returns actual number of bytes filled in, which may be less than the requested `bytes' if qiov is smaller than offset+bytes, in the same way iov_memset() does. While at it, use utility function iov_memset() from iov.h in posix-aio-compat.c, where qiov was used. Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2012-05-25qcow2: don't leak buffer for unexpected qcow_version in headerJim Meyering1-1/+2
Signed-off-by: Jim Meyering <meyering@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2012-05-14qcow2: Don't ignore failure to clear autoclear flagsKevin Wolf1-1/+4
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2012-05-10block: push bdrv_change_backing_file error checking up from driversPaolo Bonzini1-5/+0
This check applies to all drivers, but QED lacks it. Reviewed-by: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2012-05-07qcow2: lock on preallocZhi Yong Wu1-0/+3
preallocate() will be locked. This is required because qcow2_alloc_cluster_link_l2() assumes that it runs under a lock that it can drop while COW is being performed. Signed-off-by: Zhi Yong Wu <wuzhy@linux.vnet.ibm.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2012-05-02block/qcow2: Add missing GCC_FMT_ATTR to function report_unsupported()Stefan Weil1-1/+2
Cc: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Stefan Weil <sw@weilnetz.de> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2012-04-20qcow2: Zero write supportKevin Wolf1-0/+21
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2012-04-20qcow2: Support for feature table header extensionKevin Wolf1-9/+57
Instead of printing an ugly bitmask, qemu can now print a more helpful string even for yet unknown features. Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2012-04-20qcow2: Support reading zero clustersKevin Wolf1-0/+8
This adds support for reading zero clusters in version 3 images. Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2012-04-20qcow2: Version 3 imagesKevin Wolf1-14/+132
This adds the basic infrastructure to qcow2 to handle version 3 images. It includes code to create v3 images, allow header updates for v3 images and checks feature bits. It still misses support for zero clusters, so this is not a fully compliant implementation of v3 yet. The default for creating new images stays at v2 for now. Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2012-04-20qcow2: Ignore reserved bits in get_cluster_offsetKevin Wolf1-3/+14
With this change, reading from a qcow2 image ignores all reserved bits that are set in an L1 or L2 table entry. Now get_cluster_offset() assigns *cluster_offset only the offset without any other flags. The cluster type is not longer encoded in the offset, but a positive return value in case of success. Signed-off-by: Kevin Wolf <kwolf@redhat.com>