author | SeokYeon Hwang <syeon.hwang@samsung.com> | 2016-12-20 10:13:15 +0900 |
---|---|---|
committer | SeokYeon Hwang <syeon.hwang@samsung.com> | 2016-12-20 10:13:15 +0900 |
commit | dc36664b156b6aa2b55f2bca5fd0c643b6417ddb (patch) | |
tree | bb319daf3cd759c2d91dd541bb2ee24d8ca4ee1a /docs | |
parent | 100d9fdc18f28d813f9d22025d783a7cdcc4bb4b (diff) | |
parent | 6a928d25b6d8bc3729c3d28326c6db13b9481059 (diff) | |
Merge tag 'v2.8.0-rc4' into develop
v2.8.0-rc4 release
Change-Id: I0158b5078d1af545dc32a51f10d2f8f0b96543a6
Signed-off-by: SeokYeon Hwang <syeon.hwang@samsung.com>
Diffstat (limited to 'docs')
-rw-r--r-- | docs/COLO-FT.txt | 191 |
-rw-r--r-- | docs/atomics.txt | 84 |
-rw-r--r-- | docs/block-replication.txt | 239 |
-rw-r--r-- | docs/colo-proxy.txt | 188 |
-rw-r--r-- | docs/generic-loader.txt | 92 |
-rw-r--r-- | docs/live-block-ops.txt | 36 |
-rw-r--r-- | docs/multiple-iothreads.txt | 40 |
-rw-r--r-- | docs/pcie.txt | 310 |
-rw-r--r-- | docs/qapi-code-gen.txt | 10 |
-rw-r--r-- | docs/qcow2-cache.txt | 5 |
-rw-r--r-- | docs/qmp-commands.txt | 3838 |
-rw-r--r-- | docs/qmp-events.txt | 14 |
-rw-r--r-- | docs/rcu.txt | 4 |
-rw-r--r-- | docs/specs/acpi_nvdimm.txt | 69 |
-rw-r--r-- | docs/specs/edu.txt | 7 |
-rw-r--r-- | docs/specs/ppc-spapr-hotplug.txt | 55 |
-rw-r--r-- | docs/specs/vhost-user.txt | 20 |
-rw-r--r-- | docs/tcg-exclusive.promela | 225 |
-rw-r--r-- | docs/throttle.txt | 5 |
-rw-r--r-- | docs/tracing.txt | 19 |
-rw-r--r-- | docs/writing-qmp-commands.txt | 50 |
-rw-r--r-- | docs/xbzrle.txt | 4 |
-rw-r--r-- | docs/xen-save-devices-state.txt | 2 |
23 files changed, 5355 insertions, 152 deletions
diff --git a/docs/COLO-FT.txt b/docs/COLO-FT.txt new file mode 100644 index 0000000000..e289be2f41 --- /dev/null +++ b/docs/COLO-FT.txt @@ -0,0 +1,191 @@ +COarse-grained LOck-stepping Virtual Machines for Non-stop Service +---------------------------------------- +Copyright (c) 2016 Intel Corporation +Copyright (c) 2016 HUAWEI TECHNOLOGIES CO., LTD. +Copyright (c) 2016 Fujitsu, Corp. + +This work is licensed under the terms of the GNU GPL, version 2 or later. +See the COPYING file in the top-level directory. + +This document gives an overview of COLO's design and how to use it. + +== Background == +Virtual machine (VM) replication is a well known technique for providing +application-agnostic software-implemented hardware fault tolerance, +also known as "non-stop service". + +COLO (COarse-grained LOck-stepping) is a high availability solution. +Both primary VM (PVM) and secondary VM (SVM) run in parallel. They receive the +same request from client, and generate response in parallel too. +If the response packets from PVM and SVM are identical, they are released +immediately. Otherwise, a VM checkpoint (on demand) is conducted. + +== Architecture == + +The architecture of COLO is shown in the diagram below. +It consists of a pair of networked physical nodes: +The primary node running the PVM, and the secondary node running the SVM +to maintain a valid replica of the PVM. +PVM and SVM execute in parallel and generate output of response packets for +client requests according to the application semantics. + +The incoming packets from the client or external network are received by the +primary node, and then forwarded to the secondary node, so that both the PVM +and the SVM are stimulated with the same requests. + +COLO receives the outbound packets from both the PVM and SVM and compares them +before allowing the output to be sent to clients. + +The SVM is qualified as a valid replica of the PVM, as long as it generates +identical responses to all client requests. 
Once the differences in the outputs +are detected between the PVM and SVM, COLO withholds transmission of the +outbound packets until it has successfully synchronized the PVM state to the SVM. + + Primary Node Secondary Node ++------------+ +-----------------------+ +------------------------+ +------------+ +| | | HeartBeat +<----->+ HeartBeat | | | +| Primary VM | +-----------+-----------+ +-----------+------------+ |Secondary VM| +| | | | | | +| | +-----------|-----------+ +-----------|------------+ | | +| | |QEMU +---v----+ | |QEMU +----v---+ | | | +| | | |Failover| | | |Failover| | | | +| | | +--------+ | | +--------+ | | | +| | | +---------------+ | | +---------------+ | | | +| | | | VM Checkpoint +-------------->+ VM Checkpoint | | | | +| | | +---------------+ | | +---------------+ | | | +|Requests<--------------------------\ /-----------------\ /--------------------->Requests| +| | | ^ ^ | | | | | | | +|Responses+---------------------\ /-|-|------------\ /-------------------------+Responses| +| | | | | | | | | | | | | | | | +| | | +-----------+ | | | | | | | | | | +----------+ | | | +| | | | COLO disk | | | | | | | | | | | | COLO disk| | | | +| | | | Manager +---------------------------->| Manager | | | | +| | | ++----------+ v v | | | | | v v | +---------++ | | | +| | | |+-----------+-+-+-++| | ++-+--+-+---------+ | | | | +| | | || COLO Proxy || | | COLO Proxy | | | | | +| | | || (compare packet || | |(adjust sequence | | | | | +| | | ||and mirror packet)|| | | and ACK) | | | | | +| | | |+------------+---+-+| | +-----------------+ | | | | ++------------+ +-----------------------+ +------------------------+ +------------+ ++------------+ | | | | +------------+ +| VM Monitor | | | | | | VM Monitor | ++------------+ | | | | +------------+ ++---------------------------------------+ +----------------------------------------+ +| Kernel | | | | | Kernel | | ++---------------------------------------+ +----------------------------------------+ + | | | | + 
+--------------v+ +---------v---+--+ +------------------+ +v-------------+ + | Storage | |External Network| | External Network | | Storage | + +---------------+ +----------------+ +------------------+ +--------------+ + + +== Components introduction == + +You can see there are several components in COLO's diagram of architecture. +Their functions are described below. + +HeartBeat: +Runs on both the primary and secondary nodes, to periodically check platform +availability. When the primary node suffers a hardware fail-stop failure, +the heartbeat stops responding, and the secondary node will trigger a failover +as soon as it determines the absence. + +COLO disk Manager: +When primary VM writes data into image, the colo disk manager captures this data +and sends it to the secondary VM, which makes sure the content of secondary VM's +image is consistent with the content of primary VM's image. +For more details, please refer to docs/block-replication.txt. + +Checkpoint/Failover Controller: +Modifications of save/restore flow to realize continuous migration, +to make sure the state of VM in Secondary side is always consistent with VM in +Primary side. + +COLO Proxy: +Delivers packets to Primary and Secondary, and then compares the responses from +both sides. It then decides whether to start a checkpoint according to some rules. +Please refer to docs/colo-proxy.txt for more information. + +Note: +HeartBeat has not been implemented yet, so you need to trigger failover process +by using 'x-colo-lost-heartbeat' command. + +== Test procedure == +1. 
Startup qemu +Primary: +# qemu-kvm -enable-kvm -m 2048 -smp 2 -qmp stdio -vnc :7 -name primary \ + -device piix3-usb-uhci \ + -device usb-tablet -netdev tap,id=hn0,vhost=off \ + -device virtio-net-pci,id=net-pci0,netdev=hn0 \ + -drive if=virtio,id=primary-disk0,driver=quorum,read-pattern=fifo,vote-threshold=1,\ + children.0.file.filename=1.raw,\ + children.0.driver=raw -S +Secondary: +# qemu-kvm -enable-kvm -m 2048 -smp 2 -qmp stdio -vnc :7 -name secondary \ + -device piix3-usb-uhci \ + -device usb-tablet -netdev tap,id=hn0,vhost=off \ + -device virtio-net-pci,id=net-pci0,netdev=hn0 \ + -drive if=none,id=secondary-disk0,file.filename=1.raw,driver=raw,node-name=node0 \ + -drive if=virtio,id=active-disk0,driver=replication,mode=secondary,\ + file.driver=qcow2,top-id=active-disk0,\ + file.file.filename=/mnt/ramfs/active_disk.img,\ + file.backing.driver=qcow2,\ + file.backing.file.filename=/mnt/ramfs/hidden_disk.img,\ + file.backing.backing=secondary-disk0 \ + -incoming tcp:0:8888 + +2. On Secondary VM's QEMU monitor, issue command +{'execute':'qmp_capabilities'} +{ 'execute': 'nbd-server-start', + 'arguments': {'addr': {'type': 'inet', 'data': {'host': 'xx.xx.xx.xx', 'port': '8889'} } } +} +{'execute': 'nbd-server-add', 'arguments': {'device': 'secondary-disk0', 'writable': true } } + +Note: + a. The qmp command nbd-server-start and nbd-server-add must be run + before running the qmp command migrate on primary QEMU + b. Active disk, hidden disk and nbd target's length should be the + same. + c. It is better to put active disk and hidden disk in ramdisk. + +3. 
On Primary VM's QEMU monitor, issue command: +{'execute':'qmp_capabilities'} +{ 'execute': 'human-monitor-command', + 'arguments': {'command-line': 'drive_add -n buddy driver=replication,mode=primary,file.driver=nbd,file.host=xx.xx.xx.xx,file.port=8889,file.export=secondary-disk0,node-name=nbd_client0'}} +{ 'execute':'x-blockdev-change', 'arguments':{'parent': 'primary-disk0', 'node': 'nbd_client0' } } +{ 'execute': 'migrate-set-capabilities', + 'arguments': {'capabilities': [ {'capability': 'x-colo', 'state': true } ] } } +{ 'execute': 'migrate', 'arguments': {'uri': 'tcp:xx.xx.xx.xx:8888' } } + + Note: + a. There should be only one NBD Client for each primary disk. + b. xx.xx.xx.xx is the secondary physical machine's hostname or IP + c. The qmp command line must be run after running qmp command line in + secondary qemu. + +4. After the above steps, you will see, whenever you make changes to PVM, SVM will be synced. +You can issue command '{ "execute": "migrate-set-parameters" , "arguments":{ "x-checkpoint-delay": 2000 } }' +to change the checkpoint period time + +5. Failover test +You can kill Primary VM and run 'x_colo_lost_heartbeat' in Secondary VM's +monitor at the same time, then SVM will failover and client will not detect this +change. + +Before issuing '{ "execute": "x-colo-lost-heartbeat" }' command, we have to +issue block related command to stop block replication. +Primary: + Remove the nbd child from the quorum: + { 'execute': 'x-blockdev-change', 'arguments': {'parent': 'colo-disk0', 'child': 'children.1'}} + { 'execute': 'human-monitor-command','arguments': {'command-line': 'drive_del blk-buddy0'}} + Note: there is no qmp command to remove the blockdev now + +Secondary: + The primary host is down, so we should do the following thing: + { 'execute': 'nbd-server-stop' } + +== TODO == +1. Support continuous VM replication. +2. Support shared storage. +3. Develop the heartbeat part. +4. Reduce checkpoint VM’s downtime while doing checkpoint. 
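Editorial sketch, not part of the commit: the primary-side QMP sequence from step 3 of docs/COLO-FT.txt can be assembled programmatically, which makes the required ordering explicit (capabilities negotiation, NBD client, quorum child, x-colo capability, migrate). The function name `primary_qmp_sequence`, its parameters and their defaults are hypothetical and chosen only for illustration; the command names and arguments are copied from the text above. The sketch only builds the JSON; how it is delivered to the monitor (the examples use `-qmp stdio`) is left out.

```python
import json


def primary_qmp_sequence(secondary_host, nbd_port=8889, migrate_port=8888,
                         parent="primary-disk0", export="secondary-disk0",
                         node_name="nbd_client0"):
    """Return the step-3 QMP commands for the primary, in issue order.

    Hypothetical helper: the names and defaults mirror the example in
    docs/COLO-FT.txt and are assumptions, not a QEMU API.
    """
    # HMP command line wrapped by human-monitor-command, as in the doc.
    drive_add = (
        "drive_add -n buddy driver=replication,mode=primary,"
        f"file.driver=nbd,file.host={secondary_host},file.port={nbd_port},"
        f"file.export={export},node-name={node_name}"
    )
    return [
        {"execute": "qmp_capabilities"},
        {"execute": "human-monitor-command",
         "arguments": {"command-line": drive_add}},
        # Attach the NBD client node as a new child of the quorum.
        {"execute": "x-blockdev-change",
         "arguments": {"parent": parent, "node": node_name}},
        {"execute": "migrate-set-capabilities",
         "arguments": {"capabilities": [
             {"capability": "x-colo", "state": True}]}},
        # Must come last, and only after the secondary's NBD server is up.
        {"execute": "migrate",
         "arguments": {"uri": f"tcp:{secondary_host}:{migrate_port}"}},
    ]


if __name__ == "__main__":
    # 192.0.2.10 is a documentation address standing in for xx.xx.xx.xx.
    for cmd in primary_qmp_sequence("192.0.2.10"):
        print(json.dumps(cmd))
```

Each printed line is one QMP command in strict JSON (double quotes); the single-quoted form used in the document is a QEMU convenience.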
diff --git a/docs/atomics.txt b/docs/atomics.txt index c95950b6c5..3ef5d85b1b 100644 --- a/docs/atomics.txt +++ b/docs/atomics.txt @@ -15,7 +15,8 @@ Macros defined by qemu/atomic.h fall in three camps: - compiler barriers: barrier(); - weak atomic access and manual memory barriers: atomic_read(), - atomic_set(), smp_rmb(), smp_wmb(), smp_mb(), smp_read_barrier_depends(); + atomic_set(), smp_rmb(), smp_wmb(), smp_mb(), smp_mb_acquire(), + smp_mb_release(), smp_read_barrier_depends(); - sequentially consistent atomic access: everything else. @@ -111,8 +112,8 @@ consistent primitives. When using this model, variables are accessed with atomic_read() and atomic_set(), and restrictions to the ordering of accesses is enforced -using the smp_rmb(), smp_wmb(), smp_mb() and smp_read_barrier_depends() -memory barriers. +using the memory barrier macros: smp_rmb(), smp_wmb(), smp_mb(), +smp_mb_acquire(), smp_mb_release(), smp_read_barrier_depends(). atomic_read() and atomic_set() prevents the compiler from using optimizations that might otherwise optimize accesses out of existence @@ -124,7 +125,7 @@ other threads, and which are local to the current thread or protected by other, more mundane means. Memory barriers control the order of references to shared memory. -They come in four kinds: +They come in six kinds: - smp_rmb() guarantees that all the LOAD operations specified before the barrier will appear to happen before all the LOAD operations @@ -142,6 +143,16 @@ They come in four kinds: In other words, smp_wmb() puts a partial ordering on stores, but is not required to have any effect on loads. +- smp_mb_acquire() guarantees that all the LOAD operations specified before + the barrier will appear to happen before all the LOAD or STORE operations + specified after the barrier with respect to the other components of + the system. 
+ +- smp_mb_release() guarantees that all the STORE operations specified *after* + the barrier will appear to happen after all the LOAD or STORE operations + specified *before* the barrier with respect to the other components of + the system. + - smp_mb() guarantees that all the LOAD and STORE operations specified before the barrier will appear to happen before all the LOAD and STORE operations specified after the barrier with respect to the other @@ -149,8 +160,9 @@ They come in four kinds: smp_mb() puts a partial ordering on both loads and stores. It is stronger than both a read and a write memory barrier; it implies both - smp_rmb() and smp_wmb(), but it also prevents STOREs coming before the - barrier from overtaking LOADs coming after the barrier and vice versa. + smp_mb_acquire() and smp_mb_release(), but it also prevents STOREs + coming before the barrier from overtaking LOADs coming after the + barrier and vice versa. - smp_read_barrier_depends() is a weaker kind of read barrier. 
On most processors, whenever two loads are performed such that the @@ -173,24 +185,21 @@ They come in four kinds: This is the set of barriers that is required *between* two atomic_read() and atomic_set() operations to achieve sequential consistency: - | 2nd operation | - |-----------------------------------------| - 1st operation | (after last) | atomic_read | atomic_set | - ---------------+--------------+-------------+------------| - (before first) | | none | smp_wmb() | - ---------------+--------------+-------------+------------| - atomic_read | smp_rmb() | smp_rmb()* | ** | - ---------------+--------------+-------------+------------| - atomic_set | none | smp_mb()*** | smp_wmb() | - ---------------+--------------+-------------+------------| + | 2nd operation | + |-----------------------------------------------| + 1st operation | (after last) | atomic_read | atomic_set | + ---------------+----------------+-------------+----------------| + (before first) | | none | smp_mb_release | + ---------------+----------------+-------------+----------------| + atomic_read | smp_mb_acquire | smp_rmb | ** | + ---------------+----------------+-------------+----------------| + atomic_set | none | smp_mb()*** | smp_wmb() | + ---------------+----------------+-------------+----------------| * Or smp_read_barrier_depends(). - ** This requires a load-store barrier. How to achieve this varies - depending on the machine, but in practice smp_rmb()+smp_wmb() - should have the desired effect. For example, on PowerPC the - lwsync instruction is a combined load-load, load-store and - store-store barrier. + ** This requires a load-store barrier. This is achieved by + either smp_mb_acquire() or smp_mb_release(). *** This requires a store-load barrier. On most machines, the only way to achieve this is a full barrier. 
@@ -199,11 +208,11 @@ and atomic_set() operations to achieve sequential consistency: You can see that the two possible definitions of atomic_mb_read() and atomic_mb_set() are the following: - 1) atomic_mb_read(p) = atomic_read(p); smp_rmb() - atomic_mb_set(p, v) = smp_wmb(); atomic_set(p, v); smp_mb() + 1) atomic_mb_read(p) = atomic_read(p); smp_mb_acquire() + atomic_mb_set(p, v) = smp_mb_release(); atomic_set(p, v); smp_mb() - 2) atomic_mb_read(p) = smp_mb() atomic_read(p); smp_rmb() - atomic_mb_set(p, v) = smp_wmb(); atomic_set(p, v); + 2) atomic_mb_read(p) = smp_mb() atomic_read(p); smp_mb_acquire() + atomic_mb_set(p, v) = smp_mb_release(); atomic_set(p, v); Usually the former is used, because smp_mb() is expensive and a program normally has more reads than writes. Therefore it makes more sense to @@ -222,7 +231,7 @@ place barriers instead: thread 1 thread 1 ------------------------- ------------------------ (other writes) - smp_wmb() + smp_mb_release() atomic_mb_set(&a, x) atomic_set(&a, x) smp_wmb() atomic_mb_set(&b, y) atomic_set(&b, y) @@ -233,7 +242,13 @@ place barriers instead: y = atomic_mb_read(&b) y = atomic_read(&b) smp_rmb() x = atomic_mb_read(&a) x = atomic_read(&a) - smp_rmb() + smp_mb_acquire() + + Note that the barrier between the stores in thread 1, and between + the loads in thread 2, has been optimized here to a write or a + read memory barrier respectively. On some architectures, notably + ARMv7, smp_mb_acquire and smp_mb_release are just as expensive as + smp_mb, but smp_rmb and/or smp_wmb are more efficient. 
- sometimes, a thread is accessing many variables that are otherwise unrelated to each other (for example because, apart from the current @@ -246,12 +261,12 @@ place barriers instead: n = 0; n = 0; for (i = 0; i < 10; i++) => for (i = 0; i < 10; i++) n += atomic_mb_read(&a[i]); n += atomic_read(&a[i]); - smp_rmb(); + smp_mb_acquire(); Similarly, atomic_mb_set() can be transformed as follows: smp_mb(): - smp_wmb(); + smp_mb_release(); for (i = 0; i < 10; i++) => for (i = 0; i < 10; i++) atomic_mb_set(&a[i], false); atomic_set(&a[i], false); smp_mb(); @@ -261,7 +276,7 @@ The two tricks can be combined. In this case, splitting a loop in two lets you hoist the barriers out of the loops _and_ eliminate the expensive smp_mb(): - smp_wmb(); + smp_mb_release(); for (i = 0; i < 10; i++) { => for (i = 0; i < 10; i++) atomic_mb_set(&a[i], false); atomic_set(&a[i], false); atomic_mb_set(&b[i], false); smb_wmb(); @@ -312,8 +327,8 @@ access and for data dependency barriers: smp_read_barrier_depends(); z = b[y]; -smp_wmb() also pairs with atomic_mb_read(), and smp_rmb() also pairs -with atomic_mb_set(). +smp_wmb() also pairs with atomic_mb_read() and smp_mb_acquire(). +and smp_rmb() also pairs with atomic_mb_set() and smp_mb_release(). COMPARISON WITH LINUX KERNEL MEMORY BARRIERS @@ -359,8 +374,9 @@ and memory barriers, and the equivalents in QEMU: note that smp_store_mb() is a little weaker than atomic_mb_set(). atomic_mb_read() compiles to the same instructions as Linux's smp_load_acquire(), but this should be treated as an implementation - detail. If required, QEMU might later add atomic_load_acquire() and - atomic_store_release() macros. + detail. QEMU does have atomic_load_acquire() and atomic_store_release() + macros, but for now they are only used within atomic.h. This may + change in the future. 
SOURCES diff --git a/docs/block-replication.txt b/docs/block-replication.txt new file mode 100644 index 0000000000..6bde6737fb --- /dev/null +++ b/docs/block-replication.txt @@ -0,0 +1,239 @@ +Block replication +---------------------------------------- +Copyright Fujitsu, Corp. 2016 +Copyright (c) 2016 Intel Corporation +Copyright (c) 2016 HUAWEI TECHNOLOGIES CO., LTD. + +This work is licensed under the terms of the GNU GPL, version 2 or later. +See the COPYING file in the top-level directory. + +Block replication is used for continuous checkpoints. It is designed +for COLO (COarse-grain LOck-stepping) where the Secondary VM is running. +It can also be applied for FT/HA (Fault-tolerance/High Assurance) scenario, +where the Secondary VM is not running. + +This document gives an overview of block replication's design. + +== Background == +High availability solutions such as micro checkpoint and COLO will do +consecutive checkpoints. The VM state of the Primary and Secondary VM is +identical right after a VM checkpoint, but becomes different as the VM +executes till the next checkpoint. To support disk contents checkpoint, +the modified disk contents in the Secondary VM must be buffered, and are +only dropped at next checkpoint time. To reduce the network transportation +effort during a vmstate checkpoint, the disk modification operations of +the Primary disk are asynchronously forwarded to the Secondary node. 
+ +== Workflow == +The following is the image of block replication workflow: + + +----------------------+ +------------------------+ + |Primary Write Requests| |Secondary Write Requests| + +----------------------+ +------------------------+ + | | + | (4) + | V + | /-------------\ + | Copy and Forward | | + |---------(1)----------+ | Disk Buffer | + | | | | + | (3) \-------------/ + | speculative ^ + | write through (2) + | | | + V V | + +--------------+ +----------------+ + | Primary Disk | | Secondary Disk | + +--------------+ +----------------+ + + 1) Primary write requests will be copied and forwarded to Secondary + QEMU. + 2) Before Primary write requests are written to Secondary disk, the + original sector content will be read from Secondary disk and + buffered in the Disk buffer, but it will not overwrite the existing + sector content (it could be from either "Secondary Write Requests" or + previous COW of "Primary Write Requests") in the Disk buffer. + 3) Primary write requests will be written to Secondary disk. + 4) Secondary write requests will be buffered in the Disk buffer and it + will overwrite the existing sector content in the buffer. + +== Architecture == +We are going to implement block replication from many basic +blocks that are already in QEMU. + + virtio-blk || + ^ || .---------- + | || | Secondary + 1 Quorum || '---------- + / \ || + / \ || + Primary 2 filter + disk ^ virtio-blk + | ^ + 3 NBD -------> 3 NBD | + client || server 2 filter + || ^ ^ +--------. || | | +Primary | || Secondary disk <--------- hidden-disk 5 <--------- active-disk 4 +--------' || | backing ^ backing + || | | + || | | + || '-------------------------' + || drive-backup sync=none 6 + +1) The disk on the primary is represented by a block device with two +children, providing replication between a primary disk and the host that +runs the secondary VM. 
The read pattern (fifo) for quorum can be extended +to make the primary always read from the local disk instead of going through +NBD. + +2) The new block filter (the name is replication) will control the block +replication. + +3) The secondary disk receives writes from the primary VM through QEMU's +embedded NBD server (speculative write-through). + +4) The disk on the secondary is represented by a custom block device +(called active-disk). It should start as an empty disk, and the format +should support bdrv_make_empty() and backing file. + +5) The hidden-disk is created automatically. It buffers the original content +that is modified by the primary VM. It should also start as an empty disk, +and the driver supports bdrv_make_empty() and backing file. + +6) The drive-backup job (sync=none) is run to allow hidden-disk to buffer +any state that would otherwise be lost by the speculative write-through +of the NBD server into the secondary disk. So before block replication, +the primary disk and secondary disk should contain the same data. + +== Failure Handling == +There are 7 internal errors when block replication is running: +1. I/O error on primary disk +2. Forwarding primary write requests failed +3. Backup failed +4. I/O error on secondary disk +5. I/O error on active disk +6. Making active disk or hidden disk empty failed +7. Doing failover failed +In case 1 and 5, we just report the error to the disk layer. In case 2, 3, +4 and 6, we just report block replication's error to FT/HA manager (which +decides when to do a new checkpoint, when to do failover). +In case 7, if active commit failed, we use replication failover failed state +in Secondary's write operation (what decides which target to write). + +== New block driver interface == +We add four block driver interfaces to control block replication: +a. replication_start_all() + Start block replication, called in migration/checkpoint thread. 
+ We must call block_replication_start_all() in secondary QEMU before + calling block_replication_start_all() in primary QEMU. The caller + must hold the I/O mutex lock if it is in migration/checkpoint + thread. +b. replication_do_checkpoint_all() + This interface is called after all VM state is transferred to + Secondary QEMU. The Disk buffer will be dropped in this interface. + The caller must hold the I/O mutex lock if it is in migration/checkpoint + thread. +c. replication_get_error_all() + This interface is called to check if error happened in replication. + The caller must hold the I/O mutex lock if it is in migration/checkpoint + thread. +d. replication_stop_all() + It is called on failover. We will flush the Disk buffer into + Secondary Disk and stop block replication. The vm should be stopped + before calling it if you use this API to shutdown the guest, or other + things except failover. The caller must hold the I/O mutex lock if it is + in migration/checkpoint thread. + +== Usage == +Primary: + -drive if=xxx,driver=quorum,read-pattern=fifo,id=colo1,vote-threshold=1,\ + children.0.file.filename=1.raw,\ + children.0.driver=raw + + Run qmp command in primary qemu: + { 'execute': 'human-monitor-command', + 'arguments': { + 'command-line': 'drive_add -n buddy driver=replication,mode=primary,file.driver=nbd,file.host=xxxx,file.port=xxxx,file.export=colo1,node-name=nbd_client1' + } + } + { 'execute': 'x-blockdev-change', + 'arguments': { + 'parent': 'colo1', + 'node': 'nbd_client1' + } + } + Note: + 1. There should be only one NBD Client for each primary disk. + 2. host is the secondary physical machine's hostname or IP + 3. Each disk must have its own export name. + 4. It is all a single argument to -drive and you should ignore the + leading whitespace. + 5. The qmp command line must be run after running qmp command line in + secondary qemu. + 6. After failover we need remove children.1 (replication driver). 
+ +Secondary: + -drive if=none,driver=raw,file.filename=1.raw,id=colo1 \ + -drive if=xxx,id=topxxx,driver=replication,mode=secondary,top-id=topxxx\ + file.file.filename=active_disk.qcow2,\ + file.driver=qcow2,\ + file.backing.file.filename=hidden_disk.qcow2,\ + file.backing.driver=qcow2,\ + file.backing.backing=colo1 + + Then run qmp command in secondary qemu: + { 'execute': 'nbd-server-start', + 'arguments': { + 'addr': { + 'type': 'inet', + 'data': { + 'host': 'xxx', + 'port': 'xxx' + } + } + } + } + { 'execute': 'nbd-server-add', + 'arguments': { + 'device': 'colo1', + 'writable': true + } + } + + Note: + 1. The export name in secondary QEMU command line is the secondary + disk's id. + 2. The export name for the same disk must be the same + 3. The qmp command nbd-server-start and nbd-server-add must be run + before running the qmp command migrate on primary QEMU + 4. Active disk, hidden disk and nbd target's length should be the + same. + 5. It is better to put active disk and hidden disk in ramdisk. + 6. It is all a single argument to -drive, and you should ignore + the leading whitespace. + +After Failover: +Primary: + The secondary host is down, so we should run the following qmp command + to remove the nbd child from the quorum: + { 'execute': 'x-blockdev-change', + 'arguments': { + 'parent': 'colo1', + 'child': 'children.1' + } + } + { 'execute': 'human-monitor-command', + 'arguments': { + 'command-line': 'drive_del xxxx' + } + } + Note: there is no qmp command to remove the blockdev now + +Secondary: + The primary host is down, so we should do the following thing: + { 'execute': 'nbd-server-stop' } + +TODO: +1. Continuous block replication +2. Shared disk diff --git a/docs/colo-proxy.txt b/docs/colo-proxy.txt new file mode 100644 index 0000000000..76767cb34f --- /dev/null +++ b/docs/colo-proxy.txt @@ -0,0 +1,188 @@ +COLO-proxy +---------- +Copyright (c) 2016 Intel Corporation +Copyright (c) 2016 HUAWEI TECHNOLOGIES CO., LTD. 
+Copyright (c) 2016 Fujitsu, Corp. + +This work is licensed under the terms of the GNU GPL, version 2 or later. +See the COPYING file in the top-level directory. + +This document gives an overview of COLO proxy's design. + +== Background == +COLO-proxy is a part of COLO project. It is used +to compare the network package to help COLO decide +whether to do checkpoint. With COLO-proxy's help, +COLO greatly improves the performance. + +The filter-redirector, filter-mirror, colo-compare +and filter-rewriter compose the COLO-proxy. + +== Architecture == + +COLO-Proxy is based on qemu netfilter and it's a plugin for qemu netfilter +(except colo-compare). It keep Secondary VM connect normally to +client and compare packets sent by PVM with sent by SVM. +If the packet difference, notify COLO-frame to do checkpoint and send +all primary packet has queued. Otherwise just send the queued primary +packet and drop the queued secondary packet. + +Below is a COLO proxy ascii figure: + + Primary qemu Secondary qemu ++--------------------------------------------------------------+ +----------------------------------------------------------------+ +| +----------------------------------------------------------+ | | +-----------------------------------------------------------+ | +| | | | | | | | +| | guest | | | | guest | | +| | | | | | | | +| +-------^--------------------------+-----------------------+ | | +---------------------+--------+----------------------------+ | +| | | | | ^ | | +| | | | | | | | +| | +------------------------------------------------------+ | | | | +|netfilter| | | | | | netfilter | | | +| +----------+ +----------------------------+ | | | +-----------------------------------------------------------+ | +| | | | | | out | | | | | | filter excute order | | +| | | | +-----------------------------+ | | | | | | +-------------------> | | +| | | | | | | | | | | | | | TCP | | +| | +-----+--+-+ +-----v----+ +-----v----+ |pri +----+----+sec| | | | +------------+ 
+---+----+---v+rewriter++ +------------+ | | +| | | | | | | | |in | |in | | | | | | | | | | | | | +| | | filter | | filter | | filter +------> colo <------+ +--------> filter +--> adjust | adjust +--> filter | | | +| | | mirror | |redirector| |redirector| | | compare | | | | | | redirector | | ack | seq | | redirector | | | +| | | | | | | | | | | | | | | | | | | | | | | | +| | +----^-----+ +----+-----+ +----------+ | +---------+ | | | | +------------+ +--------+--------------+ +---+--------+ | | +| | | tx | rx rx | | | | | tx all | rx | | +| | | | | | | | +-----------------------------------------------------------+ | +| | | +--------------+ | | | | | | +| | | filter excute order | | | | | | | +| | | +----------------> | | | +--------------------------------------------------------+ | +| +-----------------------------------------+ | | | +| | | | | | ++--------------------------------------------------------------+ +----------------------------------------------------------------+ + |guest receive | guest send + | | ++--------+----------------------------v------------------------+ +| | NOTE: filter direction is rx/tx/all +| tap | rx:receive packets sent to the netdev +| | tx:receive packets sent by the netdev ++--------------------------------------------------------------+ + +1.Guest receive packet route: + +Primary: + +Tap --> Mirror Client Filter +Mirror client will send packet to guest,at the +same time, copy and forward packet to secondary +mirror server. + +Secondary: + +Mirror Server Filter --> TCP Rewriter +If receive packet is TCP packet,we will adjust ack +and update TCP checksum, then send to secondary +guest. Otherwise directly send to guest. + +2.Guest send packet route: + +Primary: + +Guest --> Redirect Server Filter +Redirect server filter receive primary guest packet +but do nothing, just pass to next filter. + +Redirect Server Filter --> COLO-Compare +COLO-compare receive primary guest packet then +waiting scondary redirect packet to compare it. 
+If the packets are the same, send the queued primary packet and clear
+the queued secondary packet. Otherwise send the primary packet
+and do a checkpoint.
+
+COLO-Compare --> Another Redirector Filter
+The redirector gets the packet from colo-compare through a
+chardev socket.
+
+Redirector Filter --> Tap
+Send the packet.
+
+Secondary:
+
+Guest --> TCP Rewriter Filter
+If the packet is a TCP packet, we adjust the seq
+and update the TCP checksum, then send it to the
+redirect client filter. Otherwise we send it directly to the
+redirect client filter.
+
+Redirect Client Filter --> Redirect Server Filter
+Forward the packet to the primary.
+
+== Components introduction ==
+
+Filter-mirror is a netfilter plugin.
+It gives qemu the ability to mirror
+packets to a chardev.
+
+Filter-redirector is a netfilter plugin.
+It gives qemu the ability to redirect net packets.
+The redirector can redirect the filter's net packets to outdev,
+and redirect indev's packets to the filter.
+
+                      filter
+                        +
+          redirector    |
+               +--------------+
+               |        |     |
+               |        |     |
+               |        |     |
+  indev +---------+     +---------->  outdev
+               |  |           |
+               |  |           |
+               |  |           |
+               +--------------+
+                  |
+                  v
+                filter
+
+COLO-compare does the packet comparison job.
+Packets coming from the primary char indev will be sent to outdev.
+Packets coming from the secondary char dev will be dropped after comparing.
+COLO-compare needs two input chardevs and one output chardev:
+primary_in=chardev1-id (source: primary send packet)
+secondary_in=chardev2-id (source: secondary send packet)
+outdev=chardev3-id
+
+Filter-rewriter rewrites some of the secondary packets so that the
+secondary guest's tcp connection is established successfully.
+In this module we rewrite the tcp packet's ack on the way to the secondary
+from the primary, and rewrite the tcp packet's seq on the way to the
+primary from the secondary.
+
+== Usage ==
+
+Here we use demo IPs and ports to describe the setup more clearly.
+Primary(ip:3.3.3.3):
+-netdev tap,id=hn0,vhost=off,script=/etc/qemu-ifup,downscript=/etc/qemu-ifdown
+-device e1000,id=e0,netdev=hn0,mac=52:a4:00:12:78:66
+-chardev socket,id=mirror0,host=3.3.3.3,port=9003,server,nowait
+-chardev socket,id=compare1,host=3.3.3.3,port=9004,server,nowait
+-chardev socket,id=compare0,host=3.3.3.3,port=9001,server,nowait
+-chardev socket,id=compare0-0,host=3.3.3.3,port=9001
+-chardev socket,id=compare_out,host=3.3.3.3,port=9005,server,nowait
+-chardev socket,id=compare_out0,host=3.3.3.3,port=9005
+-object filter-mirror,id=m0,netdev=hn0,queue=tx,outdev=mirror0
+-object filter-redirector,netdev=hn0,id=redire0,queue=rx,indev=compare_out
+-object filter-redirector,netdev=hn0,id=redire1,queue=rx,outdev=compare0
+-object colo-compare,id=comp0,primary_in=compare0-0,secondary_in=compare1,outdev=compare_out0
+
+Secondary(ip:3.3.3.8):
+-netdev tap,id=hn0,vhost=off,script=/etc/qemu-ifup,downscript=/etc/qemu-ifdown
+-device e1000,netdev=hn0,mac=52:a4:00:12:78:66
+-chardev socket,id=red0,host=3.3.3.3,port=9003
+-chardev socket,id=red1,host=3.3.3.3,port=9004
+-object filter-redirector,id=f1,netdev=hn0,queue=tx,indev=red0
+-object filter-redirector,id=f2,netdev=hn0,queue=rx,outdev=red1
+
+Note:
+ a.COLO-proxy must work with COLO-frame and Block-replication.
+ b.The primary COLO must be started first, because COLO-proxy needs
+   the chardev socket servers running before the secondary is started.
+ c.Filter-rewriter only rewrites tcp packets.
diff --git a/docs/generic-loader.txt b/docs/generic-loader.txt
new file mode 100644
index 0000000000..31bbcd42f6
--- /dev/null
+++ b/docs/generic-loader.txt
@@ -0,0 +1,92 @@
+Copyright (c) 2016 Xilinx Inc.
+
+This work is licensed under the terms of the GNU GPL, version 2 or later. See
+the COPYING file in the top-level directory.
+
+
+The 'loader' device allows the user to load multiple images or values into
+QEMU at startup.
+
+Loading Data into Memory Values
+-------------------------------
+The loader device allows memory values to be set from the command line. This
+can be done by following the syntax below:
+
+     -device loader,addr=<addr>,data=<data>,data-len=<data-len>
+                   [,data-be=<data-be>][,cpu-num=<cpu-num>]
+
+    <addr>      - The address to store the data in.
+    <data>      - The value to be written to the address. The maximum size of
+                  the data is 8 bytes.
+    <data-len>  - The length of the data in bytes. This argument must be
+                  included if the data argument is.
+    <data-be>   - Set to true if the data to be stored on the guest should be
+                  written as big endian data. The default is to write little
+                  endian data.
+    <cpu-num>   - The number of the CPU's address space where the data should
+                  be loaded. If not specified the address space of the first
+                  CPU is used.
+
+All values are parsed using the standard QemuOpts parsing. This allows the user
+to specify any values in any format supported. By default the values
+will be parsed as decimal. To use hex values the user should prefix the number
+with a '0x'.
+
+An example of loading value 0x8000000e to address 0xfd1a0104 is:
+    -device loader,addr=0xfd1a0104,data=0x8000000e,data-len=4
+
+Setting a CPU's Program Counter
+-------------------------------
+The loader device allows the CPU's PC to be set from the command line. This
+can be done by following the syntax below:
+
+     -device loader,addr=<addr>,cpu-num=<cpu-num>
+
+    <addr>      - The value to use as the CPU's PC.
+    <cpu-num>   - The number of the CPU whose PC should be set to the
+                  specified value.
+
+All values are parsed using the standard QemuOpts parsing. This allows the user
+to specify any values in any format supported. By default the values
+will be parsed as decimal. To use hex values the user should prefix the number
+with a '0x'.
+
+An example of setting CPU 0's PC to 0x8000 is:
+    -device loader,addr=0x8000,cpu-num=0
+
+Loading Files
+-------------
+The loader device also allows files to be loaded into memory. It can load raw
+files and ELF executable files. Raw files are loaded verbatim. ELF executable
+files are loaded by an ELF loader. The syntax is shown below:
+
+    -device loader,file=<file>[,addr=<addr>][,cpu-num=<cpu-num>][,force-raw=<raw>]
+
+    <file>      - A file to be loaded into memory.
+    <addr>      - The address in memory at which the file should be loaded.
+                  This is ignored if you are using an ELF (unless force-raw
+                  is true). This is required if you aren't loading an ELF.
+    <cpu-num>   - This specifies the CPU that should be used. This is an
+                  optional argument and will cause the CPU's PC to be set to
+                  where the image is stored or in the case of an ELF file to
+                  the value in the header. This option should only be used
+                  for the boot image.
+                  This will also cause the image to be written to the specified
+                  CPU's address space. If not specified, the default is CPU 0.
+    <force-raw> - Setting force-raw=on forces the file to be treated as a raw
+                  image. This can be used to load ELF files as if they were raw.
+
+All values are parsed using the standard QemuOpts parsing. This allows the user
+to specify any values in any format supported. By default the values
+will be parsed as decimal. To use hex values the user should prefix the number
+with a '0x'.
+
+An example of loading an ELF file which CPU0 will boot is shown below:
+    -device loader,file=./images/boot.elf,cpu-num=0
+
+Restrictions and ToDos
+----------------------
+ - At the moment it is just assumed that if you specify a cpu-num then you
+   want to set the PC as well. This might not always be the case. In future
+   the internal state 'set_pc' (which exists in the generic loader now) should
+   be exposed to the user so that they can choose if the PC is set or not.
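The three loader syntaxes documented in generic-loader.txt can be assembled programmatically. The following is a minimal sketch, assuming a launcher script written in Python; the helper names (`loader_data_args`, `loader_pc_args`, `loader_file_args`) are hypothetical and not part of QEMU. The option names and the 8-byte limit come from the text above, and the helpers do no escaping of commas in file names.

```python
def _fmt(value):
    """Render integers in hex (the doc says a 0x prefix selects hex)."""
    return hex(value) if isinstance(value, int) else str(value)


def loader_data_args(addr, data, data_len, data_be=False, cpu_num=None):
    """Build arguments that write a data_len-byte value to addr."""
    # The documentation limits the data to 8 bytes.
    if not 1 <= data_len <= 8:
        raise ValueError("data-len must be between 1 and 8")
    opts = ["loader", f"addr={_fmt(addr)}", f"data={_fmt(data)}",
            f"data-len={data_len}"]
    if data_be:
        opts.append("data-be=true")  # default is little endian
    if cpu_num is not None:
        opts.append(f"cpu-num={cpu_num}")
    return ["-device", ",".join(opts)]


def loader_pc_args(addr, cpu_num):
    """Build arguments that set the PC of CPU cpu_num to addr."""
    return ["-device", f"loader,addr={_fmt(addr)},cpu-num={cpu_num}"]


def loader_file_args(path, addr=None, cpu_num=None, force_raw=False):
    """Build arguments that load a raw or ELF file."""
    opts = ["loader", f"file={path}"]
    if addr is not None:
        opts.append(f"addr={_fmt(addr)}")
    if cpu_num is not None:
        opts.append(f"cpu-num={cpu_num}")
    if force_raw:
        opts.append("force-raw=on")
    return ["-device", ",".join(opts)]


if __name__ == "__main__":
    # Reproduce the three examples from the documentation.
    print(" ".join(loader_data_args(0xfd1a0104, 0x8000000e, 4)))
    print(" ".join(loader_pc_args(0x8000, 0)))
    print(" ".join(loader_file_args("./images/boot.elf", cpu_num=0)))
```

Each helper returns an argument list that can be appended to a QEMU command line, for instance one passed to `subprocess.run`. Rendering integers in hex keeps the output identical to the documented examples.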
diff --git a/docs/live-block-ops.txt b/docs/live-block-ops.txt index a257087401..2211d14428 100644 --- a/docs/live-block-ops.txt +++ b/docs/live-block-ops.txt @@ -4,15 +4,20 @@ LIVE BLOCK OPERATIONS High level description of live block operations. Note these are not supported for use with the raw format at the moment. +Note also that this document is incomplete and it currently only +covers the 'stream' operation. Other operations supported by QEMU such +as 'commit', 'mirror' and 'backup' are not described here yet. Please +refer to the qapi/block-core.json file for an overview of those. + Snapshot live merge =================== Given a snapshot chain, described in this document in the following format: -[A] -> [B] -> [C] -> [D] +[A] <- [B] <- [C] <- [D] <- [E] -Where the rightmost object ([D] in the example) described is the current +Where the rightmost object ([E] in the example) described is the current image which the guest OS has write access to. To the left of it is its base image, and so on accordingly until the leftmost image, which has no base. @@ -21,11 +26,14 @@ The snapshot live merge operation transforms such a chain into a smaller one with fewer elements, such as this transformation relative to the first example: -[A] -> [D] +[A] <- [E] + +Data is copied in the right direction with destination being the +rightmost image, but any other intermediate image can be specified +instead. In this example data is copied from [C] into [D], so [D] can +be backed by [B]: -Currently only forward merge with target being the active image is -supported, that is, data copy is performed in the right direction with -destination being the rightmost image. +[A] <- [B] <- [D] <- [E] The operation is implemented in QEMU through image streaming facilities. @@ -35,14 +43,20 @@ streaming operation completes it raises a QMP event. 'block_stream' copies data from the backing file(s) into the active image. When finished, it adjusts the backing file pointer. 
-The 'base' parameter specifies an image which data need not be streamed from. -This image will be used as the backing file for the active image when the -operation is finished. +The 'base' parameter specifies an image which data need not be +streamed from. This image will be used as the backing file for the +destination image when the operation is finished. + +In the first example above, the command would be: + +(qemu) block_stream virtio0 file-A.img -In the example above, the command would be: +In order to specify a destination image different from the active +(rightmost) one we can use its node name instead. -(qemu) block_stream virtio0 A +In the second example above, the command would be: +(qemu) block_stream node-D file-B.img Live block copy =============== diff --git a/docs/multiple-iothreads.txt b/docs/multiple-iothreads.txt index 40b8419916..0e7cdb2c28 100644 --- a/docs/multiple-iothreads.txt +++ b/docs/multiple-iothreads.txt @@ -105,13 +105,10 @@ a BH in the target AioContext beforehand and then call qemu_bh_schedule(). No acquire/release or locking is needed for the qemu_bh_schedule() call. But be sure to acquire the AioContext for aio_bh_new() if necessary. -The relationship between AioContext and the block layer -------------------------------------------------------- -The AioContext originates from the QEMU block layer because it provides a -scoped way of running event loop iterations until all work is done. This -feature is used to complete all in-flight block I/O requests (see -bdrv_drain_all()). Nowadays AioContext is a generic event loop that can be -used by any QEMU subsystem. +AioContext and the block layer +------------------------------ +The AioContext originates from the QEMU block layer, even though nowadays +AioContext is a generic event loop that can be used by any QEMU subsystem. The block layer has support for AioContext integrated. 
Each BlockDriverState is associated with an AioContext using bdrv_set_aio_context() and @@ -122,13 +119,22 @@ Block layer code must therefore expect to run in an IOThread and avoid using old APIs that implicitly use the main loop. See the "How to program for IOThreads" above for information on how to do that. -If main loop code such as a QMP function wishes to access a BlockDriverState it -must first call aio_context_acquire(bdrv_get_aio_context(bs)) to ensure the -IOThread does not run in parallel. - -Long-running jobs (usually in the form of coroutines) are best scheduled in the -BlockDriverState's AioContext to avoid the need to acquire/release around each -bdrv_*() call. Be aware that there is currently no mechanism to get notified -when bdrv_set_aio_context() moves this BlockDriverState to a different -AioContext (see bdrv_detach_aio_context()/bdrv_attach_aio_context()), so you -may need to add this if you want to support long-running jobs. +If main loop code such as a QMP function wishes to access a BlockDriverState +it must first call aio_context_acquire(bdrv_get_aio_context(bs)) to ensure +that callbacks in the IOThread do not run in parallel. + +Code running in the monitor typically needs to ensure that past +requests from the guest are completed. When a block device is running +in an IOThread, the IOThread can also process requests from the guest +(via ioeventfd). To achieve both objects, wrap the code between +bdrv_drained_begin() and bdrv_drained_end(), thus creating a "drained +section". The functions must be called between aio_context_acquire() +and aio_context_release(). You can freely release and re-acquire the +AioContext within a drained section. + +Long-running jobs (usually in the form of coroutines) are best scheduled in +the BlockDriverState's AioContext to avoid the need to acquire/release around +each bdrv_*() call. 
The functions bdrv_add/remove_aio_context_notifier, +or alternatively blk_add/remove_aio_context_notifier if you use BlockBackends, +can be used to get a notification whenever bdrv_set_aio_context() moves a +BlockDriverState to a different AioContext. diff --git a/docs/pcie.txt b/docs/pcie.txt new file mode 100644 index 0000000000..9fb20aaed9 --- /dev/null +++ b/docs/pcie.txt @@ -0,0 +1,310 @@ +PCI EXPRESS GUIDELINES +====================== + +1. Introduction +================ +The doc proposes best practices on how to use PCI Express/PCI device +in PCI Express based machines and explains the reasoning behind them. + +The following presentations accompany this document: + (1) Q35 overview. + http://wiki.qemu.org/images/4/4e/Q35.pdf + (2) A comparison between PCI and PCI Express technologies. + http://wiki.qemu.org/images/f/f6/PCIvsPCIe.pdf + +Note: The usage examples are not intended to replace the full +documentation, please use QEMU help to retrieve all options. + +2. Device placement strategy +============================ +QEMU does not have a clear socket-device matching mechanism +and allows any PCI/PCI Express device to be plugged into any +PCI/PCI Express slot. +Plugging a PCI device into a PCI Express slot might not always work and +is weird anyway since it cannot be done for "bare metal". +Plugging a PCI Express device into a PCI slot will hide the Extended +Configuration Space thus is also not recommended. + +The recommendation is to separate the PCI Express and PCI hierarchies. +PCI Express devices should be plugged only into PCI Express Root Ports and +PCI Express Downstream ports. + +2.1 Root Bus (pcie.0) +===================== +Place only the following kinds of devices directly on the Root Complex: + (1) PCI Devices (e.g. network card, graphics card, IDE controller), + not controllers. Place only legacy PCI devices on + the Root Complex. These will be considered Integrated Endpoints. + Note: Integrated Endpoints are not hot-pluggable. 
+ + Although the PCI Express spec does not forbid PCI Express devices as + Integrated Endpoints, existing hardware mostly integrates legacy PCI + devices with the Root Complex. Guest OSes are suspected to behave + strangely when PCI Express devices are integrated + with the Root Complex. + + (2) PCI Express Root Ports (ioh3420), for starting exclusively PCI Express + hierarchies. + + (3) DMI-PCI Bridges (i82801b11-bridge), for starting legacy PCI + hierarchies. + + (4) Extra Root Complexes (pxb-pcie), if multiple PCI Express Root Buses + are needed. + + pcie.0 bus + ---------------------------------------------------------------------------- + | | | | + ----------- ------------------ ------------------ -------------- + | PCI Dev | | PCIe Root Port | | DMI-PCI Bridge | | pxb-pcie | + ----------- ------------------ ------------------ -------------- + +2.1.1 To plug a device into pcie.0 as a Root Complex Integrated Endpoint use: + -device <dev>[,bus=pcie.0] +2.1.2 To expose a new PCI Express Root Bus use: + -device pxb-pcie,id=pcie.1,bus_nr=x[,numa_node=y][,addr=z] + Only PCI Express Root Ports and DMI-PCI bridges can be connected + to the pcie.1 bus: + -device ioh3420,id=root_port1[,bus=pcie.1][,chassis=x][,slot=y][,addr=z] \ + -device i82801b11-bridge,id=dmi_pci_bridge1,bus=pcie.1 + + +2.2 PCI Express only hierarchy +============================== +Always use PCI Express Root Ports to start PCI Express hierarchies. + +A PCI Express Root bus supports up to 32 devices. Since each +PCI Express Root Port is a function and a multi-function +device may support up to 8 functions, the maximum possible +number of PCI Express Root Ports per PCI Express Root Bus is 256. + +Prefer grouping PCI Express Root Ports into multi-function devices +to keep a simple flat hierarchy that is enough for most scenarios. +Only use PCI Express Switches (x3130-upstream, xio3130-downstream) +if there is no more room for PCI Express Root Ports. +Please see section 4. for further justifications. 
+
+Plug only PCI Express devices into PCI Express Ports.
+
+
+                                      pcie.0 bus
+   ----------------------------------------------------------------------------------
+         |                |                                  |
+   -------------    -------------                      -------------
+   | Root Port |    | Root Port |                      | Root Port |
+   -------------    -------------                      -------------
+         |                          -------------------------|------------------------
+    ------------                    |                -----------------               |
+    | PCIe Dev |                    |   PCI Express  | Upstream Port |               |
+    ------------                    |     Switch     -----------------               |
+                                    |                  |           |                 |
+                                    |   -------------------     -------------------  |
+                                    |   | Downstream Port |     | Downstream Port |  |
+                                    |   -------------------     -------------------  |
+                                    -------------|-----------------------|------------
+                                           ------------
+                                           | PCIe Dev |
+                                           ------------
+
+2.2.1 Plugging a PCI Express device into a PCI Express Root Port:
+          -device ioh3420,id=root_port1,chassis=x,slot=y[,bus=pcie.0][,addr=z]  \
+          -device <dev>,bus=root_port1
+2.2.2 Using multi-function PCI Express Root Ports:
+      -device ioh3420,id=root_port1,multifunction=on,chassis=x,slot=y[,bus=pcie.0][,addr=z.0] \
+      -device ioh3420,id=root_port2,chassis=x1,slot=y1[,bus=pcie.0][,addr=z.1] \
+      -device ioh3420,id=root_port3,chassis=x2,slot=y2[,bus=pcie.0][,addr=z.2]
+2.2.3 Plugging a PCI Express device into a Switch:
+      -device ioh3420,id=root_port1,chassis=x,slot=y[,bus=pcie.0][,addr=z]  \
+      -device x3130-upstream,id=upstream_port1,bus=root_port1[,addr=x]          \
+      -device xio3130-downstream,id=downstream_port1,bus=upstream_port1,chassis=x1,slot=y1[,addr=z1] \
+      -device <dev>,bus=downstream_port1
+
+Notes:
+    - (slot, chassis) pair is mandatory and must be
+      unique for each PCI Express Root Port.
+    - 'addr' parameter can be 0 for all the examples above.
+
+
+2.3 PCI only hierarchy
+======================
+Legacy PCI devices can be plugged into pcie.0 as Integrated Endpoints,
+but, as mentioned in section 5, doing so means the legacy PCI
+device in question will be incapable of hot-unplugging.
+Besides that, use DMI-PCI Bridges (i82801b11-bridge) in combination
+with PCI-PCI Bridges (pci-bridge) to start PCI hierarchies.
+
+Prefer flat hierarchies. For most scenarios a single DMI-PCI Bridge
+(having 32 slots) and several PCI-PCI Bridges attached to it
+(each also supporting 32 slots) will support hundreds of legacy devices.
+The recommendation is to populate one PCI-PCI Bridge under the DMI-PCI Bridge
+until it is full and then plug in a new PCI-PCI Bridge...
+
+                    pcie.0 bus
+   ----------------------------------------------
+        |                       |
+   -----------          ------------------
+   | PCI Dev |          | DMI-PCI BRIDGE |
+   -----------          ------------------
+                           |         |
+            ------------------    ------------------
+            | PCI-PCI Bridge |    | PCI-PCI Bridge |  ...
+            ------------------    ------------------
+                    |                     |
+               -----------           -----------
+               | PCI Dev |           | PCI Dev |
+               -----------           -----------
+
+2.3.1 To plug a PCI device into pcie.0 as an Integrated Endpoint use:
+      -device <dev>[,bus=pcie.0]
+2.3.2 Plugging a PCI device into a PCI-PCI Bridge:
+      -device i82801b11-bridge,id=dmi_pci_bridge1[,bus=pcie.0]                        \
+      -device pci-bridge,id=pci_bridge1,bus=dmi_pci_bridge1[,chassis_nr=x][,addr=y]   \
+      -device <dev>,bus=pci_bridge1[,addr=x]
+      Note that 'addr' cannot be 0 unless shpc=off parameter is passed to
+      the PCI Bridge.
+
+3. IO space issues
+===================
+The PCI Express Root Ports and PCI Express Downstream ports are seen by
+Firmware/Guest OS as PCI-PCI Bridges. As required by the PCI spec, each
+such Port should have a 4K IO range reserved for it, even though only one
+(multifunction) device can be plugged into each Port. This results in
+poor IO space utilization.
+
+The firmware used by QEMU (SeaBIOS/OVMF) may try further optimizations
+by not allocating IO space for each PCI Express Root / PCI Express
+Downstream port if:
+    (1) the port is empty, or
+    (2) the device behind the port has no IO BARs.
+
+The IO space is very limited, to 65536 byte-wide IO ports, and may even be
+fragmented by fixed IO ports owned by platform devices resulting in at most
+10 PCI Express Root Ports or PCI Express Downstream Ports per system
+if devices with IO BARs are used in the PCI Express hierarchy. Using the
+proposed device placing strategy solves this issue by using only
+PCI Express devices within the PCI Express hierarchy.
+
+The PCI Express spec requires that PCI Express devices work properly
+without using IO ports. The PCI hierarchy has no such limitations.
+
+
+4. Bus numbers issues
+======================
+Each PCI domain can have at most 256 buses and the QEMU PCI Express
+machines do not support multiple PCI domains even if extra Root
+Complexes (pxb-pcie) are used.
+
+Each element of the PCI Express hierarchy (Root Complexes,
+PCI Express Root Ports, PCI Express Downstream/Upstream ports)
+uses one bus number. Since only one (multifunction) device
+can be attached to a PCI Express Root Port or PCI Express Downstream
+Port it is advised to plan in advance for the expected number of
+devices to prevent bus number starvation.
+
+Avoiding PCI Express Switches (and thereby striving for a 'flatter' PCI
+Express hierarchy) enables the hierarchy to not spend bus numbers on
+Upstream Ports.
+
+The bus_nr properties of the pxb-pcie devices partition the 0..255 bus
+number space. All bus numbers assigned to the buses recursively behind a
+given pxb-pcie device's root bus must fit between the bus_nr property of
+that pxb-pcie device, and the lowest of the higher bus_nr properties
+that the command line sets for other pxb-pcie devices.
+
+
+5. Hot-plug
+============
+The PCI Express root buses (pcie.0 and the buses exposed by pxb-pcie devices)
+do not support hot-plug, so any devices plugged into Root Complexes
+cannot be hot-plugged/hot-unplugged:
+    (1) PCI Express Integrated Endpoints
+    (2) PCI Express Root Ports
+    (3) DMI-PCI Bridges
+    (4) pxb-pcie
+
+Be aware that PCI Express Downstream Ports can't be hot-plugged into
+an existing PCI Express Upstream Port.
+
+PCI devices can be hot-plugged into PCI-PCI Bridges. The PCI hot-plug is ACPI
+based and can work side by side with the PCI Express native hot-plug.
+
+PCI Express devices can be natively hot-plugged/hot-unplugged into/from
+PCI Express Root Ports (and PCI Express Downstream Ports).
+
+5.1 Planning for hot-plug:
+    (1) PCI hierarchy
+        Leave enough PCI-PCI Bridge slots empty or add one
+        or more empty PCI-PCI Bridges to the DMI-PCI Bridge.
+
+        For each such PCI-PCI Bridge the Guest Firmware is expected to reserve
+        4K IO space and 2M MMIO range to be used for all devices behind it.
+
+        Because of the hard IO limit of around 10 PCI Bridges (~ 40K space)
+        per system don't use more than 9 PCI-PCI Bridges, leaving 4K for the
+        Integrated Endpoints. (The PCI Express Hierarchy needs no IO space).
+
+    (2) PCI Express hierarchy:
+        Leave enough PCI Express Root Ports empty. Use multifunction
+        PCI Express Root Ports (up to 8 ports per pcie.0 slot)
+        on the Root Complex(es), for keeping the
+        hierarchy as flat as possible, thereby saving PCI bus numbers.
+        Don't use PCI Express Switches if you don't have
+        to, each one of those uses an extra PCI bus (for its Upstream Port)
+        that could be put to better use with another Root Port or Downstream
+        Port, which may come in handy for hot-plugging another device.
+
+
+5.2 Hot-plug example:
+Using HMP: (add -monitor stdio to QEMU command line)
+  device_add <dev>,id=<id>,bus=<PCI Express Root Port Id/PCI Express Downstream Port Id/PCI-PCI Bridge Id>
+
+
+6. Device assignment
+====================
+Host devices are mostly PCI Express and should be plugged only into
+PCI Express Root Ports or PCI Express Downstream Ports.
+PCI-PCI Bridge slots can be used for legacy PCI host devices.
+
+6.1 How to detect if a device is PCI Express:
+  > lspci -s 03:00.0 -v (as root)
+
+    03:00.0 Network controller: Intel Corporation Wireless 7260 (rev 83)
+    Subsystem: Intel Corporation Dual Band Wireless-AC 7260
+    Flags: bus master, fast devsel, latency 0, IRQ 50
+    Memory at f0400000 (64-bit, non-prefetchable) [size=8K]
+    Capabilities: [c8] Power Management version 3
+    Capabilities: [d0] MSI: Enable+ Count=1/1 Maskable- 64bit+
+    Capabilities: [40] Express Endpoint, MSI 00
+    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+    Capabilities: [100] Advanced Error Reporting
+    Capabilities: [140] Device Serial Number 7c-7a-91-ff-ff-90-db-20
+    Capabilities: [14c] Latency Tolerance Reporting
+    Capabilities: [154] Vendor Specific Information: ID=cafe Rev=1 Len=014
+
+If you can see the "Express Endpoint" capability in the
+output, then the device is indeed PCI Express.
+
+
+7. Virtio devices
+=================
+Virtio devices plugged into the PCI hierarchy or as Integrated Endpoints
+will remain PCI and have transitional behaviour as default.
+Transitional virtio devices work in both IO and MMIO modes depending on
+the guest support. The Guest firmware will assign both IO and MMIO resources
+to transitional virtio devices.
+
+Virtio devices plugged into PCI Express ports are PCI Express devices and
+have "1.0" behavior by default without IO support.
+In both cases disable-legacy and disable-modern properties can be used
+to override the behaviour.
+
+Note that setting disable-legacy=off will enable legacy mode (enabling
+legacy behavior) for PCI Express virtio devices causing them to
+require IO space, which, given the limited available IO space, may quickly
+lead to resource exhaustion, and is therefore strongly discouraged.
+
+
+8. Conclusion
+==============
+The proposal offers a usage model that is easy to understand and follow
+and at the same time overcomes the PCI Express architecture limitations.
diff --git a/docs/qapi-code-gen.txt b/docs/qapi-code-gen.txt
index de298dcaec..2841c5144a 100644
--- a/docs/qapi-code-gen.txt
+++ b/docs/qapi-code-gen.txt
@@ -964,9 +964,9 @@ Example:
 Used to generate the marshaling/dispatch functions for the commands
 defined in the schema. The generated code implements
-qmp_marshal_COMMAND() (mentioned in qmp-commands.hx, and registered
-automatically), and declares qmp_COMMAND() that the user must
-implement. The following files are generated:
+qmp_marshal_COMMAND() (registered automatically), and declares
+qmp_COMMAND() that the user must implement. The following files are
+generated:
 
 $(prefix)qmp-marshal.c: command marshal/dispatch functions for each
                         QMP command defined in the schema. Functions
@@ -1005,7 +1005,7 @@ Example:
     Error *err = NULL;
     Visitor *v;
 
-    v = qmp_output_visitor_new(ret_out);
+    v = qobject_output_visitor_new(ret_out);
     visit_type_UserDefOne(v, "unused", &ret_in, &err);
     if (!err) {
         visit_complete(v, ret_out);
@@ -1024,7 +1024,7 @@ Example:
     Visitor *v;
     UserDefOneList *arg1 = NULL;
 
-    v = qmp_input_visitor_new(QOBJECT(args), true);
+    v = qobject_input_visitor_new(QOBJECT(args), true);
     visit_start_struct(v, NULL, NULL, 0, &err);
     if (err) {
         goto out;
diff --git a/docs/qcow2-cache.txt b/docs/qcow2-cache.txt
index 5bb06072d3..1fdd6f9ce7 100644
--- a/docs/qcow2-cache.txt
+++ b/docs/qcow2-cache.txt
@@ -160,5 +160,6 @@ If unset, the default value for this parameter is 0 and it disables
 this feature.
 
 Note that this functionality currently relies on the MADV_DONTNEED
-argument for madvise() to actually free the memory, so it is not
-useful in systems that don't follow that behavior.
+argument for madvise() to actually free the memory. This is a
+Linux-specific feature, so cache-clean-interval is not supported in
+other systems.
diff --git a/docs/qmp-commands.txt b/docs/qmp-commands.txt
new file mode 100644
index 0000000000..40d7abd401
--- /dev/null
+++ b/docs/qmp-commands.txt
@@ -0,0 +1,3838 @@
+                        QMP Supported Commands
+                        ----------------------
+
+This document describes all commands currently supported by QMP.
+
+Most of the time their usage is exactly the same as in the user Monitor, which
+means that any other document which also describes commands (the manpage,
+QEMU's manual, etc) can and should be consulted.
+
+QMP has two types of commands: regular and query commands. Regular commands
+usually change the Virtual Machine's state in some way, while query commands
+just return information. The sections below are divided accordingly.
+
+It's important to observe that all communication examples are formatted in
+a reader-friendly way, so that they're easier to understand. However, in real
+protocol usage, they're emitted as a single line.
+
+Also, the following notation is used to denote data flow:
+
+-> data issued by the Client
+<- Server data response
+
+Please refer to the QMP specification (docs/qmp-spec.txt) for detailed
+information on the Server command and response formats.
+
+NOTE: This document is temporary and will be replaced soon.
+
+1. Stability Considerations
+===========================
+
+The current QMP command set (described in this file) may be useful for a
+number of use cases, however it's limited and several commands have badly
+defined semantics, especially with regard to command completion.
+
+These problems are going to be solved incrementally in the next QEMU releases
+and we're going to establish a deprecation policy for badly defined commands.
+
+If you're planning to adopt QMP, please observe the following:
+
+    1. The deprecation policy will take effect and be documented soon, please
+       check the documentation of each used command as soon as a new release of
+       QEMU is available
+
+    2. DO NOT rely on anything which is not explicitly documented
+
+    3. Errors, in particular, are not documented. Applications should NOT check
+       for specific error classes or data (it's strongly recommended to only
+       check for the "error" key)
+
+2. Regular Commands
+===================
+
+Server's responses in the examples below are always a success response, please
+refer to the QMP specification for more details on error responses.
+
+quit
+----
+
+Quit the emulator.
+
+Arguments: None.
+
+Example:
+
+-> { "execute": "quit" }
+<- { "return": {} }
+
+eject
+-----
+
+Eject a removable medium.
+
+Arguments:
+
+- "force": force ejection (json-bool, optional)
+- "device": block device name (deprecated, use @id instead)
+            (json-string, optional)
+- "id": the name or QOM path of the guest device (json-string, optional)
+
+Example:
+
+-> { "execute": "eject", "arguments": { "id": "ide0-1-0" } }
+<- { "return": {} }
+
+Note: The "force" argument defaults to false.
+
+change
+------
+
+Change a removable medium or VNC configuration.
+
+Arguments:
+
+- "device": device name (json-string)
+- "target": filename or item (json-string)
+- "arg": additional argument (json-string, optional)
+
+Examples:
+
+1. Change a removable medium
+
+-> { "execute": "change",
+             "arguments": { "device": "ide1-cd0",
+                            "target": "/srv/images/Fedora-12-x86_64-DVD.iso" } }
+<- { "return": {} }
+
+2. Change VNC password
+
+-> { "execute": "change",
+             "arguments": { "device": "vnc", "target": "password",
+                            "arg": "foobar1" } }
+<- { "return": {} }
+
+screendump
+----------
+
+Save screen into PPM image.
+
+Arguments:
+
+- "filename": file path (json-string)
+
+Example:
+
+-> { "execute": "screendump", "arguments": { "filename": "/tmp/image" } }
+<- { "return": {} }
+
+stop
+----
+
+Stop the emulator.
+
+Arguments: None.
+
+Example:
+
+-> { "execute": "stop" }
+<- { "return": {} }
+
+cont
+----
+
+Resume emulation.
+
+Arguments: None.
+
+Example:
+
+-> { "execute": "cont" }
+<- { "return": {} }
+
+system_wakeup
+-------------
+
+Wakeup guest from suspend.
+ +Arguments: None. + +Example: + +-> { "execute": "system_wakeup" } +<- { "return": {} } + +system_reset +------------ + +Reset the system. + +Arguments: None. + +Example: + +-> { "execute": "system_reset" } +<- { "return": {} } + +system_powerdown +---------------- + +Send system power down event. + +Arguments: None. + +Example: + +-> { "execute": "system_powerdown" } +<- { "return": {} } + +device_add +---------- + +Add a device. + +Arguments: + +- "driver": the name of the new device's driver (json-string) +- "bus": the device's parent bus (device tree path, json-string, optional) +- "id": the device's ID, must be unique (json-string) +- device properties + +Example: + +-> { "execute": "device_add", "arguments": { "driver": "e1000", "id": "net1" } } +<- { "return": {} } + +Notes: + +(1) For detailed information about this command, please refer to the + 'docs/qdev-device-use.txt' file. + +(2) It's possible to list device properties by running QEMU with the + "-device DEVICE,\?" command-line argument, where DEVICE is the device's name + +device_del +---------- + +Remove a device. + +Arguments: + +- "id": the device's ID or QOM path (json-string) + +Example: + +-> { "execute": "device_del", "arguments": { "id": "net1" } } +<- { "return": {} } + +Example: + +-> { "execute": "device_del", "arguments": { "id": "/machine/peripheral-anon/device[0]" } } +<- { "return": {} } + +send-key +---------- + +Send keys to VM. + +Arguments: + +keys array: + - "key": key sequence (a json-array of key union values, + union can be number or qcode enum) + +- hold-time: time to delay key up events, milliseconds. Defaults to 100 + (json-int, optional) + +Example: + +-> { "execute": "send-key", + "arguments": { "keys": [ { "type": "qcode", "data": "ctrl" }, + { "type": "qcode", "data": "alt" }, + { "type": "qcode", "data": "delete" } ] } } +<- { "return": {} } + +cpu +--- + +Set the default CPU. 
+ +Arguments: + +- "index": the CPU's index (json-int) + +Example: + +-> { "execute": "cpu", "arguments": { "index": 0 } } +<- { "return": {} } + +Note: CPUs' indexes are obtained with the 'query-cpus' command. + +cpu-add +------- + +Adds virtual cpu + +Arguments: + +- "id": cpu id (json-int) + +Example: + +-> { "execute": "cpu-add", "arguments": { "id": 2 } } +<- { "return": {} } + +memsave +------- + +Save to disk virtual memory dump starting at 'val' of size 'size'. + +Arguments: + +- "val": the starting address (json-int) +- "size": the memory size, in bytes (json-int) +- "filename": file path (json-string) +- "cpu": virtual CPU index (json-int, optional) + +Example: + +-> { "execute": "memsave", + "arguments": { "val": 10, + "size": 100, + "filename": "/tmp/virtual-mem-dump" } } +<- { "return": {} } + +pmemsave +-------- + +Save to disk physical memory dump starting at 'val' of size 'size'. + +Arguments: + +- "val": the starting address (json-int) +- "size": the memory size, in bytes (json-int) +- "filename": file path (json-string) + +Example: + +-> { "execute": "pmemsave", + "arguments": { "val": 10, + "size": 100, + "filename": "/tmp/physical-mem-dump" } } +<- { "return": {} } + +inject-nmi +---------- + +Inject an NMI on the default CPU (x86/s390) or all CPUs (ppc64). + +Arguments: None. + +Example: + +-> { "execute": "inject-nmi" } +<- { "return": {} } + +Note: inject-nmi fails when the guest doesn't support injecting. + +ringbuf-write +------------- + +Write to a ring buffer character device. + +Arguments: + +- "device": ring buffer character device name (json-string) +- "data": data to write (json-string) +- "format": data format (json-string, optional) + - Possible values: "utf8" (default), "base64" + +Example: + +-> { "execute": "ringbuf-write", + "arguments": { "device": "foo", + "data": "abcdefgh", + "format": "utf8" } } +<- { "return": {} } + +ringbuf-read +------------- + +Read from a ring buffer character device. 
+ +Arguments: + +- "device": ring buffer character device name (json-string) +- "size": how many bytes to read at most (json-int) + - Number of data bytes, not number of characters in encoded data +- "format": data format (json-string, optional) + - Possible values: "utf8" (default), "base64" + - Naturally, format "utf8" works only when the ring buffer + contains valid UTF-8 text. Invalid UTF-8 sequences get + replaced. Bug: replacement doesn't work. Bug: can screw + up on encountering NUL characters, after the ring buffer + lost data, and when reading stops because the size limit + is reached. + +Example: + +-> { "execute": "ringbuf-read", + "arguments": { "device": "foo", + "size": 1000, + "format": "utf8" } } +<- {"return": "abcdefgh"} + +xen-save-devices-state +------- + +Save the state of all devices to file. The RAM and the block devices +of the VM are not saved by this command. + +Arguments: + +- "filename": the file to save the state of the devices to as binary +data. See xen-save-devices-state.txt for a description of the binary +format. + +Example: + +-> { "execute": "xen-save-devices-state", + "arguments": { "filename": "/tmp/save" } } +<- { "return": {} } + +xen-load-devices-state +---------------------- + +Load the state of all devices from file. The RAM and the block devices +of the VM are not loaded by this command. + +Arguments: + +- "filename": the file to load the state of the devices from as binary +data. See xen-save-devices-state.txt for a description of the binary +format. + +Example: + +-> { "execute": "xen-load-devices-state", + "arguments": { "filename": "/tmp/resume" } } +<- { "return": {} } + +xen-set-global-dirty-log +------- + +Enable or disable the global dirty log mode. + +Arguments: + +- "enable": Enable it or disable it. + +Example: + +-> { "execute": "xen-set-global-dirty-log", + "arguments": { "enable": true } } +<- { "return": {} } + +migrate +------- + +Migrate to URI. 
+ +Arguments: + +- "blk": block migration, full disk copy (json-bool, optional) +- "inc": incremental disk copy (json-bool, optional) +- "uri": Destination URI (json-string) + +Example: + +-> { "execute": "migrate", "arguments": { "uri": "tcp:0:4446" } } +<- { "return": {} } + +Notes: + +(1) The 'query-migrate' command should be used to check migration's progress + and final result (this information is provided by the 'status' member) +(2) All boolean arguments default to false +(3) The user Monitor's "detach" argument is invalid in QMP and should not + be used + +migrate_cancel +-------------- + +Cancel the current migration. + +Arguments: None. + +Example: + +-> { "execute": "migrate_cancel" } +<- { "return": {} } + +migrate-incoming +---------------- + +Continue an incoming migration + +Arguments: + +- "uri": Source/listening URI (json-string) + +Example: + +-> { "execute": "migrate-incoming", "arguments": { "uri": "tcp::4446" } } +<- { "return": {} } + +Notes: + +(1) QEMU must be started with -incoming defer to allow migrate-incoming to + be used +(2) The uri format is the same as for -incoming + +migrate-set-cache-size +---------------------- + +Set cache size to be used by XBZRLE migration, the cache size will be rounded +down to the nearest power of 2 + +Arguments: + +- "value": cache size in bytes (json-int) + +Example: + +-> { "execute": "migrate-set-cache-size", "arguments": { "value": 536870912 } } +<- { "return": {} } + +migrate-start-postcopy +---------------------- + +Switch an in-progress migration to postcopy mode. Ignored after the end of +migration (or once already in postcopy). 
+ +Example: +-> { "execute": "migrate-start-postcopy" } +<- { "return": {} } + +query-migrate-cache-size +------------------------ + +Show cache size to be used by XBZRLE migration. + +Returns the cache size in bytes (json-int). + +Example: + +-> { "execute": "query-migrate-cache-size" } +<- { "return": 67108864 } + +migrate_set_speed +----------------- + +Set maximum speed for migrations. + +Arguments: + +- "value": maximum speed, in bytes per second (json-int) + +Example: + +-> { "execute": "migrate_set_speed", "arguments": { "value": 1024 } } +<- { "return": {} } + +migrate_set_downtime +-------------------- + +Set maximum tolerated downtime (in seconds) for migrations. + +Arguments: + +- "value": maximum downtime (json-number) + +Example: + +-> { "execute": "migrate_set_downtime", "arguments": { "value": 0.1 } } +<- { "return": {} } + +x-colo-lost-heartbeat +--------------------- + +Tell COLO that heartbeat is lost; a failover or takeover is needed. + +Example: + +-> { "execute": "x-colo-lost-heartbeat" } +<- { "return": {} } + +client_migrate_info +------------------- + +Set migration information for remote display. This makes the server +ask the client to automatically reconnect using the new parameters +once migration has finished successfully. Only implemented for SPICE. + +Arguments: + +- "protocol": must be "spice" (json-string) +- "hostname": migration target hostname (json-string) +- "port": spice tcp port for plaintext channels (json-int, optional) +- "tls-port": spice tcp port for tls-secured channels (json-int, optional) +- "cert-subject": server certificate subject (json-string, optional) + +Example: + +-> { "execute": "client_migrate_info", + "arguments": { "protocol": "spice", + "hostname": "virt42.lab.kraxel.org", + "port": 1234 } } +<- { "return": {} } + +dump-guest-memory +----------------- + +Dump guest memory to file. The file can be processed with crash or gdb.
+ +Arguments: + +- "paging": do paging to get guest's memory mapping (json-bool) +- "protocol": destination file (starting with "file:") or destination file + descriptor (starting with "fd:") (json-string) +- "detach": if specified, command will return immediately, without waiting + for the dump to finish. The user can track progress using + "query-dump". (json-bool) +- "begin": the starting physical address. It's optional, and should be specified + together with length (json-int) +- "length": the memory size, in bytes. It's optional, and should be specified + together with begin (json-int) +- "format": the format of guest memory dump. It's optional, and can be + elf|kdump-zlib|kdump-lzo|kdump-snappy, but non-elf formats will + conflict with paging and filter, i.e. begin and length (json-string) + +Example: + +-> { "execute": "dump-guest-memory", "arguments": { "protocol": "fd:dump" } } +<- { "return": {} } + +Notes: + +(1) All boolean arguments default to false + +query-dump-guest-memory-capability +---------------------------------- + +Show available formats for 'dump-guest-memory' + +Example: + +-> { "execute": "query-dump-guest-memory-capability" } +<- { "return": { "formats": + ["elf", "kdump-zlib", "kdump-lzo", "kdump-snappy"] } } + +query-dump +---------- + +Query background dump status. + +Arguments: None. + +Example: + +-> { "execute": "query-dump" } +<- { "return": { "status": "active", "completed": 1024000, + "total": 2048000 } } + +dump-skeys +---------- + +Save guest storage keys to file. + +Arguments: + +- "filename": file path (json-string) + +Example: + +-> { "execute": "dump-skeys", "arguments": { "filename": "/tmp/skeys" } } +<- { "return": {} } + +netdev_add +---------- + +Add host network device. + +Arguments: + +- "type": the device type, "tap", "user", ...
(json-string) +- "id": the device's ID, must be unique (json-string) +- device options + +Example: + +-> { "execute": "netdev_add", + "arguments": { "type": "user", "id": "netdev1", + "dnssearch": "example.org" } } +<- { "return": {} } + +Note: The supported device options are the same ones supported by the '-netdev' + command-line argument, which are listed in the '-help' output or QEMU's + manual + +netdev_del +---------- + +Remove host network device. + +Arguments: + +- "id": the device's ID, must be unique (json-string) + +Example: + +-> { "execute": "netdev_del", "arguments": { "id": "netdev1" } } +<- { "return": {} } + + +object-add +---------- + +Create QOM object. + +Arguments: + +- "qom-type": the object's QOM type, i.e. the class name (json-string) +- "id": the object's ID, must be unique (json-string) +- "props": a dictionary of object property values (optional, json-dict) + +Example: + +-> { "execute": "object-add", "arguments": { "qom-type": "rng-random", "id": "rng1", + "props": { "filename": "/dev/hwrng" } } } +<- { "return": {} } + +object-del +---------- + +Remove QOM object. + +Arguments: + +- "id": the object's ID (json-string) + +Example: + +-> { "execute": "object-del", "arguments": { "id": "rng1" } } +<- { "return": {} } + + +block_resize +------------ + +Resize a block image while a guest is running. + +Arguments: + +- "device": the device's ID, must be unique (json-string) +- "node-name": the node name in the block driver state graph (json-string) +- "size": new size + +Example: + +-> { "execute": "block_resize", "arguments": { "device": "scratch", "size": 1073741824 } } +<- { "return": {} } + +block-stream +------------ + +Copy data from a backing file into a block device. + +Arguments: + +- "job-id": Identifier for the newly-created block job. If omitted, + the device name will be used. 
(json-string, optional) +- "device": The device name or node-name of a root node (json-string) +- "base": The file name of the backing image above which copying starts. + It cannot be set if 'base-node' is also set (json-string, optional) +- "base-node": the node name of the backing image above which copying starts. + It cannot be set if 'base' is also set. + (json-string, optional) (Since 2.8) +- "backing-file": The backing file string to write into the active layer. This + filename is not validated. + + If a pathname string is such that it cannot be resolved by + QEMU, that means that subsequent QMP or HMP commands must use + node-names for the image in question, as filename lookup + methods will fail. + + If not specified, QEMU will automatically determine the + backing file string to use, or error out if there is no + obvious choice. Care should be taken when specifying the + string, to specify a valid filename or protocol. + (json-string, optional) (Since 2.1) +- "speed": the maximum speed, in bytes per second (json-int, optional) +- "on-error": the action to take on an error (default 'report'). 'stop' and + 'enospc' can only be used if the block device supports io-status. + (json-string, optional) (Since 2.1) + +Example: + +-> { "execute": "block-stream", "arguments": { "device": "virtio0", + "base": "/tmp/master.qcow2" } } +<- { "return": {} } + +block-commit +------------ + +Live commit of data from overlay image nodes into backing nodes - i.e., writes +data between 'top' and 'base' into 'base'. + +Arguments: + +- "job-id": Identifier for the newly-created block job. If omitted, + the device name will be used. (json-string, optional) +- "device": The device name or node-name of a root node (json-string) +- "base": The file name of the backing image to write data into. 
+ If not specified, this is the deepest backing image + (json-string, optional) +- "top": The file name of the backing image within the image chain, + which contains the topmost data to be committed down. If + not specified, this is the active layer. (json-string, optional) + +- backing-file: The backing file string to write into the overlay + image of 'top'. If 'top' is the active layer, + specifying a backing file string is an error. This + filename is not validated. + + If a pathname string is such that it cannot be + resolved by QEMU, that means that subsequent QMP or + HMP commands must use node-names for the image in + question, as filename lookup methods will fail. + + If not specified, QEMU will automatically determine + the backing file string to use, or error out if + there is no obvious choice. Care should be taken + when specifying the string, to specify a valid + filename or protocol. + (json-string, optional) (Since 2.1) + + If top == base, that is an error. + If top == active, the job will not be completed by itself, + user needs to complete the job with the block-job-complete + command after getting the ready event. (Since 2.0) + + If the base image is smaller than top, then the base image + will be resized to be the same size as top. If top is + smaller than the base image, the base will not be + truncated. If you want the base image size to match the + size of the smaller top, you can safely truncate it + yourself once the commit operation successfully completes. + (json-string) +- "speed": the maximum speed, in bytes per second (json-int, optional) + + +Example: + +-> { "execute": "block-commit", "arguments": { "device": "virtio0", + "top": "/tmp/snap1.qcow2" } } +<- { "return": {} } + +drive-backup +------------ + +Start a point-in-time copy of a block device to a new destination. The +status of ongoing drive-backup operations can be checked with +query-block-jobs where the BlockJobInfo.type field has the value 'backup'. 
+The operation can be stopped before it has completed using the +block-job-cancel command. + +Arguments: + +- "job-id": Identifier for the newly-created block job. If omitted, + the device name will be used. (json-string, optional) +- "device": the device name or node-name of a root node which should be copied. + (json-string) +- "target": the target of the new image. If the file exists, or if it is a + device, the existing file/device will be used as the new + destination. If it does not exist, a new file will be created. + (json-string) +- "format": the format of the new destination, default is to probe if 'mode' is + 'existing', else the format of the source + (json-string, optional) +- "sync": what parts of the disk image should be copied to the destination; + possibilities include "full" for all the disk, "top" for only the sectors + allocated in the topmost image, "incremental" for only the dirty sectors in + the bitmap, or "none" to only replicate new I/O (MirrorSyncMode). +- "bitmap": dirty bitmap name for sync==incremental. Must be present if sync + is "incremental", must NOT be present otherwise. +- "mode": whether and how QEMU should create a new image + (NewImageMode, optional, default 'absolute-paths') +- "speed": the maximum speed, in bytes per second (json-int, optional) +- "compress": true to compress data, if the target format supports it. + (json-bool, optional, default false) +- "on-source-error": the action to take on an error on the source, default + 'report'. 'stop' and 'enospc' can only be used + if the block device supports io-status. + (BlockdevOnError, optional) +- "on-target-error": the action to take on an error on the target, default + 'report' (no limitations, since this applies to + a different block device than device). 
+ (BlockdevOnError, optional) + +Example: +-> { "execute": "drive-backup", "arguments": { "device": "drive0", + "sync": "full", + "target": "backup.img" } } +<- { "return": {} } + +blockdev-backup +--------------- + +The device version of drive-backup: this command takes an existing named device +as backup target. + +Arguments: + +- "job-id": Identifier for the newly-created block job. If omitted, + the device name will be used. (json-string, optional) +- "device": the device name or node-name of a root node which should be copied. + (json-string) +- "target": the name of the backup target device. (json-string) +- "sync": what parts of the disk image should be copied to the destination; + possibilities include "full" for all the disk, "top" for only the + sectors allocated in the topmost image, or "none" to only replicate + new I/O (MirrorSyncMode). +- "speed": the maximum speed, in bytes per second (json-int, optional) +- "compress": true to compress data, if the target format supports it. + (json-bool, optional, default false) +- "on-source-error": the action to take on an error on the source, default + 'report'. 'stop' and 'enospc' can only be used + if the block device supports io-status. + (BlockdevOnError, optional) +- "on-target-error": the action to take on an error on the target, default + 'report' (no limitations, since this applies to + a different block device than device). + (BlockdevOnError, optional) + +Example: +-> { "execute": "blockdev-backup", "arguments": { "device": "src-id", + "sync": "full", + "target": "tgt-id" } } +<- { "return": {} } + +transaction +----------- + +Atomically operate on one or more block devices. Operations that are +currently supported: + + - drive-backup + - blockdev-backup + - blockdev-snapshot-sync + - blockdev-snapshot-internal-sync + - abort + - block-dirty-bitmap-add + - block-dirty-bitmap-clear + +Refer to the qemu/qapi-schema.json file for minimum required QEMU +versions for these operations. 
A list of dictionaries is accepted +that contains the actions to be performed. If there is any failure +performing any of the operations, all operations for the group are +abandoned. + +For external snapshots, the dictionary contains the device, the file to use for +the new snapshot, and the format. The default format, if not specified, is +qcow2. + +Each new snapshot defaults to being created by QEMU (wiping any +contents if the file already exists), but it is also possible to reuse +an externally-created file. In the latter case, you should ensure that +the new image file has the same contents as the current one; QEMU cannot +perform any meaningful check. Typically this is achieved by using the +current image file as the backing file for the new image. + +On failure, the original disks, as they were before the snapshot attempt, will be used. + +For internal snapshots, the dictionary contains the device and the snapshot's +name. If an internal snapshot with a matching name already exists, the request will +be rejected. Only some image formats support it, for example, qcow2, rbd, +and sheepdog. + +On failure, QEMU will try to delete the newly created internal snapshot in the +transaction. When an I/O error occurs during deletion, the user needs to fix +it later with qemu-img or another command. + +Arguments: + +actions array: + - "type": the operation to perform (json-string). Possible + values: "drive-backup", "blockdev-backup", + "blockdev-snapshot-sync", + "blockdev-snapshot-internal-sync", + "abort", "block-dirty-bitmap-add", + "block-dirty-bitmap-clear" + - "data": a dictionary. The contents depend on the value + of "type".
When "type" is "blockdev-snapshot-sync": + - "device": device name to snapshot (json-string) + - "node-name": graph node name to snapshot (json-string) + - "snapshot-file": name of new image file (json-string) + - "snapshot-node-name": graph node name of the new snapshot (json-string) + - "format": format of new image (json-string, optional) + - "mode": whether and how QEMU should create the snapshot file + (NewImageMode, optional, default "absolute-paths") + When "type" is "blockdev-snapshot-internal-sync": + - "device": the device name or node-name of a root node to snapshot + (json-string) + - "name": name of the new snapshot (json-string) + +Example: + +-> { "execute": "transaction", + "arguments": { "actions": [ + { "type": "blockdev-snapshot-sync", "data" : { "device": "ide-hd0", + "snapshot-file": "/some/place/my-image", + "format": "qcow2" } }, + { "type": "blockdev-snapshot-sync", "data" : { "node-name": "myfile", + "snapshot-file": "/some/place/my-image2", + "snapshot-node-name": "node3432", + "mode": "existing", + "format": "qcow2" } }, + { "type": "blockdev-snapshot-sync", "data" : { "device": "ide-hd1", + "snapshot-file": "/some/place/my-image2", + "mode": "existing", + "format": "qcow2" } }, + { "type": "blockdev-snapshot-internal-sync", "data" : { + "device": "ide-hd2", + "name": "snapshot0" } } ] } } +<- { "return": {} } + +block-dirty-bitmap-add +---------------------- +Since 2.4 + +Create a dirty bitmap with a name on the device, and start tracking the writes. 
+ +Arguments: + +- "node": device/node on which to create dirty bitmap (json-string) +- "name": name of the new dirty bitmap (json-string) +- "granularity": granularity to track writes with (int, optional) + +Example: + +-> { "execute": "block-dirty-bitmap-add", "arguments": { "node": "drive0", + "name": "bitmap0" } } +<- { "return": {} } + +block-dirty-bitmap-remove +------------------------- +Since 2.4 + +Stop write tracking and remove the dirty bitmap that was created with +block-dirty-bitmap-add. + +Arguments: + +- "node": device/node on which to remove dirty bitmap (json-string) +- "name": name of the dirty bitmap to remove (json-string) + +Example: + +-> { "execute": "block-dirty-bitmap-remove", "arguments": { "node": "drive0", + "name": "bitmap0" } } +<- { "return": {} } + +block-dirty-bitmap-clear +------------------------ +Since 2.4 + +Reset the dirty bitmap associated with a node so that an incremental backup +from this point in time forward will only back up clusters modified after this +clear operation. + +Arguments: + +- "node": device/node on which to clear the dirty bitmap (json-string) +- "name": name of the dirty bitmap to clear (json-string) + +Example: + +-> { "execute": "block-dirty-bitmap-clear", "arguments": { "node": "drive0", + "name": "bitmap0" } } +<- { "return": {} } + +blockdev-snapshot-sync +---------------------- + +Synchronous snapshot of a block device. snapshot-file specifies the +target of the new image. If the file exists, or if it is a device, the +snapshot will be created in the existing file/device. If it does not +exist, a new file will be created. format specifies the format of the +snapshot image; the default is qcow2.
+ +Arguments: + +- "device": device name to snapshot (json-string) +- "node-name": graph node name to snapshot (json-string) +- "snapshot-file": name of new image file (json-string) +- "snapshot-node-name": graph node name of the new snapshot (json-string) +- "mode": whether and how QEMU should create the snapshot file + (NewImageMode, optional, default "absolute-paths") +- "format": format of new image (json-string, optional) + +Example: + +-> { "execute": "blockdev-snapshot-sync", "arguments": { "device": "ide-hd0", + "snapshot-file": + "/some/place/my-image", + "format": "qcow2" } } +<- { "return": {} } + +blockdev-snapshot +----------------- +Since 2.5 + +Create a snapshot, by installing 'node' as the backing image of +'overlay'. Additionally, if 'node' is associated with a block +device, the block device changes to using 'overlay' as its new active +image. + +Arguments: + +- "node": device that will have a snapshot created (json-string) +- "overlay": device that will have 'node' as its backing image (json-string) + +Example: + +-> { "execute": "blockdev-add", + "arguments": { "driver": "qcow2", + "node-name": "node1534", + "file": { "driver": "file", + "filename": "hd1.qcow2" }, + "backing": "" } } + +<- { "return": {} } + +-> { "execute": "blockdev-snapshot", "arguments": { "node": "ide-hd0", + "overlay": "node1534" } } +<- { "return": {} } + +blockdev-snapshot-internal-sync +------------------------------- + +Synchronously take an internal snapshot of a block device when the format of +image used supports it. If the name is an empty string, or a snapshot with +name already exists, the operation will fail. 
+ +Arguments: + +- "device": the device name or node-name of a root node to snapshot + (json-string) +- "name": name of the new snapshot (json-string) + +Example: + +-> { "execute": "blockdev-snapshot-internal-sync", + "arguments": { "device": "ide-hd0", + "name": "snapshot0" } + } +<- { "return": {} } + +blockdev-snapshot-delete-internal-sync +-------------------------------------- + +Synchronously delete an internal snapshot of a block device when the format of +image used supports it. The snapshot is identified by name or id or both. One +of name or id is required. If the snapshot is not found, the operation will +fail. + +Arguments: + +- "device": the device name or node-name of a root node (json-string) +- "id": ID of the snapshot (json-string, optional) +- "name": name of the snapshot (json-string, optional) + +Example: + +-> { "execute": "blockdev-snapshot-delete-internal-sync", + "arguments": { "device": "ide-hd0", + "name": "snapshot0" } + } +<- { "return": { + "id": "1", + "name": "snapshot0", + "vm-state-size": 0, + "date-sec": 1000012, + "date-nsec": 10, + "vm-clock-sec": 100, + "vm-clock-nsec": 20 + } + } + +drive-mirror +------------ + +Start mirroring a block device's writes to a new destination. target +specifies the target of the new image. If the file exists, or if it is +a device, it will be used as the new destination for writes. If it does not +exist, a new file will be created. format specifies the format of the +mirror image, default is to probe if mode='existing', else the format +of the source. + +Arguments: + +- "job-id": Identifier for the newly-created block job. If omitted, + the device name will be used. (json-string, optional) +- "device": the device name or node-name of a root node whose writes should be + mirrored. 
(json-string) +- "target": name of new image file (json-string) +- "format": format of new image (json-string, optional) +- "node-name": the name of the new block driver state in the node graph + (json-string, optional) +- "replaces": the block driver node name to replace when finished + (json-string, optional) +- "mode": how an image file should be created into the target + file/device (NewImageMode, optional, default 'absolute-paths') +- "speed": maximum speed of the streaming job, in bytes per second + (json-int) +- "granularity": granularity of the dirty bitmap, in bytes (json-int, optional) +- "buf-size": maximum amount of data in flight from source to target, in bytes + (json-int, default 10M) +- "sync": what parts of the disk image should be copied to the destination; + possibilities include "full" for all the disk, "top" for only the sectors + allocated in the topmost image, or "none" to only replicate new I/O + (MirrorSyncMode). +- "on-source-error": the action to take on an error on the source + (BlockdevOnError, default 'report') +- "on-target-error": the action to take on an error on the target + (BlockdevOnError, default 'report') +- "unmap": whether the target sectors should be discarded where source has only + zeroes. (json-bool, optional, default true) + +The default value of the granularity is the image cluster size clamped +between 4096 and 65536, if the image format defines one. If the format +does not define a cluster size, the default value of the granularity +is 65536. + + +Example: + +-> { "execute": "drive-mirror", "arguments": { "device": "ide-hd0", + "target": "/some/place/my-image", + "sync": "full", + "format": "qcow2" } } +<- { "return": {} } + +blockdev-mirror +------------ + +Start mirroring a block device's writes to another block device. target +specifies the target of mirror operation. + +Arguments: + +- "job-id": Identifier for the newly-created block job. If omitted, + the device name will be used. 
(json-string, optional) +- "device": The device name or node-name of a root node whose writes should be + mirrored (json-string) +- "target": device name to mirror to (json-string) +- "replaces": the block driver node name to replace when finished + (json-string, optional) +- "speed": maximum speed of the streaming job, in bytes per second + (json-int) +- "granularity": granularity of the dirty bitmap, in bytes (json-int, optional) +- "buf-size": maximum amount of data in flight from source to target, in bytes + (json-int, default 10M) +- "sync": what parts of the disk image should be copied to the destination; + possibilities include "full" for all the disk, "top" for only the sectors + allocated in the topmost image, or "none" to only replicate new I/O + (MirrorSyncMode). +- "on-source-error": the action to take on an error on the source + (BlockdevOnError, default 'report') +- "on-target-error": the action to take on an error on the target + (BlockdevOnError, default 'report') + +The default value of the granularity is the image cluster size clamped +between 4096 and 65536, if the image format defines one. If the format +does not define a cluster size, the default value of the granularity +is 65536. + +Example: + +-> { "execute": "blockdev-mirror", "arguments": { "device": "ide-hd0", + "target": "target0", + "sync": "full" } } +<- { "return": {} } + +change-backing-file +------------------- +Since: 2.1 + +Change the backing file in the image file metadata. This does not cause +QEMU to reopen the image file to reparse the backing filename (it may, +however, perform a reopen to change permissions from r/o -> r/w -> r/o, +if needed). The new backing file string is written into the image file +metadata, and the QEMU internal strings are updated. + +Arguments: + +- "image-node-name": The name of the block driver state node of the + image to modify. The "device" argument is used to + verify "image-node-name" is in the chain described by + "device".
+ (json-string, optional) + +- "device": The device name or node-name of the root node that owns + image-node-name. + (json-string) + +- "backing-file": The string to write as the backing file. This string is + not validated, so care should be taken when specifying + the string or the image chain may not be able to be + reopened again. + (json-string) + +Returns: Nothing on success + If "device" does not exist or cannot be determined, DeviceNotFound + +balloon +------- + +Request VM to change its memory allocation (in bytes). + +Arguments: + +- "value": New memory allocation (json-int) + +Example: + +-> { "execute": "balloon", "arguments": { "value": 536870912 } } +<- { "return": {} } + +set_link +-------- + +Change the link status of a network adapter. + +Arguments: + +- "name": network device name (json-string) +- "up": status is up (json-bool) + +Example: + +-> { "execute": "set_link", "arguments": { "name": "e1000.0", "up": false } } +<- { "return": {} } + +get_link +-------- + +Get the link status of a network adapter. + +Arguments: + +- "name": network device name (json-string) + +Example: + +-> { "execute": "get_link", "arguments": { "name": "e1000.0" } } +<- { "return": {on|off} } + +getfd +----- + +Receive a file descriptor via SCM rights and assign it a name. + +Arguments: + +- "fdname": file descriptor name (json-string) + +Example: + +-> { "execute": "getfd", "arguments": { "fdname": "fd1" } } +<- { "return": {} } + +Notes: + +(1) If the name specified by the "fdname" argument already exists, + the file descriptor assigned to it will be closed and replaced + by the received file descriptor. +(2) The 'closefd' command can be used to explicitly close the file + descriptor when it is no longer needed. + +closefd +------- + +Close a file descriptor previously passed via SCM rights. 
+ +Arguments: + +- "fdname": file descriptor name (json-string) + +Example: + +-> { "execute": "closefd", "arguments": { "fdname": "fd1" } } +<- { "return": {} } + +add-fd +------- + +Add a file descriptor, that was passed via SCM rights, to an fd set. + +Arguments: + +- "fdset-id": The ID of the fd set to add the file descriptor to. + (json-int, optional) +- "opaque": A free-form string that can be used to describe the fd. + (json-string, optional) + +Return a json-object with the following information: + +- "fdset-id": The ID of the fd set that the fd was added to. (json-int) +- "fd": The file descriptor that was received via SCM rights and added to the + fd set. (json-int) + +Example: + +-> { "execute": "add-fd", "arguments": { "fdset-id": 1 } } +<- { "return": { "fdset-id": 1, "fd": 3 } } + +Notes: + +(1) The list of fd sets is shared by all monitor connections. +(2) If "fdset-id" is not specified, a new fd set will be created. + +remove-fd +--------- + +Remove a file descriptor from an fd set. + +Arguments: + +- "fdset-id": The ID of the fd set that the file descriptor belongs to. + (json-int) +- "fd": The file descriptor that is to be removed. (json-int, optional) + +Example: + +-> { "execute": "remove-fd", "arguments": { "fdset-id": 1, "fd": 3 } } +<- { "return": {} } + +Notes: + +(1) The list of fd sets is shared by all monitor connections. +(2) If "fd" is not specified, all file descriptors in "fdset-id" will be + removed. + +query-fdsets +------------- + +Return information describing all fd sets. + +Arguments: None + +Example: + +-> { "execute": "query-fdsets" } +<- { "return": [ + { + "fds": [ + { + "fd": 30, + "opaque": "rdonly:/path/to/file" + }, + { + "fd": 24, + "opaque": "rdwr:/path/to/file" + } + ], + "fdset-id": 1 + }, + { + "fds": [ + { + "fd": 28 + }, + { + "fd": 29 + } + ], + "fdset-id": 0 + } + ] + } + +Note: The list of fd sets is shared by all monitor connections. + +block_passwd +------------ + +Set the password of encrypted block devices. 
+ +Arguments: + +- "device": device name (json-string) +- "node-name": name in the block driver state graph (json-string) +- "password": password (json-string) + +Example: + +-> { "execute": "block_passwd", "arguments": { "device": "ide0-hd0", + "password": "12345" } } +<- { "return": {} } + +block_set_io_throttle +------------ + +Change I/O throttle limits for a block drive. + +Arguments: + +- "device": block device name (deprecated, use @id instead) + (json-string, optional) +- "id": the name or QOM path of the guest device (json-string, optional) +- "bps": total throughput limit in bytes per second (json-int) +- "bps_rd": read throughput limit in bytes per second (json-int) +- "bps_wr": write throughput limit in bytes per second (json-int) +- "iops": total I/O operations per second (json-int) +- "iops_rd": read I/O operations per second (json-int) +- "iops_wr": write I/O operations per second (json-int) +- "bps_max": total throughput limit during bursts, in bytes (json-int, optional) +- "bps_rd_max": read throughput limit during bursts, in bytes (json-int, optional) +- "bps_wr_max": write throughput limit during bursts, in bytes (json-int, optional) +- "iops_max": total I/O operations per second during bursts (json-int, optional) +- "iops_rd_max": read I/O operations per second during bursts (json-int, optional) +- "iops_wr_max": write I/O operations per second during bursts (json-int, optional) +- "bps_max_length": maximum length of the @bps_max burst period, in seconds (json-int, optional) +- "bps_rd_max_length": maximum length of the @bps_rd_max burst period, in seconds (json-int, optional) +- "bps_wr_max_length": maximum length of the @bps_wr_max burst period, in seconds (json-int, optional) +- "iops_max_length": maximum length of the @iops_max burst period, in seconds (json-int, optional) +- "iops_rd_max_length": maximum length of the @iops_rd_max burst period, in seconds (json-int, optional) +- "iops_wr_max_length": maximum length of the @iops_wr_max burst 
period, in seconds (json-int, optional) +- "iops_size": I/O size in bytes when limiting (json-int, optional) +- "group": throttle group name (json-string, optional) + +Example: + +-> { "execute": "block_set_io_throttle", "arguments": { "id": "ide0-1-0", + "bps": 1000000, + "bps_rd": 0, + "bps_wr": 0, + "iops": 0, + "iops_rd": 0, + "iops_wr": 0, + "bps_max": 8000000, + "bps_rd_max": 0, + "bps_wr_max": 0, + "iops_max": 0, + "iops_rd_max": 0, + "iops_wr_max": 0, + "bps_max_length": 60, + "iops_size": 0 } } +<- { "return": {} } + +set_password +------------ + +Set the password for vnc/spice protocols. + +Arguments: + +- "protocol": protocol name (json-string) +- "password": password (json-string) +- "connected": [ keep | disconnect | fail ] (json-string, optional) + +Example: + +-> { "execute": "set_password", "arguments": { "protocol": "vnc", + "password": "secret" } } +<- { "return": {} } + +expire_password +--------------- + +Set the password expire time for vnc/spice protocols. + +Arguments: + +- "protocol": protocol name (json-string) +- "time": [ now | never | +secs | secs ] (json-string) + +Example: + +-> { "execute": "expire_password", "arguments": { "protocol": "vnc", + "time": "+60" } } +<- { "return": {} } + +add_client +---------- + +Add a graphics client + +Arguments: + +- "protocol": protocol name (json-string) +- "fdname": file descriptor name (json-string) +- "skipauth": whether to skip authentication (json-bool, optional) +- "tls": whether to perform TLS (json-bool, optional) + +Example: + +-> { "execute": "add_client", "arguments": { "protocol": "vnc", + "fdname": "myclient" } } +<- { "return": {} } + +qmp_capabilities +---------------- + +Enable QMP capabilities. + +Arguments: None. + +Example: + +-> { "execute": "qmp_capabilities" } +<- { "return": {} } + +Note: This command must be issued before issuing any other command. + +human-monitor-command +--------------------- + +Execute a Human Monitor command. 
+ +Arguments: + +- "command-line": the command name and its arguments, just like the + Human Monitor's shell (json-string) +- "cpu-index": select the CPU number to be used by commands which access CPU + data, like 'info registers'. The Monitor selects CPU 0 if this + argument is not provided (json-int, optional) + +Example: + +-> { "execute": "human-monitor-command", "arguments": { "command-line": "info kvm" } } +<- { "return": "kvm support: enabled\r\n" } + +Notes: + +(1) The Human Monitor is NOT a stable interface; this means that command + names, arguments and responses can change or be removed at ANY time. + Applications that rely on long term stability guarantees should NOT + use this command. + +(2) Limitations: + + o This command is stateless; this means that commands that depend + on state information (such as getfd) might not work + + o Commands that prompt the user for data (e.g. 'cont' when the block + device is encrypted) don't currently work + +3. Query Commands +================= + + +query-version +------------- + +Show QEMU version. + +Return a json-object with the following information: + +- "qemu": A json-object containing three integer values: + - "major": QEMU's major version (json-int) + - "minor": QEMU's minor version (json-int) + - "micro": QEMU's micro version (json-int) +- "package": package's version (json-string) + +Example: + +-> { "execute": "query-version" } +<- { + "return":{ + "qemu":{ + "major":0, + "minor":11, + "micro":5 + }, + "package":"" + } + } + +query-commands +-------------- + +List available QMP commands. + +Each command is represented by a json-object, the returned value is a json-array +of all commands. + +Each json-object contains: + +- "name": command's name (json-string) + +Example: + +-> { "execute": "query-commands" } +<- { + "return":[ + { + "name":"query-balloon" + }, + { + "name":"system_powerdown" + } + ] + } + +Note: This example has been shortened as the real response is too long.
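The request and reply shapes shown for query-commands can be handled with any JSON library. The following is a minimal sketch in Python, not part of the QEMU tree: the helper names build_qmp_command and command_names are hypothetical, and the sample reply is the shortened one from the example above. It assumes that a failed command is reported as a json-object under an "error" key with a "desc" member, as described by the QMP specification.

```python
import json


def build_qmp_command(name, arguments=None):
    """Serialize a QMP command as a JSON string (hypothetical helper)."""
    message = {"execute": name}
    if arguments:
        # "arguments" is only included when the command takes any.
        message["arguments"] = arguments
    return json.dumps(message)


def command_names(reply_text):
    """Return the sorted command names from a query-commands reply."""
    reply = json.loads(reply_text)
    if "error" in reply:
        # Assumption: errors arrive as {"error": {"class": ..., "desc": ...}}.
        raise RuntimeError(reply["error"].get("desc", "unknown QMP error"))
    return sorted(entry["name"] for entry in reply["return"])


request = build_qmp_command("query-commands")
sample_reply = (
    '{"return": [{"name": "query-balloon"}, {"name": "system_powerdown"}]}'
)
names = command_names(sample_reply)
```

The sketch only covers message construction and parsing. Transport (connecting to the monitor socket, reading the greeting, and issuing qmp_capabilities first) is left out on purpose.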
+ +query-events +-------------- + +List available QMP events. + +Each event is represented by a json-object, the returned value is a json-array +of all events. + +Each json-object contains: + +- "name": event's name (json-string) + +Example: + +-> { "execute": "query-events" } +<- { + "return":[ + { + "name":"SHUTDOWN" + }, + { + "name":"RESET" + } + ] + } + +Note: This example has been shortened as the real response is too long. + +query-qmp-schema +---------------- + +Return the QMP wire schema. The returned value is a json-array of +named schema entities. Entities are commands, events and various +types. See docs/qapi-code-gen.txt for information on their structure +and intended use. + +query-chardev +------------- + +Each device is represented by a json-object. The returned value is a json-array +of all devices. + +Each json-object contains the following: + +- "label": device's label (json-string) +- "filename": device's file (json-string) +- "frontend-open": open/closed state of the frontend device attached to this + backend (json-bool) + +Example: + +-> { "execute": "query-chardev" } +<- { + "return": [ + { + "label": "charchannel0", + "filename": "unix:/var/lib/libvirt/qemu/seabios.rhel6.agent,server", + "frontend-open": false + }, + { + "label": "charmonitor", + "filename": "unix:/var/lib/libvirt/qemu/seabios.rhel6.monitor,server", + "frontend-open": true + }, + { + "label": "charserial0", + "filename": "pty:/dev/pts/2", + "frontend-open": true + } + ] + } + +query-chardev-backends +------------- + +List available character device backends. + +Each backend is represented by a json-object, the returned value is a json-array +of all backends. + +Each json-object contains: + +- "name": backend name (json-string) + +Example: + +-> { "execute": "query-chardev-backends" } +<- { + "return":[ + { + "name":"udp" + }, + { + "name":"tcp" + }, + { + "name":"unix" + }, + { + "name":"spiceport" + } + ] + } + +query-block +----------- + +Show the block devices.
+ +Each block device's information is stored in a json-object and the returned value +is a json-array of all devices. + +Each json-object contains the following: + +- "device": device name (json-string) +- "type": device type (json-string) + - deprecated, retained for backward compatibility + - Possible values: "unknown" +- "removable": true if the device is removable, false otherwise (json-bool) +- "locked": true if the device is locked, false otherwise (json-bool) +- "tray_open": only present if removable, true if the device has a tray, + and it is open (json-bool) +- "inserted": only present if the device is inserted, it is a json-object + containing the following: + - "file": device file name (json-string) + - "ro": true if read-only, false otherwise (json-bool) + - "drv": driver format name (json-string) + - Possible values: "blkdebug", "bochs", "cloop", "dmg", + "file", "ftp", "ftps", "host_cdrom", + "host_device", "http", "https", + "nbd", "parallels", "qcow", "qcow2", "raw", + "vdi", "vmdk", "vpc", "vvfat" + - "backing_file": backing file name (json-string, optional) + - "backing_file_depth": number of files in the backing file chain (json-int) + - "encrypted": true if encrypted, false otherwise (json-bool) + - "bps": limit total bytes per second (json-int) + - "bps_rd": limit read bytes per second (json-int) + - "bps_wr": limit write bytes per second (json-int) + - "iops": limit total I/O operations per second (json-int) + - "iops_rd": limit read operations per second (json-int) + - "iops_wr": limit write operations per second (json-int) + - "bps_max": total max in bytes (json-int) + - "bps_rd_max": read max in bytes (json-int) + - "bps_wr_max": write max in bytes (json-int) + - "iops_max": total I/O operations max (json-int) + - "iops_rd_max": read I/O operations max (json-int) + - "iops_wr_max": write I/O operations max (json-int) + - "iops_size": I/O size when limiting by iops (json-int) + - "detect_zeroes": detect and optimize zero writing
(json-string) + - Possible values: "off", "on", "unmap" + - "write_threshold": write offset threshold in bytes, an event will be + emitted if crossed. Zero if disabled (json-int) + - "image": the detail of the image, it is a json-object containing + the following: + - "filename": image file name (json-string) + - "format": image format (json-string) + - "virtual-size": image capacity in bytes (json-int) + - "dirty-flag": true if image is not cleanly closed, not present + means clean (json-bool, optional) + - "actual-size": actual size on disk in bytes of the image, not + present when image does not support thin + provision (json-int, optional) + - "cluster-size": size of a cluster in bytes, not present if image + format does not support it (json-int, optional) + - "encrypted": true if the image is encrypted, not present means + false or the image format does not support + encryption (json-bool, optional) + - "backing_file": backing file name, not present means no backing + file is used or the image format does not + support backing file chain + (json-string, optional) + - "full-backing-filename": full path of the backing file, not + present if it equals backing_file or no + backing file is used + (json-string, optional) + - "backing-filename-format": the format of the backing file, not + present means unknown or no backing + file (json-string, optional) + - "snapshots": the internal snapshot info, it is an optional list + of json-object containing the following: + - "id": unique snapshot id (json-string) + - "name": snapshot name (json-string) + - "vm-state-size": size of the VM state in bytes (json-int) + - "date-sec": UTC date of the snapshot in seconds (json-int) + - "date-nsec": fractional part in nanoseconds to be used with + date-sec (json-int) + - "vm-clock-sec": VM clock relative to boot in seconds + (json-int) + - "vm-clock-nsec": fractional part in nanoseconds to be used + with vm-clock-sec (json-int) + - "backing-image": the detail of the backing image,
it is an + optional json-object only present when a + backing image present for this image + +- "io-status": I/O operation status, only present if the device supports it + and the VM is configured to stop on errors. It's always reset + to "ok" when the "cont" command is issued (json_string, optional) + - Possible values: "ok", "failed", "nospace" + +Example: + +-> { "execute": "query-block" } +<- { + "return":[ + { + "io-status": "ok", + "device":"ide0-hd0", + "locked":false, + "removable":false, + "inserted":{ + "ro":false, + "drv":"qcow2", + "encrypted":false, + "file":"disks/test.qcow2", + "backing_file_depth":1, + "bps":1000000, + "bps_rd":0, + "bps_wr":0, + "iops":1000000, + "iops_rd":0, + "iops_wr":0, + "bps_max": 8000000, + "bps_rd_max": 0, + "bps_wr_max": 0, + "iops_max": 0, + "iops_rd_max": 0, + "iops_wr_max": 0, + "iops_size": 0, + "detect_zeroes": "on", + "write_threshold": 0, + "image":{ + "filename":"disks/test.qcow2", + "format":"qcow2", + "virtual-size":2048000, + "backing_file":"base.qcow2", + "full-backing-filename":"disks/base.qcow2", + "backing-filename-format":"qcow2", + "snapshots":[ + { + "id": "1", + "name": "snapshot1", + "vm-state-size": 0, + "date-sec": 10000200, + "date-nsec": 12, + "vm-clock-sec": 206, + "vm-clock-nsec": 30 + } + ], + "backing-image":{ + "filename":"disks/base.qcow2", + "format":"qcow2", + "virtual-size":2048000 + } + } + }, + "type":"unknown" + }, + { + "io-status": "ok", + "device":"ide1-cd0", + "locked":false, + "removable":true, + "type":"unknown" + }, + { + "device":"floppy0", + "locked":false, + "removable":true, + "type":"unknown" + }, + { + "device":"sd0", + "locked":false, + "removable":true, + "type":"unknown" + } + ] + } + +query-blockstats +---------------- + +Show block device statistics. + +Each device statistic information is stored in a json-object and the returned +value is a json-array of all devices. 
+ +Each json-object contains the following: + +- "device": device name (json-string) +- "stats": A json-object with the statistics information, it contains: + - "rd_bytes": bytes read (json-int) + - "wr_bytes": bytes written (json-int) + - "rd_operations": read operations (json-int) + - "wr_operations": write operations (json-int) + - "flush_operations": cache flush operations (json-int) + - "wr_total_time_ns": total time spent on writes in nanoseconds (json-int) + - "rd_total_time_ns": total time spent on reads in nanoseconds (json-int) + - "flush_total_time_ns": total time spent on cache flushes in nanoseconds (json-int) + - "wr_highest_offset": The offset after the greatest byte written to the + BlockDriverState since it has been opened (json-int) + - "rd_merged": number of read requests that have been merged into + another request (json-int) + - "wr_merged": number of write requests that have been merged into + another request (json-int) + - "idle_time_ns": time since the last I/O operation, in + nanoseconds.
If the field is absent it means + that there haven't been any operations yet + (json-int, optional) + - "failed_rd_operations": number of failed read operations + (json-int) + - "failed_wr_operations": number of failed write operations + (json-int) + - "failed_flush_operations": number of failed flush operations + (json-int) + - "invalid_rd_operations": number of invalid read operations + (json-int) + - "invalid_wr_operations": number of invalid write operations + (json-int) + - "invalid_flush_operations": number of invalid flush operations + (json-int) + - "account_invalid": whether invalid operations are included in + the last access statistics (json-bool) + - "account_failed": whether failed operations are included in the + latency and last access statistics + (json-bool) + - "timed_stats": A json-array containing statistics collected in + specific intervals, with the following members: + - "interval_length": interval used for calculating the + statistics, in seconds (json-int) + - "min_rd_latency_ns": minimum latency of read operations in + the defined interval, in nanoseconds + (json-int) + - "min_wr_latency_ns": minimum latency of write operations in + the defined interval, in nanoseconds + (json-int) + - "min_flush_latency_ns": minimum latency of flush operations + in the defined interval, in + nanoseconds (json-int) + - "max_rd_latency_ns": maximum latency of read operations in + the defined interval, in nanoseconds + (json-int) + - "max_wr_latency_ns": maximum latency of write operations in + the defined interval, in nanoseconds + (json-int) + - "max_flush_latency_ns": maximum latency of flush operations + in the defined interval, in + nanoseconds (json-int) + - "avg_rd_latency_ns": average latency of read operations in + the defined interval, in nanoseconds + (json-int) + - "avg_wr_latency_ns": average latency of write operations in + the defined interval, in nanoseconds + (json-int) + - "avg_flush_latency_ns": average latency of flush operations + in the 
defined interval, in + nanoseconds (json-int) + - "avg_rd_queue_depth": average number of pending read + operations in the defined interval + (json-number) + - "avg_wr_queue_depth": average number of pending write + operations in the defined interval + (json-number). +- "parent": Contains recursively the statistics of the underlying + protocol (e.g. the host file for a qcow2 image). If there is + no underlying protocol, this field is omitted + (json-object, optional) + +Example: + +-> { "execute": "query-blockstats" } +<- { + "return":[ + { + "device":"ide0-hd0", + "parent":{ + "stats":{ + "wr_highest_offset":3686448128, + "wr_bytes":9786368, + "wr_operations":751, + "rd_bytes":122567168, + "rd_operations":36772, + "wr_total_time_ns":313253456, + "rd_total_time_ns":3465673657, + "flush_total_time_ns":49653, + "flush_operations":61, + "rd_merged":0, + "wr_merged":0, + "idle_time_ns":2953431879, + "account_invalid":true, + "account_failed":false + } + }, + "stats":{ + "wr_highest_offset":2821110784, + "wr_bytes":9786368, + "wr_operations":692, + "rd_bytes":122739200, + "rd_operations":36604, + "flush_operations":51, + "wr_total_time_ns":313253456, + "rd_total_time_ns":3465673657, + "flush_total_time_ns":49653, + "rd_merged":0, + "wr_merged":0, + "idle_time_ns":2953431879, + "account_invalid":true, + "account_failed":false + } + }, + { + "device":"ide1-cd0", + "stats":{ + "wr_highest_offset":0, + "wr_bytes":0, + "wr_operations":0, + "rd_bytes":0, + "rd_operations":0, + "flush_operations":0, + "wr_total_time_ns":0, + "rd_total_time_ns":0, + "flush_total_time_ns":0, + "rd_merged":0, + "wr_merged":0, + "account_invalid":false, + "account_failed":false + } + }, + { + "device":"floppy0", + "stats":{ + "wr_highest_offset":0, + "wr_bytes":0, + "wr_operations":0, + "rd_bytes":0, + "rd_operations":0, + "flush_operations":0, + "wr_total_time_ns":0, + "rd_total_time_ns":0, + "flush_total_time_ns":0, + "rd_merged":0, + "wr_merged":0, + "account_invalid":false, +
"account_failed":false + } + }, + { + "device":"sd0", + "stats":{ + "wr_highest_offset":0, + "wr_bytes":0, + "wr_operations":0, + "rd_bytes":0, + "rd_operations":0 + "flush_operations":0, + "wr_total_times_ns":0 + "rd_total_times_ns":0 + "flush_total_times_ns":0, + "rd_merged":0, + "wr_merged":0, + "account_invalid":false, + "account_failed":false + } + } + ] + } + +query-cpus +---------- + +Show CPU information. + +Return a json-array. Each CPU is represented by a json-object, which contains: + +- "CPU": CPU index (json-int) +- "current": true if this is the current CPU, false otherwise (json-bool) +- "halted": true if the cpu is halted, false otherwise (json-bool) +- "qom_path": path to the CPU object in the QOM tree (json-str) +- "arch": architecture of the cpu, which determines what additional + keys will be present (json-str) +- Current program counter. The key's name depends on the architecture: + "pc": i386/x86_64 (json-int) + "nip": PPC (json-int) + "pc" and "npc": sparc (json-int) + "PC": mips (json-int) +- "thread_id": ID of the underlying host thread (json-int) + +Example: + +-> { "execute": "query-cpus" } +<- { + "return":[ + { + "CPU":0, + "current":true, + "halted":false, + "qom_path":"/machine/unattached/device[0]", + "arch":"x86", + "pc":3227107138, + "thread_id":3134 + }, + { + "CPU":1, + "current":false, + "halted":true, + "qom_path":"/machine/unattached/device[2]", + "arch":"x86", + "pc":7108165, + "thread_id":3135 + } + ] + } + +query-iothreads +--------------- + +Returns a list of information about each iothread. + +Note this list excludes the QEMU main loop thread, which is not declared +using the -object iothread command-line option. It is always the main thread +of the process. + +Return a json-array. 
Each iothread is represented by a json-object, which contains: + +- "id": name of iothread (json-str) +- "thread-id": ID of the underlying host thread (json-int) + +Example: + +-> { "execute": "query-iothreads" } +<- { + "return":[ + { + "id":"iothread0", + "thread-id":3134 + }, + { + "id":"iothread1", + "thread-id":3135 + } + ] + } + +query-pci +--------- + +PCI buses and devices information. + +The returned value is a json-array of all buses. Each bus is represented by +a json-object, which has a key with a json-array of all PCI devices attached +to it. Each device is represented by a json-object. + +The bus json-object contains the following: + +- "bus": bus number (json-int) +- "devices": a json-array of json-objects, each json-object represents a + PCI device + +The PCI device json-object contains the following: + +- "bus": identical to the parent's bus number (json-int) +- "slot": slot number (json-int) +- "function": function number (json-int) +- "class_info": a json-object containing: + - "desc": device class description (json-string, optional) + - "class": device class number (json-int) +- "id": a json-object containing: + - "device": device ID (json-int) + - "vendor": vendor ID (json-int) +- "irq": device's IRQ if assigned (json-int, optional) +- "qdev_id": qdev id string (json-string) +- "pci_bridge": It's a json-object, only present if this device is a + PCI bridge, contains: + - "bus": bus number (json-int) + - "secondary": secondary bus number (json-int) + - "subordinate": subordinate bus number (json-int) + - "io_range": I/O memory range information, a json-object with the + following members: + - "base": base address, in bytes (json-int) + - "limit": limit address, in bytes (json-int) + - "memory_range": memory range information, a json-object with the + following members: + - "base": base address, in bytes (json-int) + - "limit": limit address, in bytes (json-int) + - "prefetchable_range": Prefetchable memory range information, a + json-object with 
the following members: + - "base": base address, in bytes (json-int) + - "limit": limit address, in bytes (json-int) + - "devices": a json-array of PCI devices if there's any attached, each + each element is represented by a json-object, which contains + the same members of the 'PCI device json-object' described + above (optional) +- "regions": a json-array of json-objects, each json-object represents a + memory region of this device + +The memory range json-object contains the following: + +- "base": base memory address (json-int) +- "limit": limit value (json-int) + +The region json-object can be an I/O region or a memory region, an I/O region +json-object contains the following: + +- "type": "io" (json-string, fixed) +- "bar": BAR number (json-int) +- "address": memory address (json-int) +- "size": memory size (json-int) + +A memory region json-object contains the following: + +- "type": "memory" (json-string, fixed) +- "bar": BAR number (json-int) +- "address": memory address (json-int) +- "size": memory size (json-int) +- "mem_type_64": true or false (json-bool) +- "prefetch": true or false (json-bool) + +Example: + +-> { "execute": "query-pci" } +<- { + "return":[ + { + "bus":0, + "devices":[ + { + "bus":0, + "qdev_id":"", + "slot":0, + "class_info":{ + "class":1536, + "desc":"Host bridge" + }, + "id":{ + "device":32902, + "vendor":4663 + }, + "function":0, + "regions":[ + + ] + }, + { + "bus":0, + "qdev_id":"", + "slot":1, + "class_info":{ + "class":1537, + "desc":"ISA bridge" + }, + "id":{ + "device":32902, + "vendor":28672 + }, + "function":0, + "regions":[ + + ] + }, + { + "bus":0, + "qdev_id":"", + "slot":1, + "class_info":{ + "class":257, + "desc":"IDE controller" + }, + "id":{ + "device":32902, + "vendor":28688 + }, + "function":1, + "regions":[ + { + "bar":4, + "size":16, + "address":49152, + "type":"io" + } + ] + }, + { + "bus":0, + "qdev_id":"", + "slot":2, + "class_info":{ + "class":768, + "desc":"VGA controller" + }, + "id":{ + "device":4115, + 
"vendor":184 + }, + "function":0, + "regions":[ + { + "prefetch":true, + "mem_type_64":false, + "bar":0, + "size":33554432, + "address":4026531840, + "type":"memory" + }, + { + "prefetch":false, + "mem_type_64":false, + "bar":1, + "size":4096, + "address":4060086272, + "type":"memory" + }, + { + "prefetch":false, + "mem_type_64":false, + "bar":6, + "size":65536, + "address":-1, + "type":"memory" + } + ] + }, + { + "bus":0, + "qdev_id":"", + "irq":11, + "slot":4, + "class_info":{ + "class":1280, + "desc":"RAM controller" + }, + "id":{ + "device":6900, + "vendor":4098 + }, + "function":0, + "regions":[ + { + "bar":0, + "size":32, + "address":49280, + "type":"io" + } + ] + } + ] + } + ] + } + +Note: This example has been shortened as the real response is too long. + +query-kvm +--------- + +Show KVM information. + +Return a json-object with the following information: + +- "enabled": true if KVM support is enabled, false otherwise (json-bool) +- "present": true if QEMU has KVM support, false otherwise (json-bool) + +Example: + +-> { "execute": "query-kvm" } +<- { "return": { "enabled": true, "present": true } } + +query-status +------------ + +Return a json-object with the following information: + +- "running": true if the VM is running, or false if it is paused (json-bool) +- "singlestep": true if the VM is in single step mode, + false otherwise (json-bool) +- "status": one of the following values (json-string) + "debug" - QEMU is running on a debugger + "inmigrate" - guest is paused waiting for an incoming migration + "internal-error" - An internal error that prevents further guest + execution has occurred + "io-error" - the last IOP has failed and the device is configured + to pause on I/O errors + "paused" - guest has been paused via the 'stop' command + "postmigrate" - guest is paused following a successful 'migrate' + "prelaunch" - QEMU was started with -S and guest has not started + "finish-migrate" - guest is paused to finish the migration process + 
"restore-vm" - guest is paused to restore VM state + "running" - guest is actively running + "save-vm" - guest is paused to save the VM state + "shutdown" - guest is shut down (and -no-shutdown is in use) + "watchdog" - the watchdog action is configured to pause and + has been triggered + +Example: + +-> { "execute": "query-status" } +<- { "return": { "running": true, "singlestep": false, "status": "running" } } + +query-mice +---------- + +Show VM mice information. + +Each mouse is represented by a json-object, the returned value is a json-array +of all mice. + +The mouse json-object contains the following: + +- "name": mouse's name (json-string) +- "index": mouse's index (json-int) +- "current": true if this mouse is receiving events, false otherwise (json-bool) +- "absolute": true if the mouse generates absolute input events (json-bool) + +Example: + +-> { "execute": "query-mice" } +<- { + "return":[ + { + "name":"QEMU Microsoft Mouse", + "index":0, + "current":false, + "absolute":false + }, + { + "name":"QEMU PS/2 Mouse", + "index":1, + "current":true, + "absolute":true + } + ] + } + +query-vnc +--------- + +Show VNC server information. + +Return a json-object with server information. Connected clients are returned +as a json-array of json-objects. 
+ +The main json-object contains the following: + +- "enabled": true or false (json-bool) +- "host": server's IP address (json-string) +- "family": address family (json-string) + - Possible values: "ipv4", "ipv6", "unix", "unknown" +- "service": server's port number (json-string) +- "auth": authentication method (json-string) + - Possible values: "invalid", "none", "ra2", "ra2ne", "sasl", "tight", + "tls", "ultra", "unknown", "vencrypt", + "vencrypt+plain", "vencrypt+tls+none", + "vencrypt+tls+plain", "vencrypt+tls+sasl", + "vencrypt+tls+vnc", "vencrypt+x509+none", + "vencrypt+x509+plain", "vencrypt+x509+sasl", + "vencrypt+x509+vnc", "vnc" +- "clients": a json-array of all connected clients + +Clients are described by a json-object, each one contains the following: + +- "host": client's IP address (json-string) +- "family": address family (json-string) + - Possible values: "ipv4", "ipv6", "unix", "unknown" +- "service": client's port number (json-string) +- "x509_dname": TLS dname (json-string, optional) +- "sasl_username": SASL username (json-string, optional) + +Example: + +-> { "execute": "query-vnc" } +<- { + "return":{ + "enabled":true, + "host":"0.0.0.0", + "service":"50402", + "auth":"vnc", + "family":"ipv4", + "clients":[ + { + "host":"127.0.0.1", + "service":"50401", + "family":"ipv4" + } + ] + } + } + +query-spice +----------- + +Show SPICE server information. + +Return a json-object with server information. Connected clients are returned +as a json-array of json-objects.
+ +The main json-object contains the following: + +- "enabled": true or false (json-bool) +- "host": server's IP address (json-string) +- "port": server's port number (json-int, optional) +- "tls-port": server's TLS port number (json-int, optional) +- "auth": authentication method (json-string) + - Possible values: "none", "spice" +- "channels": a json-array of all active client channels + +Each channel is described by a json-object that contains the following: + +- "host": client's IP address (json-string) +- "family": address family (json-string) + - Possible values: "ipv4", "ipv6", "unix", "unknown" +- "port": client's port number (json-string) +- "connection-id": spice connection id. All channels with the same id + belong to the same spice session (json-int) +- "channel-type": channel type. "1" is the main control channel, filter for + this one if you want to track spice sessions only (json-int) +- "channel-id": channel id. Usually "0", might be different when + multiple channels of the same type exist, such as multiple + display channels in a multihead setup (json-int) +- "tls": whether the channel is encrypted (json-bool) + +Example: + +-> { "execute": "query-spice" } +<- { + "return": { + "enabled": true, + "auth": "spice", + "port": 5920, + "tls-port": 5921, + "host": "0.0.0.0", + "channels": [ + { + "port": "54924", + "family": "ipv4", + "channel-type": 1, + "connection-id": 1804289383, + "host": "127.0.0.1", + "channel-id": 0, + "tls": true + }, + { + "port": "36710", + "family": "ipv4", + "channel-type": 4, + "connection-id": 1804289383, + "host": "127.0.0.1", + "channel-id": 0, + "tls": false + }, + [ ... more channels follow ... ] + ] + } + } + +query-name +---------- + +Show VM name. + +Return a json-object with the following information: + +- "name": VM's name (json-string, optional) + +Example: + +-> { "execute": "query-name" } +<- { "return": { "name": "qemu-name" } } + +query-uuid +---------- + +Show VM UUID.
+ +Return a json-object with the following information: + +- "UUID": Universally Unique Identifier (json-string) + +Example: + +-> { "execute": "query-uuid" } +<- { "return": { "UUID": "550e8400-e29b-41d4-a716-446655440000" } } + +query-command-line-options +-------------------------- + +Show command line option schema. + +Return a json-array of command line option schema for all options (or for +the given option), returning an error if the given option doesn't exist. + +Each array entry contains the following: + +- "option": option name (json-string) +- "parameters": a json-array describing all parameters of the option: + - "name": parameter name (json-string) + - "type": parameter type (one of 'string', 'boolean', 'number', + or 'size') + - "help": human readable description of the parameter + (json-string, optional) + - "default": default value string for the parameter + (json-string, optional) + +Example: + +-> { "execute": "query-command-line-options", "arguments": { "option": "option-rom" } } +<- { "return": [ + { + "parameters": [ + { + "name": "romfile", + "type": "string" + }, + { + "name": "bootindex", + "type": "number" + } + ], + "option": "option-rom" + } + ] + } + +query-migrate +------------- + +Migration status. + +Return a json-object. If migration is active there will be another json-object +with RAM migration status and if block migration is active another one with +block migration status. + +The main json-object contains the following: + +- "status": migration status (json-string) + - Possible values: "setup", "active", "completed", "failed", "cancelled" +- "total-time": total amount of ms since migration started. If + migration has ended, it returns the total migration + time (json-int) +- "setup-time": amount of setup time in milliseconds _before_ the + iterations begin but _after_ the QMP command is issued.
+ This is designed to provide an accounting of any activities + (such as RDMA pinning) which may be expensive, but do not + actually occur during the iterative migration rounds + themselves. (json-int) +- "downtime": only present when migration has finished correctly; + total amount in ms for downtime that happened (json-int) +- "expected-downtime": only present while migration is active; + total amount in ms for downtime that was calculated on + the last bitmap round (json-int) +- "ram": only present if "status" is "active", it is a json-object with the + following RAM information: + - "transferred": amount transferred in bytes (json-int) + - "remaining": amount remaining to transfer in bytes (json-int) + - "total": total amount of memory in bytes (json-int) + - "duplicate": number of pages filled entirely with the same + byte (json-int) + These are sent over the wire much more efficiently. + - "skipped": number of skipped zero pages (json-int) + - "normal" : number of whole pages transferred. I.e. they + were not sent as duplicate or xbzrle pages (json-int) + - "normal-bytes" : number of bytes transferred in whole + pages. This is just normal pages times size of one page, + but this way upper levels don't need to care about page + size (json-int) + - "dirty-sync-count": number of times dirty RAM was synchronized (json-int) +- "disk": only present if "status" is "active" and it is a block migration, + it is a json-object with the following disk information: + - "transferred": amount transferred in bytes (json-int) + - "remaining": amount remaining to transfer in bytes (json-int) + - "total": total disk size in bytes (json-int) +- "xbzrle-cache": only present if XBZRLE is active.
+ It is a json-object with the following XBZRLE information: + - "cache-size": XBZRLE cache size in bytes + - "bytes": number of bytes transferred for XBZRLE compressed pages + - "pages": number of XBZRLE compressed pages + - "cache-miss": number of XBZRLE page cache misses + - "cache-miss-rate": rate of XBZRLE page cache misses + - "overflow": number of times XBZRLE overflows. This means + that the XBZRLE encoding was bigger than the page + itself, so the whole page was sent instead (as a + normal page). + +Examples: + +1. Before the first migration + +-> { "execute": "query-migrate" } +<- { "return": {} } + +2. Migration is done and has succeeded + +-> { "execute": "query-migrate" } +<- { "return": { + "status": "completed", + "ram":{ + "transferred":123, + "remaining":123, + "total":246, + "total-time":12345, + "setup-time":12345, + "downtime":12345, + "duplicate":123, + "normal":123, + "normal-bytes":123456, + "dirty-sync-count":15 + } + } + } + +3. Migration is done and has failed + +-> { "execute": "query-migrate" } +<- { "return": { "status": "failed" } } + +4. Migration is being performed and is not a block migration: + +-> { "execute": "query-migrate" } +<- { + "return":{ + "status":"active", + "ram":{ + "transferred":123, + "remaining":123, + "total":246, + "total-time":12345, + "setup-time":12345, + "expected-downtime":12345, + "duplicate":123, + "normal":123, + "normal-bytes":123456, + "dirty-sync-count":15 + } + } + } + +5. Migration is being performed and is a block migration: + +-> { "execute": "query-migrate" } +<- { + "return":{ + "status":"active", + "ram":{ + "total":1057024, + "remaining":1053304, + "transferred":3720, + "total-time":12345, + "setup-time":12345, + "expected-downtime":12345, + "duplicate":123, + "normal":123, + "normal-bytes":123456, + "dirty-sync-count":15 + }, + "disk":{ + "total":20971520, + "remaining":20880384, + "transferred":91136 + } + } + } + +6.
Migration is being performed and XBZRLE is active: + +-> { "execute": "query-migrate" } +<- { + "return":{ + "status":"active", + "capabilities" : [ { "capability": "xbzrle", "state" : true } ], + "ram":{ + "total":1057024, + "remaining":1053304, + "transferred":3720, + "total-time":12345, + "setup-time":12345, + "expected-downtime":12345, + "duplicate":10, + "normal":3333, + "normal-bytes":3412992, + "dirty-sync-count":15 + }, + "xbzrle-cache":{ + "cache-size":67108864, + "bytes":20971520, + "pages":2444343, + "cache-miss":2244, + "cache-miss-rate":0.123, + "overflow":34434 + } + } + } + +migrate-set-capabilities +------------------------ + +Enable/Disable migration capabilities + +- "xbzrle": XBZRLE support +- "rdma-pin-all": pin all pages when using RDMA during migration +- "auto-converge": throttle down guest to help convergence of migration +- "zero-blocks": compress zero blocks during block migration +- "compress": use multiple compression threads to accelerate live migration +- "events": generate events for each migration state change +- "postcopy-ram": postcopy mode for live migration +- "x-colo": COarse-Grain LOck Stepping (COLO) for Non-stop Service + +Arguments: + +Example: + +-> { "execute": "migrate-set-capabilities" , "arguments": + { "capabilities": [ { "capability": "xbzrle", "state": true } ] } } + +query-migrate-capabilities +-------------------------- + +Query current migration capabilities + +- "capabilities": migration capabilities state + - "xbzrle" : XBZRLE state (json-bool) + - "rdma-pin-all" : RDMA Pin Page state (json-bool) + - "auto-converge" : Auto Converge state (json-bool) + - "zero-blocks" : Zero Blocks state (json-bool) + - "compress": Multiple compression threads state (json-bool) + - "events": Migration state change event state (json-bool) + - "postcopy-ram": postcopy ram state (json-bool) + - "x-colo": COarse-Grain LOck Stepping for Non-stop Service (json-bool) + +Arguments: + +Example: + +-> { "execute": 
"query-migrate-capabilities" } +<- {"return": [ + {"state": false, "capability": "xbzrle"}, + {"state": false, "capability": "rdma-pin-all"}, + {"state": false, "capability": "auto-converge"}, + {"state": false, "capability": "zero-blocks"}, + {"state": false, "capability": "compress"}, + {"state": true, "capability": "events"}, + {"state": false, "capability": "postcopy-ram"}, + {"state": false, "capability": "x-colo"} + ]} + +migrate-set-parameters +---------------------- + +Set migration parameters + +- "compress-level": set compression level during migration (json-int) +- "compress-threads": set compression thread count for migration (json-int) +- "decompress-threads": set decompression thread count for migration (json-int) +- "cpu-throttle-initial": set initial percentage of time guest cpus are + throttled for auto-converge (json-int) +- "cpu-throttle-increment": set throttle increasing percentage for + auto-converge (json-int) +- "max-bandwidth": set maximum speed for migrations (in bytes/sec) (json-int) +- "downtime-limit": set maximum tolerated downtime (in milliseconds) for + migrations (json-int) +- "x-checkpoint-delay": set the delay time for periodic checkpoint (json-int) + +Arguments: + +Example: + +-> { "execute": "migrate-set-parameters" , "arguments": + { "compress-level": 1 } } + +query-migrate-parameters +------------------------ + +Query current migration parameters + +- "parameters": migration parameters value + - "compress-level" : compression level value (json-int) + - "compress-threads" : compression thread count value (json-int) + - "decompress-threads" : decompression thread count value (json-int) + - "cpu-throttle-initial" : initial percentage of time guest cpus are + throttled (json-int) + - "cpu-throttle-increment" : throttle increasing percentage for + auto-converge (json-int) + - "max-bandwidth" : maximium migration speed in bytes per second + (json-int) + - "downtime-limit" : maximum tolerated downtime of migration in + milliseconds 
(json-int) +Arguments: + +Example: + +-> { "execute": "query-migrate-parameters" } +<- { + "return": { + "decompress-threads": 2, + "cpu-throttle-increment": 10, + "compress-threads": 8, + "compress-level": 1, + "cpu-throttle-initial": 20, + "max-bandwidth": 33554432, + "downtime-limit": 300 + } + } + +query-balloon +------------- + +Show balloon information. + +Make an asynchronous request for balloon info. When the request completes, a +json-object will be returned containing the following data: + +- "actual": current balloon value in bytes (json-int) + +Example: + +-> { "execute": "query-balloon" } +<- { + "return":{ + "actual":1073741824 + } + } + +query-tpm +--------- + +Return information about the TPM device. + +Arguments: None + +Example: + +-> { "execute": "query-tpm" } +<- { "return": + [ + { "model": "tpm-tis", + "options": + { "type": "passthrough", + "data": + { "cancel-path": "/sys/class/misc/tpm0/device/cancel", + "path": "/dev/tpm0" + } + }, + "id": "tpm0" + } + ] + } + +query-tpm-models +---------------- + +Return a list of supported TPM models. + +Arguments: None + +Example: + +-> { "execute": "query-tpm-models" } +<- { "return": [ "tpm-tis" ] } + +query-tpm-types +--------------- + +Return a list of supported TPM types. + +Arguments: None + +Example: + +-> { "execute": "query-tpm-types" } +<- { "return": [ "passthrough" ] } + +chardev-add +----------- + +Add a chardev.
+ +Arguments: + +- "id": the chardev's ID, must be unique (json-string) +- "backend": chardev backend type and parameters + +Examples: + +-> { "execute" : "chardev-add", + "arguments" : { "id" : "foo", + "backend" : { "type" : "null", "data" : {} } } } +<- { "return": {} } + +-> { "execute" : "chardev-add", + "arguments" : { "id" : "bar", + "backend" : { "type" : "file", + "data" : { "out" : "/tmp/bar.log" } } } } +<- { "return": {} } + +-> { "execute" : "chardev-add", + "arguments" : { "id" : "baz", + "backend" : { "type" : "pty", "data" : {} } } } +<- { "return": { "pty" : "/dev/pty/42" } } + +chardev-remove +-------------- + +Remove a chardev. + +Arguments: + +- "id": the chardev's ID, must exist and not be in use (json-string) + +Example: + +-> { "execute": "chardev-remove", "arguments": { "id" : "foo" } } +<- { "return": {} } + +query-rx-filter +--------------- + +Show rx-filter information. + +Returns a json-array of rx-filter information for all NICs (or for the +given NIC), returning an error if the given NIC doesn't exist, the +given NIC doesn't support rx-filter querying, or the given net client +isn't a NIC. + +The query will clear the event notification flag of each NIC, then qemu +will start to emit events to the QMP monitor.
+ +Each array entry contains the following: + +- "name": net client name (json-string) +- "promiscuous": promiscuous mode is enabled (json-bool) +- "multicast": multicast receive state (one of 'normal', 'none', 'all') +- "unicast": unicast receive state (one of 'normal', 'none', 'all') +- "vlan": vlan receive state (one of 'normal', 'none', 'all') (Since 2.0) +- "broadcast-allowed": allow to receive broadcast (json-bool) +- "multicast-overflow": multicast table is overflowed (json-bool) +- "unicast-overflow": unicast table is overflowed (json-bool) +- "main-mac": main macaddr string (json-string) +- "vlan-table": a json-array of active vlan id +- "unicast-table": a json-array of unicast macaddr string +- "multicast-table": a json-array of multicast macaddr string + +Example: + +-> { "execute": "query-rx-filter", "arguments": { "name": "vnet0" } } +<- { "return": [ + { + "promiscuous": true, + "name": "vnet0", + "main-mac": "52:54:00:12:34:56", + "unicast": "normal", + "vlan": "normal", + "vlan-table": [ + 4, + 0 + ], + "unicast-table": [ + ], + "multicast": "normal", + "multicast-overflow": false, + "unicast-overflow": false, + "multicast-table": [ + "01:00:5e:00:00:01", + "33:33:00:00:00:01", + "33:33:ff:12:34:56" + ], + "broadcast-allowed": false + } + ] + } + +blockdev-add +------------ + +Add a block device. + +This command is still a work in progress. It doesn't support all +block drivers among other things. Stay away from it unless you want +to help with its development. + +For the arguments, see the QAPI schema documentation of BlockdevOptions. 
+ +Example (1): + +-> { "execute": "blockdev-add", + "arguments": { "driver": "qcow2", + "file": { "driver": "file", + "filename": "test.qcow2" } } } +<- { "return": {} } + +Example (2): + +-> { "execute": "blockdev-add", + "arguments": { + "driver": "qcow2", + "node-name": "my_disk", + "discard": "unmap", + "cache": { + "direct": true, + "writeback": true + }, + "file": { + "driver": "file", + "filename": "/tmp/test.qcow2" + }, + "backing": { + "driver": "raw", + "file": { + "driver": "file", + "filename": "/dev/fdset/4" + } + } + } + } + +<- { "return": {} } + +x-blockdev-del +------------ +Since 2.5 + +Deletes a block device that has been added using blockdev-add. +The command will fail if the node is attached to a device or is +otherwise being used. + +This command is still a work in progress and is considered +experimental. Stay away from it unless you want to help with its +development. + +Arguments: + +- "node-name": Name of the graph node to delete (json-string) + +Example: + +-> { "execute": "blockdev-add", + "arguments": { + "driver": "qcow2", + "node-name": "node0", + "file": { + "driver": "file", + "filename": "test.qcow2" + } + } + } + +<- { "return": {} } + +-> { "execute": "x-blockdev-del", + "arguments": { "node-name": "node0" } + } +<- { "return": {} } + +blockdev-open-tray +------------------ + +Opens a block device's tray. If there is a block driver state tree inserted as a +medium, it will become inaccessible to the guest (but it will remain associated +to the block device, so closing the tray will make it accessible again). + +If the tray was already open before, this will be a no-op. + +Once the tray opens, a DEVICE_TRAY_MOVED event is emitted. 
There are cases in +which no such event will be generated, these include: +- if the guest has locked the tray, @force is false and the guest does not + respond to the eject request +- if the BlockBackend denoted by @device does not have a guest device attached + to it +- if the guest device does not have an actual tray and is empty, for instance + for floppy disk drives + +Arguments: + +- "device": block device name (deprecated, use @id instead) + (json-string, optional) +- "id": the name or QOM path of the guest device (json-string, optional) +- "force": if false (the default), an eject request will be sent to the guest if + it has locked the tray (and the tray will not be opened immediately); + if true, the tray will be opened regardless of whether it is locked + (json-bool, optional) + +Example: + +-> { "execute": "blockdev-open-tray", + "arguments": { "id": "ide0-1-0" } } + +<- { "timestamp": { "seconds": 1418751016, + "microseconds": 716996 }, + "event": "DEVICE_TRAY_MOVED", + "data": { "device": "ide1-cd0", + "id": "ide0-1-0", + "tray-open": true } } + +<- { "return": {} } + +blockdev-close-tray +------------------- + +Closes a block device's tray. If there is a block driver state tree associated +with the block device (which is currently ejected), that tree will be loaded as +the medium. + +If the tray was already closed before, this will be a no-op. + +Arguments: + +- "device": block device name (deprecated, use @id instead) + (json-string, optional) +- "id": the name or QOM path of the guest device (json-string, optional) + +Example: + +-> { "execute": "blockdev-close-tray", + "arguments": { "id": "ide0-1-0" } } + +<- { "timestamp": { "seconds": 1418751345, + "microseconds": 272147 }, + "event": "DEVICE_TRAY_MOVED", + "data": { "device": "ide1-cd0", + "id": "ide0-1-0", + "tray-open": false } } + +<- { "return": {} } + +x-blockdev-remove-medium +------------------------ + +Removes a medium (a block driver state tree) from a block device. 
That block +device's tray must currently be open (unless there is no attached guest device). + +If the tray is open and there is no medium inserted, this will be a no-op. + +This command is still a work in progress and is considered experimental. +Stay away from it unless you want to help with its development. + +Arguments: + +- "device": block device name (deprecated, use @id instead) + (json-string, optional) +- "id": the name or QOM path of the guest device (json-string, optional) + +Example: + +-> { "execute": "x-blockdev-remove-medium", + "arguments": { "id": "ide0-1-0" } } + +<- { "error": { "class": "GenericError", + "desc": "Tray of device 'ide0-1-0' is not open" } } + +-> { "execute": "blockdev-open-tray", + "arguments": { "id": "ide0-1-0" } } + +<- { "timestamp": { "seconds": 1418751627, + "microseconds": 549958 }, + "event": "DEVICE_TRAY_MOVED", + "data": { "device": "ide1-cd0", + "id": "ide0-1-0", + "tray-open": true } } + +<- { "return": {} } + +-> { "execute": "x-blockdev-remove-medium", + "arguments": { "device": "ide0-1-0" } } + +<- { "return": {} } + +x-blockdev-insert-medium +------------------------ + +Inserts a medium (a block driver state tree) into a block device. That block +device's tray must currently be open (unless there is no attached guest device) +and there must be no medium inserted already. + +This command is still a work in progress and is considered experimental. +Stay away from it unless you want to help with its development. 
+ +Arguments: + +- "device": block device name (deprecated, use @id instead) + (json-string, optional) +- "id": the name or QOM path of the guest device (json-string, optional) +- "node-name": root node of the BDS tree to insert into the block device + +Example: + +-> { "execute": "blockdev-add", + "arguments": { "node-name": "node0", + "driver": "raw", + "file": { "driver": "file", + "filename": "fedora.iso" } } } + +<- { "return": {} } + +-> { "execute": "x-blockdev-insert-medium", + "arguments": { "id": "ide0-1-0", + "node-name": "node0" } } + +<- { "return": {} } + +x-blockdev-change +----------------- + +Dynamically reconfigure the block driver state graph. It can be used +to add, remove, insert or replace a graph node. Currently only the +Quorum driver implements this feature to add or remove its child. This +is useful to fix a broken quorum child. + +If @node is specified, it will be inserted under @parent. @child +may not be specified in this case. If both @parent and @child are +specified but @node is not, @child will be detached from @parent. + +Arguments: +- "parent": the id or name of the parent node (json-string) +- "child": the name of a child under the given parent node (json-string, optional) +- "node": the name of the node that will be added (json-string, optional) + +Note: this command is experimental, and not a stable API. It doesn't +support all kinds of operations, all kinds of children, nor all block +drivers. + +Warning: The data in a new quorum child MUST be consistent with that of +the rest of the array.
+ +Example: + +Add a new node to a quorum +-> { "execute": "blockdev-add", + "arguments": { "driver": "raw", + "node-name": "new_node", + "file": { "driver": "file", + "filename": "test.raw" } } } +<- { "return": {} } +-> { "execute": "x-blockdev-change", + "arguments": { "parent": "disk1", + "node": "new_node" } } +<- { "return": {} } + +Delete a quorum's node +-> { "execute": "x-blockdev-change", + "arguments": { "parent": "disk1", + "child": "children.1" } } +<- { "return": {} } + +query-named-block-nodes +----------------------- + +Return a list of BlockDeviceInfo for all the named block driver nodes + +Example: + +-> { "execute": "query-named-block-nodes" } +<- { "return": [ { "ro":false, + "drv":"qcow2", + "encrypted":false, + "file":"disks/test.qcow2", + "node-name": "my-node", + "backing_file_depth":1, + "bps":1000000, + "bps_rd":0, + "bps_wr":0, + "iops":1000000, + "iops_rd":0, + "iops_wr":0, + "bps_max": 8000000, + "bps_rd_max": 0, + "bps_wr_max": 0, + "iops_max": 0, + "iops_rd_max": 0, + "iops_wr_max": 0, + "iops_size": 0, + "write_threshold": 0, + "image":{ + "filename":"disks/test.qcow2", + "format":"qcow2", + "virtual-size":2048000, + "backing_file":"base.qcow2", + "full-backing-filename":"disks/base.qcow2", + "backing-filename-format":"qcow2", + "snapshots":[ + { + "id": "1", + "name": "snapshot1", + "vm-state-size": 0, + "date-sec": 10000200, + "date-nsec": 12, + "vm-clock-sec": 206, + "vm-clock-nsec": 30 + } + ], + "backing-image":{ + "filename":"disks/base.qcow2", + "format":"qcow2", + "virtual-size":2048000 + } + } } ] } + +blockdev-change-medium +---------------------- + +Changes the medium inserted into a block device by ejecting the current medium +and loading a new image file which is inserted as the new medium. 
+ +Arguments: + +- "device": block device name (deprecated, use @id instead) + (json-string, optional) +- "id": the name or QOM path of the guest device (json-string, optional) +- "filename": filename of the new image (json-string) +- "format": format of the new image (json-string, optional) +- "read-only-mode": new read-only mode (json-string, optional) + - Possible values: "retain" (default), "read-only", "read-write" + +Examples: + +1. Change a removable medium + +-> { "execute": "blockdev-change-medium", + "arguments": { "id": "ide0-1-0", + "filename": "/srv/images/Fedora-12-x86_64-DVD.iso", + "format": "raw" } } +<- { "return": {} } + +2. Load a read-only medium into a writable drive + +-> { "execute": "blockdev-change-medium", + "arguments": { "id": "floppyA", + "filename": "/srv/images/ro.img", + "format": "raw", + "read-only-mode": "retain" } } + +<- { "error": + { "class": "GenericError", + "desc": "Could not open '/srv/images/ro.img': Permission denied" } } + +-> { "execute": "blockdev-change-medium", + "arguments": { "id": "floppyA", + "filename": "/srv/images/ro.img", + "format": "raw", + "read-only-mode": "read-only" } } + +<- { "return": {} } + +query-memdev +------------ + +Show memory devices information. + + +Example (1): + +-> { "execute": "query-memdev" } +<- { "return": [ + { + "size": 536870912, + "merge": false, + "dump": true, + "prealloc": false, + "host-nodes": [0, 1], + "policy": "bind" + }, + { + "size": 536870912, + "merge": false, + "dump": true, + "prealloc": true, + "host-nodes": [2, 3], + "policy": "preferred" + } + ] + } + +query-memory-devices +-------------------- + +Return a list of memory devices. 
+ +Example: +-> { "execute": "query-memory-devices" } +<- { "return": [ { "data": + { "addr": 5368709120, + "hotpluggable": true, + "hotplugged": true, + "id": "d1", + "memdev": "/objects/memX", + "node": 0, + "size": 1073741824, + "slot": 0}, + "type": "dimm" + } ] } + +query-acpi-ospm-status +---------------------- + +Return list of ACPIOSTInfo for devices that support status reporting +via ACPI _OST method. + +Example: +-> { "execute": "query-acpi-ospm-status" } +<- { "return": [ { "device": "d1", "slot": "0", "slot-type": "DIMM", "source": 1, "status": 0}, + { "slot": "1", "slot-type": "DIMM", "source": 0, "status": 0}, + { "slot": "2", "slot-type": "DIMM", "source": 0, "status": 0}, + { "slot": "3", "slot-type": "DIMM", "source": 0, "status": 0} + ]} + +rtc-reset-reinjection +--------------------- + +Reset the RTC interrupt reinjection backlog. + +Arguments: None. + +Example: + +-> { "execute": "rtc-reset-reinjection" } +<- { "return": {} } + +trace-event-get-state +--------------------- + +Query the state of events. + +Arguments: + +- "name": Event name pattern (json-string). +- "vcpu": The vCPU to query, any vCPU by default (json-int, optional). + +An event is returned if: +- its name matches the "name" pattern, and +- if "vcpu" is given, the event has the "vcpu" property. + +Therefore, if "vcpu" is given, the operation will only match per-vCPU events, +returning their state on the specified vCPU. Special case: if "name" is an exact +match, "vcpu" is given and the event does not have the "vcpu" property, an error +is returned. + +Example: + +-> { "execute": "trace-event-get-state", "arguments": { "name": "qemu_memalign" } } +<- { "return": [ { "name": "qemu_memalign", "state": "disabled" } ] } + +trace-event-set-state +--------------------- + +Set the state of events. + +Arguments: + +- "name": Event name pattern (json-string). +- "enable": Whether to enable or disable the event (json-bool). 
+- "ignore-unavailable": Whether to ignore errors for events that cannot be + changed (json-bool, optional). +- "vcpu": The vCPU to act upon, all vCPUs by default (json-int, optional). + +An event's state is modified if: +- its name matches the "name" pattern, and +- if "vcpu" is given, the event has the "vcpu" property. + +Therefore, if "vcpu" is given, the operation will only match per-vCPU events, +setting their state on the specified vCPU. Special case: if "name" is an exact +match, "vcpu" is given and the event does not have the "vcpu" property, an error +is returned. + +Example: + +-> { "execute": "trace-event-set-state", "arguments": { "name": "qemu_memalign", "enable": true } } +<- { "return": {} } + +input-send-event +---------------- + +Send input event to guest. + +Arguments: + +- "device": display device (json-string, optional) +- "head": display head (json-int, optional) +- "events": list of input events + +The consoles are visible in the qom tree, under +/backend/console[$index]. They have a device link and head property, so +it is possible to map which console belongs to which device and display. + +Example (1): + +Press left mouse button. + +-> { "execute": "input-send-event", + "arguments": { "device": "video0", + "events": [ { "type": "btn", + "data" : { "down": true, "button": "left" } } ] } } +<- { "return": {} } + +-> { "execute": "input-send-event", + "arguments": { "device": "video0", + "events": [ { "type": "btn", + "data" : { "down": false, "button": "left" } } ] } } +<- { "return": {} } + +Example (2): + +Press ctrl-alt-del.
+ +-> { "execute": "input-send-event", + "arguments": { "events": [ + { "type": "key", "data" : { "down": true, + "key": {"type": "qcode", "data": "ctrl" } } }, + { "type": "key", "data" : { "down": true, + "key": {"type": "qcode", "data": "alt" } } }, + { "type": "key", "data" : { "down": true, + "key": {"type": "qcode", "data": "delete" } } } ] } } +<- { "return": {} } + +Example (3): + +Move mouse pointer to absolute coordinates (20000, 400). + +-> { "execute": "input-send-event" , + "arguments": { "events": [ + { "type": "abs", "data" : { "axis": "x", "value" : 20000 } }, + { "type": "abs", "data" : { "axis": "y", "value" : 400 } } ] } } +<- { "return": {} } + +block-set-write-threshold +------------ + +Change the write threshold for a block drive. The threshold is an offset, +thus must be non-negative. Default is no write threshold. +Setting the threshold to zero disables it. + +Arguments: + +- "node-name": the node name in the block driver state graph (json-string) +- "write-threshold": the write threshold in bytes (json-int) + +Example: + +-> { "execute": "block-set-write-threshold", + "arguments": { "node-name": "mydev", + "write-threshold": 17179869184 } } +<- { "return": {} } + +Show rocker switch +------------------ + +Arguments: + +- "name": switch name + +Example: + +-> { "execute": "query-rocker", "arguments": { "name": "sw1" } } +<- { "return": {"name": "sw1", "ports": 2, "id": 1327446905938}} + +Show rocker switch ports +------------------------ + +Arguments: + +- "name": switch name + +Example: + +-> { "execute": "query-rocker-ports", "arguments": { "name": "sw1" } } +<- { "return": [ {"duplex": "full", "enabled": true, "name": "sw1.1", + "autoneg": "off", "link-up": true, "speed": 10000}, + {"duplex": "full", "enabled": true, "name": "sw1.2", + "autoneg": "off", "link-up": true, "speed": 10000} + ]} + +Show rocker switch OF-DPA flow tables +------------------------------------- + +Arguments: + +- "name": switch name +- "tbl-id": (optional) flow 
table ID + +Example: + +-> { "execute": "query-rocker-of-dpa-flows", "arguments": { "name": "sw1" } } +<- { "return": [ {"key": {"in-pport": 0, "priority": 1, "tbl-id": 0}, + "hits": 138, + "cookie": 0, + "action": {"goto-tbl": 10}, + "mask": {"in-pport": 4294901760} + }, + {...more...}, + ]} + +Show rocker OF-DPA group tables +------------------------------- + +Arguments: + +- "name": switch name +- "type": (optional) group type + +Example: + +-> { "execute": "query-rocker-of-dpa-groups", "arguments": { "name": "sw1" } } +<- { "return": [ {"type": 0, "out-pport": 2, "pport": 2, "vlan-id": 3841, + "pop-vlan": 1, "id": 251723778}, + {"type": 0, "out-pport": 0, "pport": 0, "vlan-id": 3841, + "pop-vlan": 1, "id": 251723776}, + {"type": 0, "out-pport": 1, "pport": 1, "vlan-id": 3840, + "pop-vlan": 1, "id": 251658241}, + {"type": 0, "out-pport": 0, "pport": 0, "vlan-id": 3840, + "pop-vlan": 1, "id": 251658240} + ]} + +query-gic-capabilities +--------------- + +Return a list of GICCapability objects, describing supported GIC +(Generic Interrupt Controller) versions. + +Arguments: None + +Example: + +-> { "execute": "query-gic-capabilities" } +<- { "return": [{ "version": 2, "emulated": true, "kernel": false }, + { "version": 3, "emulated": false, "kernel": true } ] } + +Show existing/possible CPUs +--------------------------- + +Arguments: None. 
+ +Example for pseries machine type started with +-smp 2,cores=2,maxcpus=4 -cpu POWER8: + +-> { "execute": "query-hotpluggable-cpus" } +<- {"return": [ + { "props": { "core-id": 8 }, "type": "POWER8-spapr-cpu-core", + "vcpus-count": 1 }, + { "props": { "core-id": 0 }, "type": "POWER8-spapr-cpu-core", + "vcpus-count": 1, "qom-path": "/machine/unattached/device[0]"} + ]}' + +Example for pc machine type started with +-smp 1,maxcpus=2: + -> { "execute": "query-hotpluggable-cpus" } + <- {"return": [ + { + "type": "qemu64-x86_64-cpu", "vcpus-count": 1, + "props": {"core-id": 0, "socket-id": 1, "thread-id": 0} + }, + { + "qom-path": "/machine/unattached/device[0]", + "type": "qemu64-x86_64-cpu", "vcpus-count": 1, + "props": {"core-id": 0, "socket-id": 0, "thread-id": 0} + } + ]} diff --git a/docs/qmp-events.txt b/docs/qmp-events.txt index 7967ec4c5a..e0a2365c63 100644 --- a/docs/qmp-events.txt +++ b/docs/qmp-events.txt @@ -65,7 +65,12 @@ Emitted when a disk I/O error occurs. Data: -- "device": device name (json-string) +- "device": device name. This is always present for compatibility + reasons, but it can be empty ("") if the image does not + have a device name associated. (json-string) +- "node-name": node name. Note that errors may be reported for the root node + that is directly attached to a guest device rather than for the + node where the error occurred. (json-string) - "operation": I/O operation (json-string, "read" or "write") - "action": action that has been taken, it's one of the following (json-string): "ignore": error has been ignored @@ -76,6 +81,7 @@ Example: { "event": "BLOCK_IO_ERROR", "data": { "device": "ide0-hd1", + "node-name": "#block212", "operation": "write", "action": "stop" }, "timestamp": { "seconds": 1265044230, "microseconds": 450486 } } @@ -214,12 +220,16 @@ or by HMP/QMP commands. Data: -- "device": device name (json-string) +- "device": Block device name. 
This is always present for compatibility + reasons, but it can be empty ("") if the image does not have a + device name associated. (json-string) +- "id": The name or QOM path of the guest device (json-string) - "tray-open": true if the tray has been opened or false if it has been closed (json-bool) { "event": "DEVICE_TRAY_MOVED", "data": { "device": "ide1-cd0", + "id": "/machine/unattached/device[22]", "tray-open": true }, "timestamp": { "seconds": 1265044230, "microseconds": 450486 } } diff --git a/docs/rcu.txt b/docs/rcu.txt index 2f70954e82..c84e7f42b2 100644 --- a/docs/rcu.txt +++ b/docs/rcu.txt @@ -37,7 +37,7 @@ do not matter; as soon as all previous critical sections have finished, there cannot be any readers who hold references to the data structure, and these can now be safely reclaimed (e.g., freed or unref'ed). -Here is a picutre: +Here is a picture: thread 1 thread 2 thread 3 ------------------- ------------------------ ------------------- @@ -145,7 +145,7 @@ The core RCU API is small: and then read from there. RCU read-side critical sections must use atomic_rcu_read() to - read data, unless concurrent writes are presented by another + read data, unless concurrent writes are prevented by another synchronization mechanism. Furthermore, RCU read-side critical sections should traverse the diff --git a/docs/specs/acpi_nvdimm.txt b/docs/specs/acpi_nvdimm.txt index 0fdd251fc0..3f322e6f55 100644 --- a/docs/specs/acpi_nvdimm.txt +++ b/docs/specs/acpi_nvdimm.txt @@ -65,8 +65,8 @@ _FIT(Firmware Interface Table) The detailed definition of the structure can be found at ACPI 6.0: 5.2.25 NVDIMM Firmware Interface Table (NFIT). -QEMU NVDIMM Implemention -======================== +QEMU NVDIMM Implementation +========================== QEMU uses 4 bytes IO Port starting from 0x0a18 and a RAM-based memory page for NVDIMM ACPI. @@ -80,8 +80,17 @@ Memory: emulates _DSM access and writes the output data to it. 
ACPI writes _DSM Input Data (based on the offset in the page): - [0x0 - 0x3]: 4 bytes, NVDIMM Device Handle, 0 is reserved for NVDIMM - Root device. + [0x0 - 0x3]: 4 bytes, NVDIMM Device Handle. + + The handle is completely QEMU internal thing, the values in + range [1, 0xFFFF] indicate nvdimm device. Other values are + reserved for other purposes. + + Reserved handles: + 0 is reserved for nvdimm root device named NVDR. + 0x10000 is reserved for QEMU internal DSM function called on + the root device. + [0x4 - 0x7]: 4 bytes, Revision ID, that is the Arg1 of _DSM method. [0x8 - 0xB]: 4 bytes. Function Index, that is the Arg2 of _DSM method. [0xC - 0xFFF]: 4084 bytes, the Arg3 of _DSM method. @@ -127,6 +136,52 @@ _DSM process diagram: | result from the page | | | +--------------------------+ +--------------+ - _FIT implementation - ------------------- - TODO (will fill it when nvdimm hotplug is introduced) +NVDIMM hotplug +-------------- +ACPI BIOS GPE.4 handler is dedicated for notifying OS about nvdimm device +hot-add event. + +QEMU internal use only _DSM function +------------------------------------ +1) Read FIT + _FIT method uses _DSM method to fetch NFIT structures blob from QEMU + in 1 page sized increments which are then concatenated and returned + as _FIT method result. 
+ + Input parameters: + Arg0 – UUID {set to 648B9CF2-CDA1-4312-8AD9-49C4AF32BD62} + Arg1 – Revision ID (set to 1) + Arg2 - Function Index, 0x1 + Arg3 - A package containing a buffer whose layout is as follows: + + +----------+--------+--------+-------------------------------------------+ + | Field | Length | Offset | Description | + +----------+--------+--------+-------------------------------------------+ + | offset | 4 | 0 | offset in QEMU's NFIT structures blob to | + | | | | read from | + +----------+--------+--------+-------------------------------------------+ + + Output layout in the dsm memory page: + +----------+--------+--------+-------------------------------------------+ + | Field | Length | Offset | Description | + +----------+--------+--------+-------------------------------------------+ + | length | 4 | 0 | length of entire returned data | + | | | | (including this header) | + +----------+-----------------+-------------------------------------------+ + | | | | return status codes | + | | | | 0x0 - success | + | | | | 0x100 - error caused by NFIT update while | + | status | 4 | 4 | read by _FIT wasn't completed, other | + | | | | codes follow Chapter 3 in DSM Spec Rev1 | + +----------+-----------------+-------------------------------------------+ + | fit data | Varies | 8 | contains FIT data, this field is present | + | | | | if status field is 0; | + +----------+--------+--------+-------------------------------------------+ + + The FIT offset is maintained by the OSPM itself, current offset plus + the size of the fit data returned by the function is the next offset + OSPM should read. When all FIT data has been read out, zero fit data + size is returned. + + If it returns status code 0x100, OSPM should restart to read FIT (read + from offset 0 again). diff --git a/docs/specs/edu.txt b/docs/specs/edu.txt index 7f8146780b..0876310809 100644 --- a/docs/specs/edu.txt +++ b/docs/specs/edu.txt @@ -52,7 +52,7 @@ size == 8 for the rest. 
0x20 (RW) : status register, bitwise OR 0x01 -- computing factorial (RO) - 0x80 -- raise interrupt 0x01 after finishing factorial computation + 0x80 -- raise interrupt after finishing factorial computation 0x24 (RO) : interrupt status register It contains values which raised the interrupt (see interrupt raise @@ -87,6 +87,11 @@ An IRQ is generated when written to the interrupt raise register. The value appears in interrupt status register when the interrupt is raised and has to be written to the interrupt acknowledge register to lower it. +The device supports both INTx and MSI interrupt. By default, INTx is +used. Even if the driver disabled INTx and only uses MSI, it still +needs to update the acknowledge register at the end of the IRQ handler +routine. + DMA controller -------------- One has to specify, source, destination, size, and start the transfer. One diff --git a/docs/specs/ppc-spapr-hotplug.txt b/docs/specs/ppc-spapr-hotplug.txt index 631b0cadae..f57e2a09c6 100644 --- a/docs/specs/ppc-spapr-hotplug.txt +++ b/docs/specs/ppc-spapr-hotplug.txt @@ -233,12 +233,27 @@ tools by host-level management such as an HMC. This level of management is not applicable to PowerKVM, hence the reason for extending the notification framework to support hotplug events. -Note that these events are not yet formally part of the PAPR+ specification, -but support for this format has already been implemented in DR-related -guest tools such as powerpc-utils/librtas, as well as kernel patches that have -been submitted to handle in-kernel processing of memory/cpu-related hotplug -events[1], and is planned for formal inclusion is PAPR+ specification. The -hotplug-specific payload is QEMU implemented as follows (with all values +The format for these EPOW-signalled events is described below under +"hotplug/unplug event structure". 
Note that these events are not +formally part of the PAPR+ specification, and have been superseded by a +newer format, also described below under "hotplug/unplug event structure", +and so are now deemed a "legacy" format. The formats are similar, but the +"modern" format contains additional fields/flags, which are denoted for the +purposes of this documentation with "#ifdef GUEST_SUPPORTS_MODERN" guards. + +QEMU should assume support only for "legacy" fields/flags unless the guest +advertises support for the "modern" format via ibm,client-architecture-support +hcall by setting byte 5, bit 6 of it's ibm,architecture-vec-5 option vector +structure (as described by LoPAPR v11, B.6.2.3). As with "legacy" format events, +"modern" format events are surfaced to the guest via check-exception RTAS calls, +but use a dedicated event source to signal the guest. This event source is +advertised to the guest by the addition of a "hot-plug-events" node under +"/event-sources" node of the guest's device tree using the standard format +described in LoPAPR v11, B.6.12.1. 
+ +== hotplug/unplug event structure == + +The hotplug-specific payload in QEMU is implemented as follows (with all values encoded in big-endian format): struct rtas_event_log_v6_hp { @@ -263,14 +278,23 @@ struct rtas_event_log_v6_hp { #define RTAS_LOG_V6_HP_ACTION_ADD 1 #define RTAS_LOG_V6_HP_ACTION_REMOVE 2 uint8_t hotplug_action; /* action (add/remove) */ -#define RTAS_LOG_V6_HP_ID_DRC_NAME 1 -#define RTAS_LOG_V6_HP_ID_DRC_INDEX 2 -#define RTAS_LOG_V6_HP_ID_DRC_COUNT 3 +#define RTAS_LOG_V6_HP_ID_DRC_NAME 1 +#define RTAS_LOG_V6_HP_ID_DRC_INDEX 2 +#define RTAS_LOG_V6_HP_ID_DRC_COUNT 3 +#ifdef GUEST_SUPPORTS_MODERN +#define RTAS_LOG_V6_HP_ID_DRC_COUNT_INDEXED 4 +#endif uint8_t hotplug_identifier; /* type of the resource identifier, * which serves as the discriminator * for the 'drc' union field below */ +#ifdef GUEST_SUPPORTS_MODERN + uint8_t capabilities; /* capability flags, currently unused + * by QEMU + */ +#else uint8_t reserved; +#endif union { uint32_t index; /* DRC index of resource to take action * on @@ -278,6 +302,19 @@ struct rtas_event_log_v6_hp { uint32_t count; /* number of DR resources to take * action on (guest chooses which) */ +#ifdef GUEST_SUPPORTS_MODERN + struct { + uint32_t count; /* number of DR resources to take + * action on + */ + uint32_t index; /* DRC index of first resource to take + * action on. guest will take action + * on DRC index <index> through + * DRC index <index + count - 1> in + * sequential order + */ + } count_indexed; +#endif char name[1]; /* string representing the name of the * DRC to take action on */ diff --git a/docs/specs/vhost-user.txt b/docs/specs/vhost-user.txt index 7890d71698..d70bd83b13 100644 --- a/docs/specs/vhost-user.txt +++ b/docs/specs/vhost-user.txt @@ -123,22 +123,22 @@ The communication consists of master sending message requests and slave sending message replies. Most of the requests don't require replies. 
Here is a list of the ones that do:

- * VHOST_GET_FEATURES
- * VHOST_GET_PROTOCOL_FEATURES
- * VHOST_GET_VRING_BASE
- * VHOST_SET_LOG_BASE (if VHOST_USER_PROTOCOL_F_LOG_SHMFD)
+ * VHOST_USER_GET_FEATURES
+ * VHOST_USER_GET_PROTOCOL_FEATURES
+ * VHOST_USER_GET_VRING_BASE
+ * VHOST_USER_SET_LOG_BASE (if VHOST_USER_PROTOCOL_F_LOG_SHMFD)

 [ Also see the section on REPLY_ACK protocol extension. ]

 There are several messages that the master sends with file descriptors passed
 in the ancillary data:

- * VHOST_SET_MEM_TABLE
- * VHOST_SET_LOG_BASE (if VHOST_USER_PROTOCOL_F_LOG_SHMFD)
- * VHOST_SET_LOG_FD
- * VHOST_SET_VRING_KICK
- * VHOST_SET_VRING_CALL
- * VHOST_SET_VRING_ERR
+ * VHOST_USER_SET_MEM_TABLE
+ * VHOST_USER_SET_LOG_BASE (if VHOST_USER_PROTOCOL_F_LOG_SHMFD)
+ * VHOST_USER_SET_LOG_FD
+ * VHOST_USER_SET_VRING_KICK
+ * VHOST_USER_SET_VRING_CALL
+ * VHOST_USER_SET_VRING_ERR

 If Master is unable to send the full message or receives a wrong reply it
 will close the connection. An optional reconnection mechanism can be implemented.
diff --git a/docs/tcg-exclusive.promela b/docs/tcg-exclusive.promela
new file mode 100644
index 0000000000..c91cfca9f7
--- /dev/null
+++ b/docs/tcg-exclusive.promela
@@ -0,0 +1,225 @@
+/*
+ * This model describes the implementation of exclusive sections in
+ * cpus-common.c (start_exclusive, end_exclusive, cpu_exec_start,
+ * cpu_exec_end).
+ *
+ * Author: Paolo Bonzini <pbonzini@redhat.com>
+ *
+ * This file is in the public domain. If you really want a license,
+ * the WTFPL will do.
+ *
+ * To verify it:
+ * spin -a docs/tcg-exclusive.promela
+ * gcc pan.c -O2
+ * ./a.out -a
+ *
+ * Tunable processor macros: N_CPUS, N_EXCLUSIVE, N_CYCLES, USE_MUTEX,
+ * TEST_EXPENSIVE.
+ */ + +// Define the missing parameters for the model +#ifndef N_CPUS +#define N_CPUS 2 +#warning defaulting to 2 CPU processes +#endif + +// the expensive test is not so expensive for <= 2 CPUs +// If the mutex is used, it's also cheap (300 MB / 4 seconds) for 3 CPUs +// For 3 CPUs and the lock-free option it needs 1.5 GB of RAM +#if N_CPUS <= 2 || (N_CPUS <= 3 && defined USE_MUTEX) +#define TEST_EXPENSIVE +#endif + +#ifndef N_EXCLUSIVE +# if !defined N_CYCLES || N_CYCLES <= 1 || defined TEST_EXPENSIVE +# define N_EXCLUSIVE 2 +# warning defaulting to 2 concurrent exclusive sections +# else +# define N_EXCLUSIVE 1 +# warning defaulting to 1 concurrent exclusive sections +# endif +#endif +#ifndef N_CYCLES +# if N_EXCLUSIVE <= 1 || defined TEST_EXPENSIVE +# define N_CYCLES 2 +# warning defaulting to 2 CPU cycles +# else +# define N_CYCLES 1 +# warning defaulting to 1 CPU cycles +# endif +#endif + + +// synchronization primitives. condition variables require a +// process-local "cond_t saved;" variable. 
+ +#define mutex_t byte +#define MUTEX_LOCK(m) atomic { m == 0 -> m = 1 } +#define MUTEX_UNLOCK(m) m = 0 + +#define cond_t int +#define COND_WAIT(c, m) { \ + saved = c; \ + MUTEX_UNLOCK(m); \ + c != saved -> MUTEX_LOCK(m); \ + } +#define COND_BROADCAST(c) c++ + +// this is the logic from cpus-common.c + +mutex_t mutex; +cond_t exclusive_cond; +cond_t exclusive_resume; +byte pending_cpus; + +byte running[N_CPUS]; +byte has_waiter[N_CPUS]; + +#define exclusive_idle() \ + do \ + :: pending_cpus -> COND_WAIT(exclusive_resume, mutex); \ + :: else -> break; \ + od + +#define start_exclusive() \ + MUTEX_LOCK(mutex); \ + exclusive_idle(); \ + pending_cpus = 1; \ + \ + i = 0; \ + do \ + :: i < N_CPUS -> { \ + if \ + :: running[i] -> has_waiter[i] = 1; pending_cpus++; \ + :: else -> skip; \ + fi; \ + i++; \ + } \ + :: else -> break; \ + od; \ + \ + do \ + :: pending_cpus > 1 -> COND_WAIT(exclusive_cond, mutex); \ + :: else -> break; \ + od; \ + MUTEX_UNLOCK(mutex); + +#define end_exclusive() \ + MUTEX_LOCK(mutex); \ + pending_cpus = 0; \ + COND_BROADCAST(exclusive_resume); \ + MUTEX_UNLOCK(mutex); + +#ifdef USE_MUTEX +// Simple version using mutexes +#define cpu_exec_start(id) \ + MUTEX_LOCK(mutex); \ + exclusive_idle(); \ + running[id] = 1; \ + MUTEX_UNLOCK(mutex); + +#define cpu_exec_end(id) \ + MUTEX_LOCK(mutex); \ + running[id] = 0; \ + if \ + :: pending_cpus -> { \ + pending_cpus--; \ + if \ + :: pending_cpus == 1 -> COND_BROADCAST(exclusive_cond); \ + :: else -> skip; \ + fi; \ + } \ + :: else -> skip; \ + fi; \ + MUTEX_UNLOCK(mutex); +#else +// Wait-free fast path, only needs mutex when concurrent with +// an exclusive section +#define cpu_exec_start(id) \ + running[id] = 1; \ + if \ + :: pending_cpus -> { \ + MUTEX_LOCK(mutex); \ + if \ + :: !has_waiter[id] -> { \ + running[id] = 0; \ + exclusive_idle(); \ + running[id] = 1; \ + } \ + :: else -> skip; \ + fi; \ + MUTEX_UNLOCK(mutex); \ + } \ + :: else -> skip; \ + fi; + +#define cpu_exec_end(id) \ + running[id] = 0; 
\ + if \ + :: pending_cpus -> { \ + MUTEX_LOCK(mutex); \ + if \ + :: has_waiter[id] -> { \ + has_waiter[id] = 0; \ + pending_cpus--; \ + if \ + :: pending_cpus == 1 -> COND_BROADCAST(exclusive_cond); \ + :: else -> skip; \ + fi; \ + } \ + :: else -> skip; \ + fi; \ + MUTEX_UNLOCK(mutex); \ + } \ + :: else -> skip; \ + fi +#endif + +// Promela processes + +byte done_cpu; +byte in_cpu; +active[N_CPUS] proctype cpu() +{ + byte id = _pid % N_CPUS; + byte cycles = 0; + cond_t saved; + + do + :: cycles == N_CYCLES -> break; + :: else -> { + cycles++; + cpu_exec_start(id) + in_cpu++; + done_cpu++; + in_cpu--; + cpu_exec_end(id) + } + od; +} + +byte done_exclusive; +byte in_exclusive; +active[N_EXCLUSIVE] proctype exclusive() +{ + cond_t saved; + byte i; + + start_exclusive(); + in_exclusive = 1; + done_exclusive++; + in_exclusive = 0; + end_exclusive(); +} + +#define LIVENESS (done_cpu == N_CPUS * N_CYCLES && done_exclusive == N_EXCLUSIVE) +#define SAFETY !(in_exclusive && in_cpu) + +never { /* ! ([] SAFETY && <> [] LIVENESS) */ + do + // once the liveness property is satisfied, this is not executable + // and the never clause is not accepted + :: ! LIVENESS -> accept_liveness: skip + :: 1 -> assert(SAFETY) + od; +} diff --git a/docs/throttle.txt b/docs/throttle.txt index 26d4d5107f..cd4e109d39 100644 --- a/docs/throttle.txt +++ b/docs/throttle.txt @@ -235,7 +235,10 @@ consider the following values: - Water leaks from the bucket at a rate of 100 IOPS. - Water can be added to the bucket at a rate of 2000 IOPS. - The size of the bucket is 2000 x 60 = 120000 - - If 'iops-total-max' is unset then the bucket size is 100 x 60. + - If 'iops-total-max-length' is unset then it defaults to 1 and the + size of the bucket is 2000. + - If 'iops-total-max' is unset then 'iops-total-max-length' must be + unset as well. In this case the bucket size is 100. The bucket is initially empty, therefore water can be added until it's full at a rate of 2000 IOPS (the burst rate). 
Once the bucket is full
diff --git a/docs/tracing.txt b/docs/tracing.txt
index 29f2f9a24d..f351998a4e 100644
--- a/docs/tracing.txt
+++ b/docs/tracing.txt
@@ -150,13 +150,16 @@ The trace backends are chosen at configure time:
 For a list of supported trace backends, try ./configure --help or see below.
 If multiple backends are enabled, the trace is sent to them all.

+If no backends are explicitly selected, configure will default to the
+"log" backend.
+
 The following subsections describe the supported trace backends.

 === Nop ===

 The "nop" backend generates empty trace event functions so that the compiler
-can optimize out trace events completely. This is the default and imposes no
-performance penalty.
+can optimize out trace events completely. This imposes no performance
+penalty.

 Note that regardless of the selected trace backend, events with the "disable"
 property will be generated with the "nop" backend.
@@ -192,6 +195,18 @@ After running qemu by root user, you can get the trace:

 Restriction: "ftrace" backend is restricted to Linux only.

+=== Syslog ===
+
+The "syslog" backend sends trace events using the POSIX syslog API. The log
+is opened specifying the LOG_DAEMON facility and LOG_PID option (so events
+are tagged with the pid of the particular QEMU process that generated
+them). All events are logged at LOG_INFO level.
+
+NOTE: syslog may squash duplicate consecutive trace events and apply rate
+ limiting.
+
+Restriction: "syslog" backend is restricted to POSIX compliant OS.
+
 ==== Monitor commands ====

 * trace-file on|off|flush|set <path>
diff --git a/docs/writing-qmp-commands.txt b/docs/writing-qmp-commands.txt
index 59aa77ae25..44c14db418 100644
--- a/docs/writing-qmp-commands.txt
+++ b/docs/writing-qmp-commands.txt
@@ -7,8 +7,8 @@ This document doesn't discuss QMP protocol level details, nor does it dive
 into the QAPI framework implementation.

 For an in-depth introduction to the QAPI framework, please refer to
-docs/qapi-code-gen.txt.
For documentation about the QMP protocol, please -check the files in QMP/. +docs/qapi-code-gen.txt. For documentation about the QMP protocol, +start with docs/qmp-intro.txt. == Overview == @@ -119,17 +119,6 @@ There are a few things to be noticed: 5. Printing to the terminal is discouraged for QMP commands, we do it here because it's the easiest way to demonstrate a QMP command -Now a little hack is needed. As we're still using the old QMP server we need -to add the new command to its internal dispatch table. This step won't be -required in the near future. Open the qmp-commands.hx file and add the -following at the bottom: - - { - .name = "hello-world", - .args_type = "", - .mhandler.cmd_new = qmp_marshal_hello_world, - }, - You're done. Now build qemu, run it as suggested in the "Testing" section, and then type the following QMP command: @@ -174,21 +163,6 @@ There are two important details to be noticed: 2. The C implementation signature must follow the schema's argument ordering, which is defined by the "data" member -The last step is to update the qmp-commands.hx file: - - { - .name = "hello-world", - .args_type = "message:s?", - .mhandler.cmd_new = qmp_marshal_hello_world, - }, - -Notice that the "args_type" member got our "message" argument. The character -"s" stands for "string" and "?" means it's optional. This too must be ordered -according to the C implementation and schema file. You can look for more -examples in the qmp-commands.hx file if you need to define more arguments. - -Again, this step won't be required in the future. - Time to test our new version of the "hello-world" command. 
Build qemu, run it as described in the "Testing" section and then send two commands: @@ -337,7 +311,7 @@ we should add it to the hmp-commands.hx file: .args_type = "message:s?", .params = "hello-world [message]", .help = "Print message to the standard output", - .mhandler.cmd = hmp_hello_world, + .cmd = hmp_hello_world, }, STEXI @@ -454,14 +428,6 @@ There are a number of things to be noticed: 6. You have to include the "qmp-commands.h" header file in qemu-timer.c, otherwise qemu won't build -The last step is to add the correspoding entry in the qmp-commands.hx file: - - { - .name = "query-alarm-clock", - .args_type = "", - .mhandler.cmd_new = qmp_marshal_query_alarm_clock, - }, - Time to test the new command. Build qemu, run it as described in the "Testing" section and try this: @@ -518,7 +484,7 @@ in the monitor.c file. The entry for the "info alarmclock" follows: .args_type = "", .params = "", .help = "show information about the alarm clock", - .mhandler.info = hmp_info_alarm_clock, + .cmd = hmp_info_alarm_clock, }, To test this, run qemu and type "info alarmclock" in the user monitor. @@ -600,14 +566,6 @@ iteration of the loop. That's because the alarm timer method in use is the first element of the alarm_timers array. Also notice that QAPI lists are handled by hand and we return the head of the list. -To test this you have to add the corresponding qmp-commands.hx entry: - - { - .name = "query-alarm-methods", - .args_type = "", - .mhandler.cmd_new = qmp_marshal_query_alarm_methods, - }, - Now Build qemu, run it as explained in the "Testing" section and try our new command: diff --git a/docs/xbzrle.txt b/docs/xbzrle.txt index 52c8511a4c..c0a7dfd44c 100644 --- a/docs/xbzrle.txt +++ b/docs/xbzrle.txt @@ -42,7 +42,7 @@ nzrun = length byte... length = uleb128 encoded integer On the sender side XBZRLE is used as a compact delta encoding of page updates, -retrieving the old page content from the cache (default size of 512 MB). 
The
+retrieving the old page content from the cache (default size of 64MB). The
 receiving side uses the existing page's content and XBZRLE to decode the new
 page's content.

@@ -73,7 +73,7 @@ e9 07 0f 01 02 03 04 05 06 07 08 09 0a 0b 0c 0d 0e 0f 03 01 67 01 01 69

 Cache update strategy
 =====================
-Keeping the hot pages in the cache is effective for decreased cache
+Keeping the hot pages in the cache is effective for decreasing cache
 misses. XBZRLE uses a counter as the age of each page. The counter will
 increase after each ram dirty bitmap sync. When a cache conflict is
 detected, XBZRLE will only evict pages in the cache that are older than
diff --git a/docs/xen-save-devices-state.txt b/docs/xen-save-devices-state.txt
index 92e08dbf6a..a72ecc8081 100644
--- a/docs/xen-save-devices-state.txt
+++ b/docs/xen-save-devices-state.txt
@@ -9,7 +9,7 @@ however it is also possible to save the state of all devices to file,
 without saving the RAM or the block devices of the VM.

 This operation is called "xen-save-devices-state" (see
-QMP/qmp-commands.txt)
+qmp-commands.txt)

 The binary format used in the file is the following:
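
Reviewer note on the docs/throttle.txt hunk in this diff: the new wording
gives three cases for the burst bucket size (2000 x 60 = 120000 when both
'iops-total-max' and 'iops-total-max-length' are set, 2000 when the length
is unset and defaults to 1, and 100 when 'iops-total-max' is unset). The
sketch below only restates that arithmetic in C. The function name, the
parameter names and the use of 0 for "unset" are assumptions made for
illustration; this is not QEMU's throttling code and may not match how
QEMU computes the value internally.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Illustrative helper, NOT a QEMU function: computes the burst bucket size
 * the way the docs/throttle.txt text describes it. A value of 0 stands for
 * "option unset" (an assumption made for this sketch).
 *
 *   avg        - sustained rate, e.g. 'iops-total'
 *   max        - burst rate, e.g. 'iops-total-max'
 *   max_length - burst duration in seconds, e.g. 'iops-total-max-length'
 *
 * Returns false for the combination the text rules out: a burst length
 * given without a burst rate.
 */
bool throttle_bucket_size(uint64_t avg, uint64_t max, uint64_t max_length,
                          uint64_t *size)
{
    assert(size != NULL);

    if (max == 0) {
        if (max_length != 0) {
            return false;       /* max unset requires max-length unset */
        }
        *size = avg;            /* text: bucket size is 100 for avg = 100 */
        return true;
    }
    if (max_length == 0) {
        max_length = 1;         /* text: an unset length defaults to 1 */
    }
    *size = max * max_length;   /* text: 2000 x 60 = 120000 */
    return true;
}
```

Called with the values from the text (avg 100, max 2000, length 60) the
helper stores 120000; with the length left at 0 it stores 2000, and with
both burst options left at 0 it stores 100.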