summaryrefslogtreecommitdiff
path: root/docs/specs
diff options
context:
space:
mode:
Diffstat (limited to 'docs/specs')
-rw-r--r--docs/specs/fw_cfg.txt94
-rw-r--r--docs/specs/ivshmem_device_spec.txt127
-rw-r--r--docs/specs/ppc-spapr-hcalls.txt4
-rw-r--r--docs/specs/ppc-spapr-hotplug.txt48
-rw-r--r--docs/specs/qcow2.txt2
-rw-r--r--docs/specs/rocker.txt2
-rw-r--r--docs/specs/vhost-user.txt208
7 files changed, 442 insertions, 43 deletions
diff --git a/docs/specs/fw_cfg.txt b/docs/specs/fw_cfg.txt
index 74351dd18..b8c794f54 100644
--- a/docs/specs/fw_cfg.txt
+++ b/docs/specs/fw_cfg.txt
@@ -76,6 +76,13 @@ increasing address order, similar to memcpy().
Selector Register IOport: 0x510
Data Register IOport: 0x511
+DMA Address IOport: 0x514
+
+=== ARM Register Locations ===
+
+Selector Register address: Base + 8 (2 bytes)
+Data Register address: Base + 0 (8 bytes)
+DMA Address address: Base + 16 (8 bytes)
== Firmware Configuration Items ==
@@ -86,11 +93,15 @@ by selecting the "signature" item using key 0x0000 (FW_CFG_SIGNATURE),
and reading four bytes from the data register. If the fw_cfg device is
present, the four bytes read will contain the characters "QEMU".
-=== Revision (Key 0x0001, FW_CFG_ID) ===
+If the DMA interface is available, then reading the DMA Address
+Register returns 0x51454d5520434647 ("QEMU CFG" in big-endian format).
+
+=== Revision / feature bitmap (Key 0x0001, FW_CFG_ID) ===
-A 32-bit little-endian unsigned int, this item is used as an interface
-revision number, and is currently set to 1 by QEMU when fw_cfg is
-initialized.
+A 32-bit little-endian unsigned int, this item is used to check for enabled
+features.
+ - Bit 0: traditional interface. Always set.
+ - Bit 1: DMA interface.
=== File Directory (Key 0x0019, FW_CFG_FILE_DIR) ===
@@ -132,6 +143,55 @@ Selector Reg. Range Usage
In practice, the number of allowed firmware configuration items is given
by the value of FW_CFG_MAX_ENTRY (see fw_cfg.h).
+= Guest-side DMA Interface =
+
+If bit 1 of the feature bitmap is set, the DMA interface is present. This does
+not replace the existing fw_cfg interface, it is an add-on. This interface
+can be used through the 64-bit wide address register.
+
+The address register is in big-endian format. The value for the register is 0
+at startup and after an operation. A write to the least significant half (at
+offset 4) triggers an operation. This means that operations with 32-bit
+addresses can be triggered with just one write, whereas operations with
+64-bit addresses can be triggered with one 64-bit write or two 32-bit writes,
+starting with the most significant half (at offset 0).
+
+In this register, the physical address of a FWCfgDmaAccess structure in RAM
+should be written. This is the format of the FWCfgDmaAccess structure:
+
+typedef struct FWCfgDmaAccess {
+ uint32_t control;
+ uint32_t length;
+ uint64_t address;
+} FWCfgDmaAccess;
+
+The fields of the structure are in big endian mode, and the field at the lowest
+address is the "control" field.
+
+The "control" field has the following bits:
+ - Bit 0: Error
+ - Bit 1: Read
+ - Bit 2: Skip
+ - Bit 3: Select. The upper 16 bits are the selected index.
+
+When an operation is triggered, if the "control" field has bit 3 set, the
+upper 16 bits are interpreted as an index of a firmware configuration item.
+This has the same effect as writing the selector register.
+
+If the "control" field has bit 1 set, a read operation will be performed.
+"length" bytes for the current selector and offset will be copied into the
+physical RAM address specified by the "address" field.
+
+If the "control" field has bit 2 set (and not bit 1), a skip operation will be
+performed. The offset for the current selector will be advanced "length" bytes.
+
+To check the result, read the "control" field:
+ error bit set -> something went wrong.
+ all bits cleared -> transfer finished successfully.
+ otherwise -> transfer still in progress (doesn't happen
+ today due to implementation not being async,
+ but may in the future).
+
= Host-side API =
The following functions are available to the QEMU programmer for adding
@@ -159,6 +219,17 @@ will convert a 16-, 32-, or 64-bit integer to little-endian, then add
a dynamically allocated copy of the appropriately sized item to fw_cfg
under the given selector key value.
+== fw_cfg_modify_iXX() ==
+
+Modify the value of an XX-bit item (where XX may be 16, 32, or 64).
+Similarly to the corresponding fw_cfg_add_iXX() function set, convert
+a 16-, 32-, or 64-bit integer to little endian, create a dynamically
+allocated copy of the required size, and replace the existing item at
+the given selector key value with the newly allocated one. The previous
+item, assumed to have been allocated during an earlier call to
+fw_cfg_add_iXX() or fw_cfg_modify_iXX() (of the same width XX), is freed
+before the function returns.
+
== fw_cfg_add_file() ==
Given a filename (i.e., fw_cfg item name), starting pointer, and size,
@@ -216,6 +287,21 @@ the following syntax:
where <item_name> is the fw_cfg item name, and <path> is the location
on the host file system of a file containing the data to be inserted.
+Small enough items may be provided directly as strings on the command
+line, using the syntax:
+
+ -fw_cfg [name=]<item_name>,string=<string>
+
+The terminating NUL character of the content <string> will NOT be
+included as part of the fw_cfg item data, which is consistent with
+the absence of a NUL terminator for items inserted via the file option.
+
+Both <item_name> and, if applicable, the content <string> are passed
+through by QEMU without any interpretation, expansion, or further
+processing. Any such processing (potentially performed e.g., by the shell)
+is outside of QEMU's responsibility; as such, using plain ASCII characters
+is recommended.
+
NOTE: Users *SHOULD* choose item names beginning with the prefix "opt/"
when using the "-fw_cfg" command line option, to avoid conflicting with
item names used internally by QEMU. For instance:
diff --git a/docs/specs/ivshmem_device_spec.txt b/docs/specs/ivshmem_device_spec.txt
index 667a8628f..d318d65c3 100644
--- a/docs/specs/ivshmem_device_spec.txt
+++ b/docs/specs/ivshmem_device_spec.txt
@@ -2,30 +2,106 @@
Device Specification for Inter-VM shared memory device
------------------------------------------------------
-The Inter-VM shared memory device is designed to share a region of memory to
-userspace in multiple virtual guests. The memory region does not belong to any
-guest, but is a POSIX memory object on the host. Optionally, the device may
-support sending interrupts to other guests sharing the same memory region.
+The Inter-VM shared memory device is designed to share a memory region (created
+on the host via the POSIX shared memory API) between multiple QEMU processes
+running different guests. In order for all guests to be able to pick up the
+shared memory area, it is modeled by QEMU as a PCI device exposing said memory
+to the guest as a PCI BAR.
+The memory region does not belong to any guest, but is a POSIX memory object on
+the host. The host can access this shared memory if needed.
+
+The device also provides an optional communication mechanism between guests
+sharing the same memory object. More details about that in the section 'Guest to
+guest communication' section.
The Inter-VM PCI device
-----------------------
-*BARs*
+From the VM point of view, the ivshmem PCI device supports three BARs.
+
+- BAR0 is a 1 Kbyte MMIO region to support registers and interrupts when MSI is
+ not used.
+- BAR1 is used for MSI-X when it is enabled in the device.
+- BAR2 is used to access the shared memory object.
+
+It is your choice how to use the device but you must choose between two
+behaviors :
+
+- basically, if you only need the shared memory part, you will map BAR2.
+ This way, you have access to the shared memory in guest and can use it as you
+ see fit (memnic, for example, uses it in userland
+ http://dpdk.org/browse/memnic).
+
+- BAR0 and BAR1 are used to implement an optional communication mechanism
+ through interrupts in the guests. If you need an event mechanism between the
+ guests accessing the shared memory, you will most likely want to write a
+ kernel driver that will handle interrupts. See details in the section 'Guest
+ to guest communication' section.
+
+The behavior is chosen when starting your QEMU processes:
+- no communication mechanism needed, the first QEMU to start creates the shared
+ memory on the host, subsequent QEMU processes will use it.
+
+- communication mechanism needed, an ivshmem server must be started before any
+ QEMU processes, then each QEMU process connects to the server unix socket.
+
+For more details on the QEMU ivshmem parameters, see qemu-doc documentation.
+
+
+Guest to guest communication
+----------------------------
+
+This section details the communication mechanism between the guests accessing
+the ivhsmem shared memory.
-The device supports three BARs. BAR0 is a 1 Kbyte MMIO region to support
-registers. BAR1 is used for MSI-X when it is enabled in the device. BAR2 is
-used to map the shared memory object from the host. The size of BAR2 is
-specified when the guest is started and must be a power of 2 in size.
+*ivshmem server*
-*Registers*
+This server code is available in qemu.git/contrib/ivshmem-server.
-The device currently supports 4 registers of 32-bits each. Registers
-are used for synchronization between guests sharing the same memory object when
-interrupts are supported (this requires using the shared memory server).
+The server must be started on the host before any guest.
+It creates a shared memory object then waits for clients to connect on a unix
+socket. All the messages are little-endian int64_t integer.
-The server assigns each VM an ID number and sends this ID number to the QEMU
-process when the guest starts.
+For each client (QEMU process) that connects to the server:
+- the server sends a protocol version, if client does not support it, the client
+ closes the communication,
+- the server assigns an ID for this client and sends this ID to him as the first
+ message,
+- the server sends a fd to the shared memory object to this client,
+- the server creates a new set of host eventfds associated to the new client and
+ sends this set to all already connected clients,
+- finally, the server sends all the eventfds sets for all clients to the new
+ client.
+
+The server signals all clients when one of them disconnects.
+
+The client IDs are limited to 16 bits because of the current implementation (see
+Doorbell register in 'PCI device registers' subsection). Hence only 65536
+clients are supported.
+
+All the file descriptors (fd to the shared memory, eventfds for each client)
+are passed to clients using SCM_RIGHTS over the server unix socket.
+
+Apart from the current ivshmem implementation in QEMU, an ivshmem client has
+been provided in qemu.git/contrib/ivshmem-client for debug.
+
+*QEMU as an ivshmem client*
+
+At initialisation, when creating the ivshmem device, QEMU first receives a
+protocol version and closes communication with server if it does not match.
+Then, QEMU gets its ID from the server then makes it available through BAR0
+IVPosition register for the VM to use (see 'PCI device registers' subsection).
+QEMU then uses the fd to the shared memory to map it to BAR2.
+eventfds for all other clients received from the server are stored to implement
+BAR0 Doorbell register (see 'PCI device registers' subsection).
+Finally, eventfds assigned to this QEMU process are used to send interrupts in
+this VM.
+
+*PCI device registers*
+
+From the VM point of view, the ivshmem PCI device supports 4 registers of
+32-bits each.
enum ivshmem_registers {
IntrMask = 0,
@@ -49,8 +125,8 @@ bit to 0 and unmasked by setting the first bit to 1.
IVPosition Register: The IVPosition register is read-only and reports the
guest's ID number. The guest IDs are non-negative integers. When using the
server, since the server is a separate process, the VM ID will only be set when
-the device is ready (shared memory is received from the server and accessible via
-the device). If the device is not ready, the IVPosition will return -1.
+the device is ready (shared memory is received from the server and accessible
+via the device). If the device is not ready, the IVPosition will return -1.
Applications should ensure that they have a valid VM ID before accessing the
shared memory.
@@ -59,8 +135,8 @@ Doorbell register. The doorbell register is 32-bits, logically divided into
two 16-bit fields. The high 16-bits are the guest ID to interrupt and the low
16-bits are the interrupt vector to trigger. The semantics of the value
written to the doorbell depends on whether the device is using MSI or a regular
-pin-based interrupt. In short, MSI uses vectors while regular interrupts set the
-status register.
+pin-based interrupt. In short, MSI uses vectors while regular interrupts set
+the status register.
Regular Interrupts
@@ -71,7 +147,7 @@ interrupt in the destination guest.
Message Signalled Interrupts
-A ivshmem device may support multiple MSI vectors. If so, the lower 16-bits
+An ivshmem device may support multiple MSI vectors. If so, the lower 16-bits
written to the Doorbell register must be between 0 and the maximum number of
vectors the guest supports. The lower 16 bits written to the doorbell is the
MSI vector that will be raised in the destination guest. The number of MSI
@@ -83,14 +159,3 @@ interrupt itself should be communicated via the shared memory region. Devices
supporting multiple MSI vectors can use different vectors to indicate different
events have occurred. The semantics of interrupt vectors are left to the
user's discretion.
-
-
-Usage in the Guest
-------------------
-
-The shared memory device is intended to be used with the provided UIO driver.
-Very little configuration is needed. The guest should map BAR0 to access the
-registers (an array of 32-bit ints allows simple writing) and map BAR2 to
-access the shared memory region itself. The size of the shared memory region
-is specified when the guest (or shared memory server) is started. A guest may
-map the whole shared memory region or only part of it.
diff --git a/docs/specs/ppc-spapr-hcalls.txt b/docs/specs/ppc-spapr-hcalls.txt
index 667b3fa00..5bd8eab78 100644
--- a/docs/specs/ppc-spapr-hcalls.txt
+++ b/docs/specs/ppc-spapr-hcalls.txt
@@ -41,8 +41,8 @@ When the guest runs in "real mode" (in powerpc lingua this means
with MMU disabled, ie guest effective == guest physical), it only
has access to a subset of memory and no IOs.
-PAPR provides a set of hypervisor calls to perform cachable or
-non-cachable accesses to any guest physical addresses that the
+PAPR provides a set of hypervisor calls to perform cacheable or
+non-cacheable accesses to any guest physical addresses that the
guest can use in order to access IO devices while in real mode.
This is typically used by the firmware running in the guest.
diff --git a/docs/specs/ppc-spapr-hotplug.txt b/docs/specs/ppc-spapr-hotplug.txt
index 46e07196b..631b0cada 100644
--- a/docs/specs/ppc-spapr-hotplug.txt
+++ b/docs/specs/ppc-spapr-hotplug.txt
@@ -302,4 +302,52 @@ consisting of <phys>, <size> and <maxcpus>.
pseries guests use this property to note the maximum allowed CPUs for the
guest.
+== ibm,dynamic-reconfiguration-memory ==
+
+ibm,dynamic-reconfiguration-memory is a device tree node that represents
+dynamically reconfigurable logical memory blocks (LMB). This node
+is generated only when the guest advertises the support for it via
+ibm,client-architecture-support call. Memory that is not dynamically
+reconfigurable is represented by /memory nodes. The properties of this
+node that are of interest to the sPAPR memory hotplug implementation
+in QEMU are described here.
+
+ibm,lmb-size
+
+This 64bit integer defines the size of each dynamically reconfigurable LMB.
+
+ibm,associativity-lookup-arrays
+
+This property defines a lookup array in which the NUMA associativity
+information for each LMB can be found. It is a property encoded array
+that begins with an integer M, the number of associativity lists followed
+by an integer N, the number of entries per associativity list and terminated
+by M associativity lists each of length N integers.
+
+This property provides the same information as given by ibm,associativity
+property in a /memory node. Each assigned LMB has an index value between
+0 and M-1 which is used as an index into this table to select which
+associativity list to use for the LMB. This index value for each LMB
+is defined in ibm,dynamic-memory property.
+
+ibm,dynamic-memory
+
+This property describes the dynamically reconfigurable memory. It is a
+property encoded array that has an integer N, the number of LMBs followed
+by N LMB list entires.
+
+Each LMB list entry consists of the following elements:
+
+- Logical address of the start of the LMB encoded as a 64bit integer. This
+ corresponds to reg property in /memory node.
+- DRC index of the LMB that corresponds to ibm,my-drc-index property
+ in a /memory node.
+- Four bytes reserved for expansion.
+- Associativity list index for the LMB that is used as an index into
+ ibm,associativity-lookup-arrays property described earlier. This
+ is used to retrieve the right associativity list to be used for this
+ LMB.
+- A 32bit flags word. The bit at bit position 0x00000008 defines whether
+ the LMB is assigned to the the partition as of boot time.
+
[1] http://thread.gmane.org/gmane.linux.ports.ppc.embedded/75350/focus=106867
diff --git a/docs/specs/qcow2.txt b/docs/specs/qcow2.txt
index 121dfc8cc..f236d8c6d 100644
--- a/docs/specs/qcow2.txt
+++ b/docs/specs/qcow2.txt
@@ -257,7 +257,7 @@ L2 table entry:
63: 0 for a cluster that is unused or requires COW, 1 if its
refcount is exactly one. This information is only accurate
- in L2 tables that are reachable from the the active L1
+ in L2 tables that are reachable from the active L1
table.
Standard Cluster Descriptor:
diff --git a/docs/specs/rocker.txt b/docs/specs/rocker.txt
index 1c743515c..d2a82624f 100644
--- a/docs/specs/rocker.txt
+++ b/docs/specs/rocker.txt
@@ -297,7 +297,7 @@ but not fired. If only partial credits are returned, the interrupt remains
masked but the device generates an interrupt, signaling the driver that more
outstanding work is available.
-(* this masking is unrelated to to the MSI-X interrupt mask register)
+(* this masking is unrelated to the MSI-X interrupt mask register)
Endianness
----------
diff --git a/docs/specs/vhost-user.txt b/docs/specs/vhost-user.txt
index 650bb1818..0312d40af 100644
--- a/docs/specs/vhost-user.txt
+++ b/docs/specs/vhost-user.txt
@@ -87,6 +87,14 @@ Depending on the request type, payload can be:
User address: a 64-bit user address
mmap offset: 64-bit offset where region starts in the mapped memory
+* Log description
+ ---------------------------
+ | log size | log offset |
+ ---------------------------
+ log size: size of area used for logging
+ log offset: offset from start of supplied file descriptor
+ where logging starts (i.e. where guest address 0 would be logged)
+
In QEMU the vhost-user message is implemented with the following struct:
typedef struct VhostUserMsg {
@@ -98,6 +106,7 @@ typedef struct VhostUserMsg {
struct vhost_vring_state state;
struct vhost_vring_addr addr;
VhostUserMemory memory;
+ VhostUserLog log;
};
} QEMU_PACKED VhostUserMsg;
@@ -113,12 +122,15 @@ message replies. Most of the requests don't require replies. Here is a list of
the ones that do:
* VHOST_GET_FEATURES
+ * VHOST_GET_PROTOCOL_FEATURES
* VHOST_GET_VRING_BASE
+ * VHOST_SET_LOG_BASE (if VHOST_USER_PROTOCOL_F_LOG_SHMFD)
There are several messages that the master sends with file descriptors passed
in the ancillary data:
* VHOST_SET_MEM_TABLE
+ * VHOST_SET_LOG_BASE (if VHOST_USER_PROTOCOL_F_LOG_SHMFD)
* VHOST_SET_LOG_FD
* VHOST_SET_VRING_KICK
* VHOST_SET_VRING_CALL
@@ -127,6 +139,122 @@ in the ancillary data:
If Master is unable to send the full message or receives a wrong reply it will
close the connection. An optional reconnection mechanism can be implemented.
+Any protocol extensions are gated by protocol feature bits,
+which allows full backwards compatibility on both master
+and slave.
+As older slaves don't support negotiating protocol features,
+a feature bit was dedicated for this purpose:
+#define VHOST_USER_F_PROTOCOL_FEATURES 30
+
+Starting and stopping rings
+----------------------
+Client must only process each ring when it is started.
+
+Client must only pass data between the ring and the
+backend, when the ring is enabled.
+
+If ring is started but disabled, client must process the
+ring without talking to the backend.
+
+For example, for a networking device, in the disabled state
+client must not supply any new RX packets, but must process
+and discard any TX packets.
+
+If VHOST_USER_F_PROTOCOL_FEATURES has not been negotiated, the ring is initialized
+in an enabled state.
+
+If VHOST_USER_F_PROTOCOL_FEATURES has been negotiated, the ring is initialized
+in a disabled state. Client must not pass data to/from the backend until ring is enabled by
+VHOST_USER_SET_VRING_ENABLE with parameter 1, or after it has been disabled by
+VHOST_USER_SET_VRING_ENABLE with parameter 0.
+
+Each ring is initialized in a stopped state, client must not process it until
+ring is started, or after it has been stopped.
+
+Client must start ring upon receiving a kick (that is, detecting that file
+descriptor is readable) on the descriptor specified by
+VHOST_USER_SET_VRING_KICK, and stop ring upon receiving
+VHOST_USER_GET_VRING_BASE.
+
+While processing the rings (whether they are enabled or not), client must
+support changing some configuration aspects on the fly.
+
+Multiple queue support
+----------------------
+
+Multiple queue is treated as a protocol extension, hence the slave has to
+implement protocol features first. The multiple queues feature is supported
+only when the protocol feature VHOST_USER_PROTOCOL_F_MQ (bit 0) is set.
+
+The max number of queues the slave supports can be queried with message
+VHOST_USER_GET_PROTOCOL_FEATURES. Master should stop when the number of
+requested queues is bigger than that.
+
+As all queues share one connection, the master uses a unique index for each
+queue in the sent message to identify a specified queue. One queue pair
+is enabled initially. More queues are enabled dynamically, by sending
+message VHOST_USER_SET_VRING_ENABLE.
+
+Migration
+---------
+
+During live migration, the master may need to track the modifications
+the slave makes to the memory mapped regions. The client should mark
+the dirty pages in a log. Once it complies to this logging, it may
+declare the VHOST_F_LOG_ALL vhost feature.
+
+To start/stop logging of data/used ring writes, server may send messages
+VHOST_USER_SET_FEATURES with VHOST_F_LOG_ALL and VHOST_USER_SET_VRING_ADDR with
+VHOST_VRING_F_LOG in ring's flags set to 1/0, respectively.
+
+All the modifications to memory pointed by vring "descriptor" should
+be marked. Modifications to "used" vring should be marked if
+VHOST_VRING_F_LOG is part of ring's flags.
+
+Dirty pages are of size:
+#define VHOST_LOG_PAGE 0x1000
+
+The log memory fd is provided in the ancillary data of
+VHOST_USER_SET_LOG_BASE message when the slave has
+VHOST_USER_PROTOCOL_F_LOG_SHMFD protocol feature.
+
+The size of the log is supplied as part of VhostUserMsg
+which should be large enough to cover all known guest
+addresses. Log starts at the supplied offset in the
+supplied file descriptor.
+The log covers from address 0 to the maximum of guest
+regions. In pseudo-code, to mark page at "addr" as dirty:
+
+page = addr / VHOST_LOG_PAGE
+log[page / 8] |= 1 << page % 8
+
+Where addr is the guest physical address.
+
+Use atomic operations, as the log may be concurrently manipulated.
+
+Note that when logging modifications to the used ring (when VHOST_VRING_F_LOG
+is set for this ring), log_guest_addr should be used to calculate the log
+offset: the write to first byte of the used ring is logged at this offset from
+log start. Also note that this value might be outside the legal guest physical
+address range (i.e. does not have to be covered by the VhostUserMemory table),
+but the bit offset of the last byte of the ring must fall within
+the size supplied by VhostUserLog.
+
+VHOST_USER_SET_LOG_FD is an optional message with an eventfd in
+ancillary data, it may be used to inform the master that the log has
+been modified.
+
+Once the source has finished migration, rings will be stopped by
+the source. No further update must be done before rings are
+restarted.
+
+Protocol features
+-----------------
+
+#define VHOST_USER_PROTOCOL_F_MQ 0
+#define VHOST_USER_PROTOCOL_F_LOG_SHMFD 1
+#define VHOST_USER_PROTOCOL_F_RARP 2
+
Message types
-------------
@@ -138,6 +266,8 @@ Message types
Slave payload: u64
Get from the underlying vhost implementation the features bitmask.
+ Feature bit VHOST_USER_F_PROTOCOL_FEATURES signals slave support for
+ VHOST_USER_GET_PROTOCOL_FEATURES and VHOST_USER_SET_PROTOCOL_FEATURES.
* VHOST_USER_SET_FEATURES
@@ -146,6 +276,33 @@ Message types
Master payload: u64
Enable features in the underlying vhost implementation using a bitmask.
+ Feature bit VHOST_USER_F_PROTOCOL_FEATURES signals slave support for
+ VHOST_USER_GET_PROTOCOL_FEATURES and VHOST_USER_SET_PROTOCOL_FEATURES.
+
+ * VHOST_USER_GET_PROTOCOL_FEATURES
+
+ Id: 15
+ Equivalent ioctl: VHOST_GET_FEATURES
+ Master payload: N/A
+ Slave payload: u64
+
+ Get the protocol feature bitmask from the underlying vhost implementation.
+ Only legal if feature bit VHOST_USER_F_PROTOCOL_FEATURES is present in
+ VHOST_USER_GET_FEATURES.
+ Note: slave that reported VHOST_USER_F_PROTOCOL_FEATURES must support
+ this message even before VHOST_USER_SET_FEATURES was called.
+
+ * VHOST_USER_SET_PROTOCOL_FEATURES
+
+ Id: 16
+ Ioctl: VHOST_SET_FEATURES
+ Master payload: u64
+
+ Enable protocol features in the underlying vhost implementation.
+ Only legal if feature bit VHOST_USER_F_PROTOCOL_FEATURES is present in
+ VHOST_USER_GET_FEATURES.
+ Note: slave that reported VHOST_USER_F_PROTOCOL_FEATURES must support
+ this message even before VHOST_USER_SET_FEATURES was called.
* VHOST_USER_SET_OWNER
@@ -160,11 +317,13 @@ Message types
* VHOST_USER_RESET_OWNER
Id: 4
- Equivalent ioctl: VHOST_RESET_OWNER
Master payload: N/A
- Issued when a new connection is about to be closed. The Master will no
- longer own this connection (and will usually close it).
+ This is no longer used. Used to be sent to request disabling
+ all rings, but some clients interpreted it to also discard
+ connection state (this interpretation would lead to bugs).
+ It is recommended that clients either ignore this message,
+ or use it to disable all rings.
* VHOST_USER_SET_MEM_TABLE
@@ -182,8 +341,14 @@ Message types
Id: 6
Equivalent ioctl: VHOST_SET_LOG_BASE
Master payload: u64
+ Slave payload: N/A
+
+ Sets logging shared memory space.
+ When slave has VHOST_USER_PROTOCOL_F_LOG_SHMFD protocol
+ feature, the log memory fd is provided in the ancillary data of
+ VHOST_USER_SET_LOG_BASE message, the size and offset of shared
+ memory area provided in the message.
- Sets the logging base address.
* VHOST_USER_SET_LOG_FD
@@ -264,3 +429,38 @@ Message types
Bits (0-7) of the payload contain the vring index. Bit 8 is the
invalid FD flag. This flag is set when there is no file descriptor
in the ancillary data.
+
+ * VHOST_USER_GET_QUEUE_NUM
+
+ Id: 17
+ Equivalent ioctl: N/A
+ Master payload: N/A
+ Slave payload: u64
+
+ Query how many queues the backend supports. This request should be
+ sent only when VHOST_USER_PROTOCOL_F_MQ is set in quried protocol
+ features by VHOST_USER_GET_PROTOCOL_FEATURES.
+
+ * VHOST_USER_SET_VRING_ENABLE
+
+ Id: 18
+ Equivalent ioctl: N/A
+ Master payload: vring state description
+
+ Signal slave to enable or disable corresponding vring.
+ This request should be sent only when VHOST_USER_F_PROTOCOL_FEATURES
+ has been negotiated.
+
+ * VHOST_USER_SEND_RARP
+
+ Id: 19
+ Equivalent ioctl: N/A
+ Master payload: u64
+
+ Ask vhost user backend to broadcast a fake RARP to notify the migration
+ is terminated for guest that does not support GUEST_ANNOUNCE.
+ Only legal if feature bit VHOST_USER_F_PROTOCOL_FEATURES is present in
+ VHOST_USER_GET_FEATURES and protocol feature bit VHOST_USER_PROTOCOL_F_RARP
+ is present in VHOST_USER_GET_PROTOCOL_FEATURES.
+ The first 6 bytes of the payload contain the mac address of the guest to
+ allow the vhost user backend to construct and broadcast the fake RARP.