D-Bus is a system for low-latency, low-overhead, easy to use interprocess communication (IPC). The focus of this document is an overview of the low-level, native kernel D-Bus transport called kdbus. Kdbus in the kernel acts similar to a device driver, all communication between processes take place over special device nodes in /dev/kdbus/. For the general D-Bus protocol specification, the payload format, the marshalling, the communication semantics, please refer to: http://dbus.freedesktop.org/doc/dbus-specification.html For a kdbus specific userspace library implementation please refer to: http://cgit.freedesktop.org/systemd/systemd/tree/src/systemd/sd-bus.h http://cgit.freedesktop.org/systemd/systemd/tree/src/systemd/sd-memfd.h =============================================================================== Terminology =============================================================================== Namespace: A namespace is a named object containing a number of buses. A system container which contains its own init system and users usually also runs in its own kdbus namespace. The /dev/kdbus/ns// directory shows up inside the namespace as /dev/kdbus/. Every namespace offers a "control" device node to create new buses or namespaces. Namespaces have no connection to each other, cannot see or talk to each other. Only from the initial namespace, given the process has the needed access rights, the device nodes inside of other namespaces can be seen. Bus: A bus is a named object inside a namespace. Clients exchange messages over a bus. Multiple buses themselves have no connection to each other, messages are only exchanged on the same bus. The default entry point to a bus, where clients establish the connection to, is the "bus" device node /dev/kdbus//bus. Common operating system setups create one "system bus" per system, and one "user bus" for every logged-in user. Applications or services can create their own private named buses if they want to. Endpoint: An endpoint provides the device node to talk to a bus. Every bus has a default endpoint called "bus". A bus can offer additional endpoints with custom names to provide a restricted access to the same bus. Custom endpoints can carry additional policy which can be used to give sandboxed processes only a locked-down, limited, filtered access to a bus. Connection: A connection to a bus is created by opening an endpoint device node of a bus, and becoming an active client with the HELLO exchange. Every connected client connection has a unique identifier on the bus, and can address messages to every other connection on the same bus by using the peer's connection id as the destination. Well-known Names: A connection can, in addition to its implicit unique connection id, request the ownership of a textual well-known name. Well-known names are noted in reverse-domain notation like com.example.service. Connections offering a service on a bus are usually reached by its well-known name. The analogy of connection id and well-known name is an IP address and a DNS name associated with that address. =============================================================================== Device Node Layout =============================================================================== /sys/bus/kdbus `-- devices |-- kdbus!0-system!bus -> ../../../devices/virtual/kdbus/kdbus!0-system!bus |-- kdbus!2702-user!bus -> ../../../devices/virtual/kdbus/kdbus!2702-user!bus |-- kdbus!2702-user!ep.app -> ../../../devices/virtual/kdbus/kdbus!2702-user!ep.app `-- kdbus!control -> ../../../devices/kdbus!control /dev/kdbus |-- control |-- 0-system | |-- bus | `-- ep.apache |-- 1000-user | `-- bus |-- 2702-user | |-- bus | `-- ep.app `-- ns |-- fedoracontainer | |-- control | |-- 0-system | | `-- bus | `-- 1000-user | `-- bus `-- mydebiancontainer |-- control `-- 0-system `-- bus Note: The device node subdirectory layout is arranged that a future version of kdbus could be implemented as a filesystem with a separate instance mounted for each namespace. For any future changes, this always needs to be kept in mind. Also the dependency on udev's userspace hookups or sysfs attribute use should be limited for the same reason. =============================================================================== Data Structures =============================================================================== +-------------------------------------------------------------------------+ | Namespace (Init Namespace) | | /dev/kdbus/control | | +---------------------------------------------------------------------+ | | | Bus (System Bus) | | | | ./0-system/control | | | | +-------------------------------+ +-------------------------------+ | | | | | Endpoint | | Endpoint | | | | | | ./bus | | ./ep.sandbox | | | | | | +------------+ +------------+ | | +------------+ +------------+ | | | | | | | Connection | | Connection | | | | Connection | | Connection | | | | | | | | :1.22 | | :1.25 | | | | :1.55 | | :1:81 | | | | | | | +------------+ +------------+ | | +------------+ +------------+ | | | | | +-------------------------------+ +-------------------------------+ | | | +---------------------------------------------------------------------+ | | | | +---------------------------------------------------------------------+ | | | Bus (User Bus for UID 2702) | | | | /dev/kdbus/2702-user/ | | | | +-------------------------------+ +-------------------------------+ | | | | | Endpoint | | Endpoint | | | | | | /dev/kdbus/2702-user/bus | | /dev/kdbus/2702-user/ep.app | | | | | | +------------+ +------------+ | | +------------+ +------------+ | | | | | | | Connection | | Connection | | | | Connection | | Connection | | | | | | | | :1.22 | | :1.25 | | | | :1.55 | | :1:81 | | | | | | | +------------+ +------------+ | | +------------+ +------------+ | | | | | +-------------------------------+ +-------------------------------+ | | | +---------------------------------------------------------------------+ | +-------------------------------------------------------------------------+ | Namespace (Container; inside it, fedoracontainer/ becomes /dev/kdbus/) | | /dev/kdbus/ns/fedoracontainer/control | | +---------------------------------------------------------------------+ | | | Bus | | | | ./0-system/ | | | | +---------------------------------+ | | | | | Endpoint | | | | | | ./bus | | | | | | +-------------+ +-------------+ | | | | | | | Connection | | Connection | | | | | | | | :1.22 | | :1.25 | | | | | | | +-------------+ +-------------+ | | | | | +---------------------------------+ | | | +---------------------------------------------------------------------+ | | | | +---------------------------------------------------------------------+ | | | Bus | | | | /dev/kdbus/2702-user/ | | | | +---------------------------------+ | | | | | Endpoint | | | | | | /dev/kdbus/2702-user/bus | | | | | | +-------------+ +-------------+ | | | | | | | Connection | | Connection | | | | | | | | :1.22 | | :1.25 | | | | | | | +-------------+ +-------------+ | | | | | +---------------------------------+ | | | +---------------------------------------------------------------------+ | +-------------------------------------------------------------------------+ =============================================================================== Creation of new Namespaces and Buses =============================================================================== The initial kdbus namespace is unconditionally created by the kernel module. A namespace contains a "control" device node which allows to create a new bus or namespace. New namespaces do not have any buses created by default. Opening the control device node returns a file descriptor, it accepts the ioctls KDBUS_CMD_BUS_MAKE/KDBUS_CMD_NS_MAKE which specify the name of the new bus or namespace to create. The control file descriptor needs to be kept open for the entire life-time of the created bus or namespace, closing it will immediately cleanup the entire bus or namespace and all its associated resources and connections. Every control file descriptor can only be used once to create a new bus or namespace; from that point, it is not used for any further communication than the final close(). =============================================================================== Connection IDs and Well-Known Connection Names =============================================================================== Connections are identified by their connection id, internally implemented as a uint64_t counter. The IDs of every newly created bus start at 1, and every new connection will increment the counter by 1. The ids are not reused. In higher level tools, the user visible representation of a connection is defined by the D-Bus protocol specification as ":1.". Messages with a specific uint64_t destination id are directly delivered to the connection with the corresponding id. Messages with the special destination id 0xffffffffffffffff are broadcast messages and are potentially delivered to all known connections on the bus; clients interested in broadcast messages need to subscribe to the specific messages they are interested though, before any broadcast message reaches them. Messages synthesized and sent directly by the kernel, will carry the special source id 0. In addition to the unique uint64_t connection id, established connections can request the ownership of well-known names, under which they can be found and addressed by other bus clients. A well-known name is associated with one and only one connection at a time. Messages can specify the special destination id 0 and carry a well-known name in the message data. Such a message is delivered to the destination connection which owns that well-known name. +-------------------------------------------------------------------------+ | +---------------+ +---------------------------+ | | | Connection | | Message | -----------------+ | | | :1.22 | --> | src: 22 | | | | | | | dst: 25 | | | | | | | | | | | | | | | | | | | | +---------------------------+ | | | | | | | | | | <--------------------------------------+ | | | +---------------+ | | | | | | | | +---------------+ +---------------------------+ | | | | | Connection | | Message | -----+ | | | | :1.25 | --> | src: 25 | | | | | | | dst: 0xffffffffffffffff | -------------+ | | | | | | | | | | | | | | | ---------+ | | | | | | +---------------------------+ | | | | | | | | | | | | | | <--------------------------------------------------+ | | +---------------+ | | | | | | | | +---------------+ +---------------------------+ | | | | | Connection | | Message | --+ | | | | | :1.55 | --> | src: 55 | | | | | | | | | dst: 0 / org.foo.bar | | | | | | | | | | | | | | | | | | | | | | | | | | +---------------------------+ | | | | | | | | | | | | | | <------------------------------------------+ | | | +---------------+ | | | | | | | | +---------------+ | | | | | Connection | | | | | | :1.81 | | | | | | org.foo.bar | | | | | | | | | | | | | | | | | | | <-----------------------------------+ | | | | | | | | | | <----------------------------------------------+ | | +---------------+ | +-------------------------------------------------------------------------+ =============================================================================== Message Format, Content, Exchange =============================================================================== Messages consist of fixed-size header followed directly be a list of variable-sized data records. The overall message size is specified in the header of the message. The chain of data records can contain well-defined message metadata fields, raw data, references to data, or file descriptors. Messages are passed to the kernel with the ioctl KDBUS_CMD_MSG_SEND. Depending on the the destination address of the message, the kernel delivers the message to the specific destination connection or to all connections on the same bus. Messages are always queued in the destination connection. Messages are received by the client with the ioctl KDBUS_CMD_MSG_RECV. The endpoint device node of the bus supports poll() to wake up the receiving process when new messages are queued up to be received. +-------------------------------------------------------------------------+ | Message | | +---------------------------------------------------------------------+ | | | Header | | | | size: overall message size, including the data records | | | | destination: connection id of the receiver | | | | source: connection id of the sender (set by kernel) | | | | payload_type: "DBusVer1" textual identifier stored as uint64_t | | | +---------------------------------------------------------------------+ | | +---------------------------------------------------------------------+ | | | Data Record | | | | size: overall record size (without padding) | | | | type: type of data | | | | data: reference to data (address or file descriptor) | | | +---------------------------------------------------------------------+ | | +---------------------------------------------------------------------+ | | | padding bytes to the next 8 byte alignment | | | +---------------------------------------------------------------------+ | | +---------------------------------------------------------------------+ | | | Data Record | | | | size: overall record size (without padding) | | | | ... | | | +---------------------------------------------------------------------+ | | +---------------------------------------------------------------------+ | | | padding bytes to the next 8 byte alignment | | | +---------------------------------------------------------------------+ | | +---------------------------------------------------------------------+ | | | Data Record | | | | size: overall record size | | | | ... | | | +---------------------------------------------------------------------+ | +-------------------------------------------------------------------------+ =============================================================================== Passing of Payload Data =============================================================================== When connecting to the bus, receivers have to register a memory pool, large enough to carry all backlog of data enqueued for the connection. The pool is usually an MAP_ANONYMOUS area created upfront with mmap(). KDBUS_MSG_PAYLOAD_VEC: Messages are directly copied by the sending process into the receiver's pool, that way two peers can exchange data by effectively doing a single-copy from one process to another, the kernel will not buffer the data anywhere else. KDBUS_MSG_PAYLOAD_MEMFD: Messages can reference kdbus_memfd special files which contain the data. Kdbus_memfd files have special semantics, which allow the sealing of the content of the file, sealing prevents all writable access to the file content. Only sealed kdbus_memfd files are accepted as payload data, which enforces reliable passing of data; the receiver can assume that the sender and nobody else can alter the content after the message is sent. Apart from the sender filling-in the content into the kdbus_memfd file, the data will be passed as zero-copy from one process to another, read-only, shared between the peers. The sealing of a kdbus_memfd can be removed again by the sender or the receiver, as soon as the kdbus_memfd is not shared anymore. =============================================================================== Ioctl API =============================================================================== KDBUS_CMD_BUS_MAKE After opening the "control" device node, this command creates a new bus with the specified name. The bus is immediately shut down and cleaned up when the opened "control" device node is closed. KDBUS_CMD_NS_MAKE Similar to KDBUS_CMD_BUS_MAKE, but it creates a new kdbus namespace. KDBUS_CMD_EP_MAKE Creates a new named special endpoint to talk to the bus. Such endpoints usually carry a more restrictive policy and grant restricted access to specific applications. KDBUS_CMD_HELLO By opening the bus device node a connection is created. After a HELLO the opened connection becomes an active peer on the bus. KDBUS_CMD_MSG_SEND Send a message and pass data from userspace to the kernel. KDBUS_CMD_MSG_RECV Receive a message from the kernel which is placed in the receiver's pool. KDBUS_CMD_MSG_RELEASE Release the memory a message occupies and free the area in the pool. KDBUS_CMD_NAME_ACQUIRE Request a well-known bus name to associate with the connection. Well-known names are used to address a peer on the bus. KDBUS_CMD_NAME_RELEASE Release a well-known name the connection currently owns. KDBUS_CMD_NAME_LIST Retrieve the list of all currently registered well-known names. KDBUS_CMD_NAME_QUERY Retrieve properties and the state of a well-known name. KDBUS_CMD_MATCH_ADD Install a match which broadcast messages should be delivered to the connection. KDBUS_CMD_MATCH_REMOVE Remove a current match for broadcast messages. KDBUS_CMD_MONITOR Monitor the bus and receive all transmitted messages. Privileges are required for this operation. KDBUS_CMD_EP_POLICY_SET Set the policy of an endpoint. It is used to restrict the access for endpoints created with KDBUS_CMD_EP_MAKE. KDBUS_CMD_MEMFD_NEW Return a new file descriptor which provides an anonymous shared memory file and which can be used to pass around larger chunks of data. Kdbus memfd files can be sealed, which allows the receiver to trust the data it has received. Kdbus memfd file expose only very limited operations, they can be mmap()ed, seek()ed, (p)read(v)() and (p)write(v)(); most other common file operations are not implemented. Special caution needs to be taken with read(v)()/write(v)() on a shared file; the underlying file position is always shared between all users of the file and race against each other, pread(v)()/pwrite(v)() avoid these issues. KDBUS_CMD_MEMFD_SIZE_GET Return the size of the underlying file, which changes with write(). KDBUS_CMD_MEMFD_SIZE_SET Truncate the underlying file to the specified size. KDBUS_CMD_MEMFD_SEAL_GET Return the state of the file sealing. KDBUS_CMD_MEMFD_SEAL_SET Seal or break a seal of the file. Only files which are not shared with other processes and which are currently not mapped can be sealed. The current process needs to be the one and single owner of the file, the sealing cannot be changed as long as the file is shared. =============================================================================== API Error Codes =============================================================================== E2BIG A message contains too many records or items. EADDRNOTAVAIL A message flagged not to activate a service, addressed a service which is not currently running. EAGAIN No messages are queued at the moment. EBADF File descriptors passed with the message are not valid. EBADFD A bus connection is in a corrupted state. EBADMSG Passed data contains a combination of conflicting or inconsistent types. EBUSY A well-known bus name is already taken by another connection. ECOMM A peer does not accept the file descriptors addressed to it. EDESTADDRREQ The well-known bus name is missing, to address the destination. EDOM The size of data does not match the expectations. Used for the size of the bloom filter bit field. EEXIST A requested namespace, bus or endpoint with the same name already exists. A specific data type, which is only expected once, is provided multiple times. EFAULT The supplied memory could not be accessed, or the data is not properly aligned. EINVAL The provided data does not match its type or other expectations, like a string which is not NUL terminated, or a string length that points behind the first NUL byte in the string. EMEDIUMTYPE A file descriptor which is not a kdbus memfd was refused to send as KDBUS_MSG_PAYLOAD_MEMFD. EMFILE Too many file descriptors have been supplied with a message. EMSGSIZE The supplied data is larger than the allowed maximum size. ENAMETOOLONG The requested name is larger than the allowed maximum size. ENOBUFS There is no space left for the submitted data to fit into the receiver's pool. ENOMEM Out of memory. ENOSYS The requested functionality is not available. ENOTCONN The addressed peer is not an active connection. ENOTSUPP The feature negotiation failed, a not supported feature was requested. ENOTTY An unknown ioctl command was received. ENOTUNIQ A specific data type was addressed to a broadcast address, but only direct addresses support this kind of data. ENXIO A unique address does not exist. EPERM The policy prevented an operation. The requested resource is owned by another entity. ESHUTDOWN The connection is currently shutting down, no further operations are possible. ESRCH A requested well-known bus name is not found. ETXTBSY A kdbus memfd file cannot be sealed or the seal removed, because it is shared with other processes or still mmap()ed. EXFULL The size limits in the pool are reached, no data of the size tried to submit can be queued.