summaryrefslogtreecommitdiff
path: root/kdbus.txt
blob: 9b80cc0508af0384c1a10c2aabfcfff408da20a2 (plain)
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
326
327
328
329
330
331
332
333
334
335
336
337
338
339
340
341
342
343
344
345
346
347
348
349
350
351
352
353
354
355
356
357
358
359
360
361
362
363
364
365
366
367
368
369
370
371
372
373
374
375
376
377
378
379
380
381
382
383
384
385
386
387
388
389
390
391
392
393
394
395
396
397
398
399
400
401
402
403
404
405
406
407
408
409
410
411
412
413
414
415
416
417
418
419
420
421
422
423
424
425
426
427
428
429
430
431
432
433
434
435
436
437
438
439
440
441
442
443
444
445
446
447
448
449
450
451
452
453
454
455
456
457
458
459
460
461
462
463
464
465
466
467
468
469
470
471
472
473
474
475
476
477
478
479
480
481
482
483
484
485
486
487
488
489
490
491
492
493
494
495
496
497
498
499
500
501
502
503
504
505
506
507
508
509
510
511
512
513
514
515
516
517
518
519
520
521
522
523
524
D-Bus is a system for low-latency, low-overhead, easy to use interprocess
communication (IPC).

The focus of this document is an overview of the low-level, native kernel D-Bus
transport called kdbus. Kdbus in the kernel acts similar to a device driver,
all communication between processes take place over special device nodes in
/dev/kdbus/.

For the general D-Bus protocol specification, the payload format, the
marshalling, the communication semantics, please refer to:
  http://dbus.freedesktop.org/doc/dbus-specification.html

For a kdbus specific userspace library implementation please refer to:
  http://cgit.freedesktop.org/systemd/systemd/tree/src/systemd/sd-bus.h
  http://cgit.freedesktop.org/systemd/systemd/tree/src/systemd/sd-memfd.h

===============================================================================
Terminology
===============================================================================
  Namespace:
    A namespace is a named object containing a number of buses. A system
    container which contains its own init system and users usually also
    runs in its own kdbus namespace. The /dev/kdbus/ns/<container-name>/
    directory shows up inside the namespace as /dev/kdbus/. Every namespace
    offers a "control" device node to create new buses or namespaces.
    Namespaces have no connection to each other, cannot see or talk to
    each other. Only from the initial namespace, given the process has the
    needed access rights, the device nodes inside of other namespaces
    can be seen.

  Bus:
    A bus is a named object inside a namespace. Clients exchange messages
    over a bus. Multiple buses themselves have no connection to each other,
    messages are only exchanged on the same bus. The default entry point to a
    bus, where clients establish the connection to, is the "bus" device node
    /dev/kdbus/<bus name>/bus.
    Common operating system setups create one "system bus" per system, and one
    "user bus" for every logged-in user. Applications or services can create
    their own private named buses if they want to.

  Endpoint:
    An endpoint provides the device node to talk to a bus. Every bus has
    a default endpoint called "bus". A bus can offer additional endpoints
    with custom names to provide a restricted access to the same bus. Custom
    endpoints can carry additional policy which can be used to give sandboxed
    processes only a locked-down, limited, filtered access access to a bus.

  Connection:
    A connection to a bus is created by opening an endpoint device node of
    a bus, and becoming an active client with the HELLO exchange. Every
    connected client connection has a unique identifier on the bus, and can
    address messages to every other connection on the same bus by using
    the peers connection id as the destination.

  Well-known Names:
    A connection can, in addition to is implicit unique connection id, request
    the ownership of a textual well-known name. Well-known names are noted
    in revese-domain notation like com.example.service. Connections offering
    a service on a bus are usually reached by its well-known name. The analogy
    of connection id and well-known name is an IP address and a DNS name
    associated with that address.

===============================================================================
Device Node Layout
===============================================================================
  /sys/bus/kdbus
  `-- devices
    |-- kdbus!0-system!bus -> ../../../devices/virtual/kdbus/kdbus!0-system!bus
    |-- kdbus!2702-user!bus -> ../../../devices/virtual/kdbus/kdbus!2702-user!bus
    |-- kdbus!2702-user!ep.app -> ../../../devices/virtual/kdbus/kdbus!2702-user!ep.app
    `-- kdbus!control -> ../../../devices/kdbus!control

  /dev/kdbus
  |-- control
  |-- 0-system
  |   |-- bus
  |   `-- ep.apache
  |-- 1000-user
  |   `-- bus
  |-- 2702-user
  |   |-- bus
  |   `-- ep.app
  `-- ns
      |-- fedoracontainer
      |   |-- control
      |   |-- 0-system
      |   |   `-- bus
      |   `-- 1000-user
      |       `-- bus
      `-- mydebiancontainer
          |-- control
          `-- 0-system
              `-- bus

Note:
  The device node subdirectory layout is arranged that a future version of
  kdbus could be implemented as a filesystem with a separate instance mounted
  for each namespace. For any future changes, this always needs to be kept
  in mind. Also the dependency on udev's userspace hookups or sysfs attribute
  use should be limited for the same reason.

===============================================================================
Data Structures
===============================================================================
  +-------------------------------------------------------------------------+
  | Namespace (Init Namespace)                                              |
  | /dev/kdbus/control                                                      |
  | +---------------------------------------------------------------------+ |
  | | Bus (System Bus)                                                    | |
  | | ./0-system/control                                                  | |
  | | +-------------------------------+ +-------------------------------+ | |
  | | | Endpoint                      | | Endpoint                      | | |
  | | | ./bus                         | | ./ep.sandbox                  | | |
  | | | +------------+ +------------+ | | +------------+ +------------+ | | |
  | | | | Connection | | Connection | | | | Connection | | Connection | | | |
  | | | | :1.22      | | :1.25      | | | | :1.55      | | :1:81      | | | |
  | | | +------------+ +------------+ | | +------------+ +------------+ | | |
  | | +-------------------------------+ +-------------------------------+ | |
  | +---------------------------------------------------------------------+ |
  |                                                                         |
  | +---------------------------------------------------------------------+ |
  | | Bus (User Bus for UID 2702)                                         | |
  | | /dev/kdbus/2702-user/                                               | |
  | | +-------------------------------+ +-------------------------------+ | |
  | | | Endpoint                      | | Endpoint                      | | |
  | | | /dev/kdbus/2702-user/bus      | | /dev/kdbus/2702-user/ep.app   | | |
  | | | +------------+ +------------+ | | +------------+ +------------+ | | |
  | | | | Connection | | Connection | | | | Connection | | Connection | | | |
  | | | | :1.22      | | :1.25      | | | | :1.55      | | :1:81      | | | |
  | | | +------------+ +------------+ | | +------------+ +------------+ | | |
  | | +-------------------------------+ +-------------------------------+ | |
  | +---------------------------------------------------------------------+ |
  +-------------------------------------------------------------------------+
  | Namespace (Container; inside it, fedoracontainer/ becomes /dev/kdbus/)  |
  | /dev/kdbus/ns/fedoracontainer/control                                   |
  | +---------------------------------------------------------------------+ |
  | | Bus                                                                 | |
  | | ./0-system/                                                         | |
  | | +---------------------------------+                                 | |
  | | | Endpoint                        |                                 | |
  | | | ./bus                           |                                 | |
  | | | +-------------+ +-------------+ |                                 | |
  | | | | Connection  | | Connection  | |                                 | |
  | | | | :1.22       | | :1.25       | |                                 | |
  | | | +-------------+ +-------------+ |                                 | |
  | | +---------------------------------+                                 | |
  | +---------------------------------------------------------------------+ |
  |                                                                         |
  | +---------------------------------------------------------------------+ |
  | | Bus                                                                 | |
  | | /dev/kdbus/2702-user/                                               | |
  | | +---------------------------------+                                 | |
  | | | Endpoint                        |                                 | |
  | | | /dev/kdbus/2702-user/bus        |                                 | |
  | | | +-------------+ +-------------+ |                                 | |
  | | | | Connection  | | Connection  | |                                 | |
  | | | | :1.22       | | :1.25       | |                                 | |
  | | | +-------------+ +-------------+ |                                 | |
  | | +---------------------------------+                                 | |
  | +---------------------------------------------------------------------+ |
  +-------------------------------------------------------------------------+

===============================================================================
Creation of new Namespaces and Buses
===============================================================================
The inital kdbus namespace is unconditionally created by the kernel module. A
namespace contains a "control" device node which allows to create a new bus or
namespace. New namespaces do not have any buses created by default.

Opening the control device node returns a file descriptor, it accepts the
ioctls KDBUS_CMD_BUS_MAKE/KDBUS_CMD_NS_MAKE which specify the name of the new
bus or namespace to create. The control file descriptor needs to be kept open
for the entire life-time of the created bus or namespace, closing it will
immediately cleanup the entire bus or namespace and all its associated
resources and connections. Every control file descriptor can only be used once
to create a new bus or namespace; from that point, it is not used for any
further communication than the final close().

===============================================================================
Connection IDs and Well-Known Connection Names
===============================================================================
Connections are identified by their connection id, internally implemented as a
uint64_t counter. The IDs of every newly created bus start at 1, and every new
connection will increment the counter by 1. The ids are not reused.

In higher level tools, the user visible representation of a connection is
defined by the D-Bus protocol specification as ":1.<id>".

Messages with a specific uint64_t destination id are directly delivered to
the connection with the corresponding id. Messages with the special destination
id 0xffffffffffffffff are broadcast messages and are potentially delivered
to all known connections on the bus; clients interested in broadcast messages
need to subscribe to the specific messages they are interested though, before
any broadcast message reaches them.

Messages synthesized and sent directly by the kernel, will carry the special
source id 0.

In addition to the unique uint64_t connection id, established connections can
request the ownership of well-known names, under which they can be found and
addressed by other bus clients. A well-known name is associated with one and
only one connection at a time.

Messages can specify the special destination id 0 and carry a well-known name
in the message data. Such a message is delivered to the destination connection
which owns that well-known name.

  +-------------------------------------------------------------------------+
  | +---------------+     +---------------------------+                     |
  | | Connection    |     | Message                   | -----------------+  |
  | | :1.22         | --> | src: 22                   |                  |  |
  | |               |     | dst: 25                   |                  |  |
  | |               |     |                           |                  |  |
  | |               |     |                           |                  |  |
  | |               |     +---------------------------+                  |  |
  | |               |                                                    |  |
  | |               | <--------------------------------------+           |  |
  | +---------------+                                        |           |  |
  |                                                          |           |  |
  | +---------------+     +---------------------------+      |           |  |
  | | Connection    |     | Message                   | -----+           |  |
  | | :1.25         | --> | src: 25                   |                  |  |
  | |               |     | dst: 0xffffffffffffffff   | -------------+   |  |
  | |               |     |                           |              |   |  |
  | |               |     |                           | ---------+   |   |  |
  | |               |     +---------------------------+          |   |   |  |
  | |               |                                            |   |   |  |
  | |               | <--------------------------------------------------+  |
  | +---------------+                                            |   |      |
  |                                                              |   |      |
  | +---------------+     +---------------------------+          |   |      |
  | | Connection    |     | Message                   | --+      |   |      |
  | | :1.55         | --> | src: 55                   |   |      |   |      |
  | |               |     | dst: 0 / org.foo.bar      |   |      |   |      |
  | |               |     |                           |   |      |   |      |
  | |               |     |                           |   |      |   |      |
  | |               |     +---------------------------+   |      |   |      |
  | |               |                                     |      |   |      |
  | |               | <------------------------------------------+   |      |
  | +---------------+                                     |          |      |
  |                                                       |          |      |
  | +---------------+                                     |          |      |
  | | Connection    |                                     |          |      |
  | | :1.81         |                                     |          |      |
  | | org.foo.bar   |                                     |          |      |
  | |               |                                     |          |      |
  | |               |                                     |          |      |
  | |               | <-----------------------------------+          |      |
  | |               |                                                |      |
  | |               | <----------------------------------------------+      |
  | +---------------+                                                       |
  +-------------------------------------------------------------------------+

===============================================================================
Message Format, Content, Exchange
===============================================================================
Messages consist of fixed-size header followed directly be a list of
variable-sized data records. The overall message size is specified in the
header of the message. The chain of data records can contain well-defined
message metadata fields, raw data, references to data, or file descriptors.

Messages are passed to the kernel with the ioctl KDBUS_CMD_MSG_SEND. Depending
on the the destination address of the message, the kernel delivers the message
to the specific destination connection or to all connections on the same bus.
Messages are always queued in the destination connection.

Messages are received by the client with the ioctl KDBUS_CMD_MSG_RECV. The
endpoint device node of the bus supports poll() to wake up the receiving
process when new messages are queued up to be received.

  +-------------------------------------------------------------------------+
  | Message                                                                 |
  | +---------------------------------------------------------------------+ |
  | | Header                                                              | |
  | | size: overall message size, including the data records              | |
  | | destination: connection id of the receiver                          | |
  | | source: connection id of the sender (set by kernel)                 | |
  | | payload_type: "DBusVer1" textual identifier stored as uint64_t      | |
  | +---------------------------------------------------------------------+ |
  | +---------------------------------------------------------------------+ |
  | | Data Record                                                         | |
  | | size: overall record size (without padding)                         | |
  | | type: type of data                                                  | |
  | | data: reference to data (address or file descriptor)                | |
  | +---------------------------------------------------------------------+ |
  | +---------------------------------------------------------------------+ |
  | | padding bytes to the next 8 byte alignment                          | |
  | +---------------------------------------------------------------------+ |
  | +---------------------------------------------------------------------+ |
  | | Data Record                                                         | |
  | | size: overall record size (without padding)                         | |
  | | ...                                                                 | |
  | +---------------------------------------------------------------------+ |
  | +---------------------------------------------------------------------+ |
  | | padding bytes to the next 8 byte alignment                          | |
  | +---------------------------------------------------------------------+ |
  | +---------------------------------------------------------------------+ |
  | | Data Record                                                         | |
  | | size: overall record size                                           | |
  | | ...                                                                 | |
  | +---------------------------------------------------------------------+ |
  +-------------------------------------------------------------------------+

===============================================================================
Passing of Payload Data
===============================================================================
When connectiong to the bus, receivers have to register a memory pool, large
enough to carry all backlog of data enqueued for the connection. The pool is
usually an MAP_ANONYMOUS area created upfront with mmap().

KDBUS_MSG_PAYLOAD_VEC:
Messages are directly copied by the sending process into the receiver's pool,
that way two peers can exchange data by effectively doing a single-copy from
one process to another, the kernel will not buffer the data anywhere else.

KDBUS_MSG_PAYLOAD_MEMFD:
Messages can reference kdbus_memfd special files which contain the data.
Kdbus_memfd files have special semantics, which allow the sealing of the
content of the file, sealing prevents all writable access to the file content.
Only sealed kdbus_memfd files are accepted as payload data, which enforces
reliable passing of data; the receiver can assume that the sender and nobody
else can alter the content after the message is sent.

Apart from the sender filling-in the content into the kdbus_memfd file, the
data will be passed as zero-copy from one process to another, read-only, shared
between the peers.

The sealing of a kdbus_memfd can be removed again by the sender or the
receiver, as soon as the kdbus_memfd is not shared anymore.

===============================================================================
Ioctl API
===============================================================================
  KDBUS_CMD_BUS_MAKE
    After opening the "control" device node, this commands creates a new bus
    with the specified name. The bus is immediately shut down and cleaned up
    when the opened "control" device node is closed.

  KDBUS_CMD_NS_MAKE
   Similar to KDBUS_CMD_BUS_MAKE, but it creates a new kdbus namespace.

  KDBUS_CMD_EP_MAKE
   Creates a new named special endpoint to talk to the bus. Such endpoints
   usually carry a more restictive policy and grant restricted access to
   specific applications.

  KDBUS_CMD_HELLO
   By opening the bus device node a connection is created. After a HELLO
   the opened connection becomes an active peer on the bus.

  KDBUS_CMD_MSG_SEND
   Send a message and pass data from userspace to the kernel.

  KDBUS_CMD_MSG_RECV
   Receive a message from the kernel which is placed in the receiver's
   pool.

  KDBUS_CMD_MSG_RELEASE
   Release the memory a message occupies and free the area in the pool.

  KDBUS_CMD_NAME_ACQUIRE
   Request a well-know bus name to associate with the connection. Well-known
   names are used to address a peer on the bus.

  KDBUS_CMD_NAME_RELEASE
   Release a well-known name the connection currently owns.

  KDBUS_CMD_NAME_LIST
   Retrieve the list of all currently registered well-known names.

  KDBUS_CMD_NAME_QUERY
   Retrieve properties and the state of a well-known name.

  KDBUS_CMD_MATCH_ADD
   Install a match which broadcast messages should be delivered to the
   connection.

  KDBUS_CMD_MATCH_REMOVE
   Remove a current match for broadcast messages.

  KDBUS_CMD_MONITOR
   Monitor the bus and receive all transmitted messaages. Privileges are
   required for this operation.

  KDBUS_CMD_EP_POLICY_SET
   Set the polic of an endpoint. It is used to restrict the access for
   endpoints created with KDBUS_CMD_EP_MAKE.

  KDBUS_CMD_MEMFD_NEW
   Return a new file descriptor which provides an anonymous shared memory
   file and which can be used to pass around larger chunks of data. Kdbus
   memfd files can be sealed, which allows the receiver to trust the data
   it has received.

   Kdbus memfd file expose only very limited operations, they can be
   mmap()ed, seek()ed, (p)read(v)() and (p)write(v)(); most other common
   file operations are not implemented. Special caution needs to be taken
   with read(v)()/write(v)() on a shared file; the underlying file position
   is always shared between all users of the file and race against each
   other, pread(v)()/pwrite(v)() avoid these issues.

  KDBUS_CMD_MEMFD_SIZE_GET
   Return the size of the underlying file, which changes with write().

  KDBUS_CMD_MEMFD_SIZE_SET
   Truncate the underlying file to the specified size.

  KDBUS_CMD_MEMFD_SEAL_GET
   Return the state of the file sealing.

  KDBUS_CMD_MEMFD_SEAL_SET
   Seal or break a seal of the file. Only files which are not shared with
   other processes and which are currently not mapped can be sealed. The
   current process needs to be the one and single owner of the file, the
   sealing cannot be changed als long as the file is shared

===============================================================================
API Error Codes
===============================================================================
E2BIG
  A message contains too many records or items.

EADDRNOTAVAIL
  A message flagged not to activate a service, addressed a service which is
  not currently running.

EAGAIN
  No messages are queued at the moment.

EBADF
  File descriptors passed with the message are not valid.

EBADFD
  A bus connection is in a corrupted state.

EBADMSG
  Passed data contains a combination of conflicting or inconsistent types.

EBUSY
  A well-know bus name is already taken by another connection.

ECOMM
  A peer does not accept the file descriptors addressed to it.

EDESTADDRREQ
  The well-known bus name is missing, to address the destination.

EDOM
  The size of data does not match the expectations. Used for the
  size of the bloom filter bit field.

EEXIST
  A requested namespace, bus or endpoint with the same name already
  exists.

  A specific data type, which is only expected once, is provided multiple
  times.

EFAULT
  The supplied memory could not be accessed, or the data is not properly
  aligned.

EINVAL
  The provided data does not match its type or other expectations, like a
  string which is not NUL terminated, or a string length that points behind
  the first NUL byte in the string.

EMEDIUMTYPE
  A file descriptor which is not a kdbus memfd was refused to send
  as KDBUS_MSG_PAYLOAD_MEMFD.

EMFILE
  Too many file descriptors have been supplied with a message.

EMSGSIZE
  The supplied data is larger than the allowed maximum size.

ENAMETOOLONG
  The requested name is larger than the allowed maximum size.

ENOBUFS
  There is no space left for the submitted data to fit into the receiver's
  pool.

ENOMEM
  Out of memory.

ENOSYS
  The requested functionality is not available.

ENOTCONN
  The addressed peer is not an active connection.

ENOTSUPP
  The feature negotiation failed, a not supported feature was requested.

ENOTTY
  An unknown ioctl command was received.

ENOTUNIQ
  A specific data type was addressed to a broadcast address, but
  only direct addresses support this kind of data.

ENXIO
  A unique address does not exist.

EPERM
  The policy prevented an operation. The requested resource is owned by
  another entity.

ESHUTDOWN
  The connection is currently shutting down, no further operations are
  possible.

ESRCH
  A requested well-known bus name is not found.

ETXTBSY
  A kdbus memfd file cannot be sealed or the seal removed, because it is
  shared with other processes or still mmap()ed.

EXFULL
  The size limits in the pool are reached, no data of the size tried to
  submit can be queued.