.. _rtio:

Real Time I/O (RTIO)
####################

.. contents::
  :local:
  :depth: 2

.. image:: rings.png
  :width: 800
  :alt: Submissions and Completion Ring Queues

RTIO provides a framework for doing asynchronous operation chains with event
driven I/O. This section covers the RTIO API, queues, executor, iodev,
and common usage patterns with peripheral devices.

RTIO takes a lot of inspiration from Linux's io_uring in its operations and API
as that API matches up well with hardware transfer queues and descriptions such as
DMA transfer lists.

Problem
*******

An application wishing to do complex DMA or interrupt driven operations in
Zephyr today requires direct knowledge of the hardware and how it works. The
DMA API has no understanding of other Zephyr devices and how they relate.

This means doing complex audio, video, or sensor streaming requires direct
hardware knowledge or leaky abstractions over DMA controllers. Neither is ideal.

To enable asynchronous operations, especially with DMA, a description of what
to do rather than direct operations through C and callbacks is needed. Enabling
DMA features such as channels with priority, and sequences of transfers,
requires more than a simple list of descriptions.

Using DMA and/or interrupt driven I/O shouldn't dictate whether or not the
call blocks.

Inspiration, introducing io_uring
*********************************

It's better not to reinvent the wheel (or ring in this case), and io_uring, an
API from the Linux kernel, provides a winning model. In io_uring there are two
lock-free ring buffers acting as queues shared between the kernel and a userland
application: one queue for submission entries, which may be chained and flushed
to create concurrent sequential requests, and a second queue for completion
queue events. Only a single syscall, io_uring_submit, is actually required to
execute many operations. This call may block the caller when a number of
operations to wait on is given.

This model maps well to DMA and interrupt driven transfers. A request to do a
sequence of operations in an asynchronous way directly relates to the way
hardware typically works, with interrupt driven state machines potentially
involving multiple peripheral IPs like bus and DMA controllers.

Submission Queue
****************

The submission queue (sq) is the description of the operations
to perform in concurrent chains.

For example, imagine a typical SPI transfer where you wish to write a
register address and then read from it. The sequence of operations might be:

   1. Chip Select
   2. Clock Enable
   3. Write register address into SPI transmit register
   4. Read from the SPI receive register into a buffer
   5. Disable clock
   6. Disable Chip Select

If anything in this chain of operations fails, give up. Some of those operations
can be embodied in a device abstraction that understands a read or write
implicitly means setting up the clock and chip select. The transactional nature
of the request also needs to be embodied in some manner. Of the operations
above, perhaps the read could be done using DMA, as it's large enough to make
sense. That requires an understanding of how to set up the device's particular
DMA to do so.

The above sequence of operations is embodied in RTIO as a chain of
submission queue entries (sqe). Chaining is done by setting a bitflag in
an sqe to signify the next sqe must wait on the current one.

Because the chip select and clocking are common to a particular SPI controller
and device on the bus, they are embodied in what RTIO calls an iodev.

Multiple operations against the same iodev are done in the order provided as
soon as possible. If two operation chains access the same device at different
points, it's possible one chain will have to wait for another to complete.
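
As a minimal sketch (assuming a hypothetical iodev named ``spi_iodev`` wrapping
the SPI device), such a write-then-read chain might be built and submitted like
so:

.. code-block:: C

  #include <zephyr/rtio/rtio.h>

  RTIO_DEFINE(spi_rtio, 4, 4); /* 4 submission and 4 completion queue entries */

  extern struct rtio_iodev spi_iodev; /* assumed iodev for the SPI device */

  int read_reg(uint8_t reg_addr, uint8_t *val)
  {
    struct rtio_sqe *wr_sqe = rtio_sqe_acquire(&spi_rtio);
    struct rtio_sqe *rd_sqe = rtio_sqe_acquire(&spi_rtio);

    /* Write the one-byte register address; the iodev implicitly handles
     * chip select and clocking
     */
    rtio_sqe_prep_tiny_write(wr_sqe, &spi_iodev, RTIO_PRIO_NORM, &reg_addr, 1, NULL);

    /* Chained: the read waits on the write and a failure cascades */
    wr_sqe->flags |= RTIO_SQE_CHAINED;

    rtio_sqe_prep_read(rd_sqe, &spi_iodev, RTIO_PRIO_NORM, val, 1, NULL);

    /* Submit both entries, blocking until two completions arrive */
    return rtio_submit(&spi_rtio, 2);
  }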

Completion Queue
****************

In order to know when an sqe has completed, there is a completion
queue (cq) with completion queue events (cqe). An sqe, once completed, results in
a cqe being pushed into the cq. The ordering of cqe may not match the order of
sqe. A chain of sqe will however ensure ordering and failure cascading.

Other potential schemes are possible, but a completion queue is a well-trodden
idea from io_uring and other similar operating system APIs.
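
Continuing the sketch above, the two completions might be consumed as follows,
where :c:func:`rtio_cqe_consume_block` blocks until a cqe is available:

.. code-block:: C

  struct rtio_cqe *cqe = rtio_cqe_consume_block(&spi_rtio);

  if (cqe->result < 0) {
    LOG_ERR("operation failed: %d", cqe->result);
  }

  /* The userdata given at prep time identifies which sqe completed */
  void *userdata = cqe->userdata;

  /* Return the cqe for reuse by the completion queue */
  rtio_cqe_release(&spi_rtio, cqe);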

Executor
********

The RTIO executor is a low overhead concurrent I/O task scheduler. It ensures
certain request flags provide the expected behavior. It takes a list of
submissions, working through them in order. Various flags allow for changing the
behavior of how submissions are worked through. Flags to form in-order chains of
submissions, transactional sets of submissions, or multi-shot
(continuously producing) requests are all possible!
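
As a rough illustration, each of these behaviors is requested by setting flag
bits on an sqe before submission:

.. code-block:: C

  /* The next sqe starts only after this one completes; failures cascade */
  sqe->flags |= RTIO_SQE_CHAINED;

  /* This sqe and the next are given to the iodev as one transactional unit */
  sqe->flags |= RTIO_SQE_TRANSACTION;

  /* This sqe is resubmitted on completion, continuously producing cqes */
  sqe->flags |= RTIO_SQE_MULTISHOT;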

IO Device
*********

Turning submission queue entries (sqe) into completion queue events (cqe) is the
job of objects implementing the iodev (IO device) API. This API accepts requests
in the form of the iodev submit API call. It is the io device's job to work
through its internal queue of submissions and convert them into completions. In
effect, every io device can be viewed as an independent, event-driven, actor-like
object that accepts a never-ending queue of I/O like requests. How the iodev
does this work is up to the author of the iodev; perhaps the entire queue of
operations can be converted to a set of DMA transfer descriptors, meaning the
hardware does almost all of the real work.
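
A bare-bones iodev sketch might look like the following, with the
hardware-specific transfer start and completion paths elided:

.. code-block:: C

  #include <errno.h>
  #include <zephyr/rtio/rtio.h>

  static void my_iodev_submit(struct rtio_iodev_sqe *iodev_sqe)
  {
    const struct rtio_sqe *sqe = &iodev_sqe->sqe;

    switch (sqe->op) {
    case RTIO_OP_RX:
      /* Start the hardware transfer here; when it finishes (e.g. in an
       * ISR) report the outcome with rtio_iodev_sqe_ok(iodev_sqe, n) or
       * rtio_iodev_sqe_err(iodev_sqe, err)
       */
      break;
    default:
      rtio_iodev_sqe_err(iodev_sqe, -ENOTSUP);
      break;
    }
  }

  static const struct rtio_iodev_api my_iodev_api = {
    .submit = my_iodev_submit,
  };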

Cancellation
************

Canceling an already queued operation is possible but not guaranteed. If the
SQE has not yet started, it's likely that a call to :c:func:`rtio_sqe_cancel`
will remove the SQE and never run it. If, however, the SQE already started
running, the cancel request will be ignored.
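
A sketch of best-effort cancellation, assuming an RTIO context named
``rtio_context``; the handle needed for :c:func:`rtio_sqe_cancel` is obtained
when copying the sqe into the queue:

.. code-block:: C

  struct rtio_sqe sqe;
  struct rtio_sqe *handle;

  rtio_sqe_prep_nop(&sqe, NULL, NULL);

  /* Copy the sqe into the queue, getting back a handle to it */
  rtio_sqe_copy_in_get_handles(&rtio_context, &sqe, &handle, 1);
  rtio_submit(&rtio_context, 0);

  /* Best effort: only succeeds if the sqe hasn't started executing */
  rtio_sqe_cancel(handle);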

Memory pools
************

In some cases, requests to read may not know how much data will be produced.
Alternatively, a reader might be handling data from multiple io devices where
the frequency of the data is unpredictable. In these cases it may be wasteful
to bind memory to in-flight read requests. Instead, with memory pools, the
memory to read into is allocated by the iodev from a memory pool associated
with the RTIO context the read was submitted to. To create such an RTIO
context, the :c:macro:`RTIO_DEFINE_WITH_MEMPOOL` macro can be used. It allows
creating an RTIO context with a dedicated pool of "memory blocks" which can be
consumed by the iodev. Below is a snippet setting up an RTIO context with a
memory pool. The memory pool has 128 blocks, each block is 16 bytes, and the
data is 4-byte aligned.

.. code-block:: C

  #include <zephyr/rtio/rtio.h>

  #define SQ_SIZE       4
  #define CQ_SIZE       4
  #define MEM_BLK_COUNT 128
  #define MEM_BLK_SIZE  16
  #define MEM_BLK_ALIGN 4

  RTIO_DEFINE_WITH_MEMPOOL(rtio_context,
      SQ_SIZE, CQ_SIZE, MEM_BLK_COUNT, MEM_BLK_SIZE, MEM_BLK_ALIGN);

When a read is needed, the caller simply replaces the call to
:c:func:`rtio_sqe_prep_read` (which takes a pointer to a buffer and a length)
with a call to :c:func:`rtio_sqe_prep_read_with_pool`. The iodev requires
only a small change, which works with both pre-allocated data buffers and
the mempool. When the read is ready, instead of getting the buffers directly
from the :c:struct:`rtio_iodev_sqe`, the iodev should get the buffer and count
by calling :c:func:`rtio_sqe_rx_buf` like so:

.. code-block:: C

  uint8_t *buf;
  uint32_t buf_len;
  int rc = rtio_sqe_rx_buf(iodev_sqe, MIN_BUF_LEN, DESIRED_BUF_LEN, &buf, &buf_len);

  if (rc != 0) {
    LOG_ERR("Failed to get buffer of at least %u bytes", MIN_BUF_LEN);
    return;
  }

Finally, the consumer will be able to access the allocated buffer via
:c:func:`rtio_cqe_get_mempool_buffer`.

.. code-block:: C

  uint8_t *buf;
  uint32_t buf_len;
  int rc = rtio_cqe_get_mempool_buffer(&rtio_context, &cqe, &buf, &buf_len);

  if (rc != 0) {
    LOG_ERR("Failed to get mempool buffer");
    return rc;
  }

  /* Release the cqe events (note that the buffer is not released yet) */
  rtio_cqe_release_all(&rtio_context);

  /* Do something with the memory */

  /* Release the mempool buffer */
  rtio_release_buffer(&rtio_context, buf, buf_len);

When to Use
***********

RTIO is useful in cases where concurrent or batch-like I/O flows are needed.

From the driver/hardware perspective, the API enables batching of I/O requests,
potentially in an optimal way. Many requests to the same SPI peripheral, for
example, might be translated entirely to hardware command queues or DMA transfer
descriptors, meaning the hardware can potentially do more than ever.

There is a small cost to each RTIO context and iodev. This cost could be weighed
against using a thread for each concurrent I/O operation, or custom queues and
threads per peripheral. RTIO is much lower cost than that.

Supported Buses
***************

To check if your bus supports RTIO natively, check the driver API
implementation: if the driver implements the ``iodev_submit`` function of the
bus API, then RTIO is supported. If the driver doesn't support the RTIO APIs,
it will set the submit function to ``i2c_iodev_submit_fallback``.

I2C buses have a default implementation which allows apps to leverage the RTIO
work queue while vendors implement the submit function. With this queue, any
I2C bus driver that does not implement the ``iodev_submit`` function will defer
to a work item which performs a blocking I2C transaction. To change the pool
size, set a different value to :kconfig:option:`CONFIG_RTIO_WORKQ_POOL_ITEMS`.

API Reference
*************

.. doxygengroup:: rtio
