.. _net_pkt_interface:

Packet Management
#################

.. contents::
    :local:
    :depth: 2

Overview
********

Network packets are the main data the networking stack manipulates.
Such data is represented through the net_pkt structure, which provides
a means to hold the packet, to write and read it, and to carry the
metadata the core needs. Such an object is called net_pkt in this
document.

The data structure and the whole API around it are defined in
:zephyr_file:`include/zephyr/net/net_pkt.h`.

Architectural notes
===================

There are two network packet flows within the stack: **TX** for the
transmission path and **RX** for the reception path. In both paths,
each net_pkt is written and read from the beginning to the end, or
more specifically from the headers to the payload.


Memory management
*****************

Allocation
==========

All net_pkt objects come from a pre-defined pool of struct net_pkt.
Such a pool is defined via:

.. code-block:: c

    NET_PKT_SLAB_DEFINE(name, count)

Note, however, that one will rarely have to use it, as the core already
provides two pools, one for the TX path and one for the RX path.

Allocating a raw net_pkt can be done through:

.. code-block:: c

    pkt = net_pkt_alloc(timeout);

However, by its nature, a raw net_pkt is useless without a buffer, and
it needs various metadata to become relevant as well. At the very
least, it requires the network interface it is meant to be sent
through or through which it was received. As this is a very common
operation, a helper exists:

.. code-block:: c

    pkt = net_pkt_alloc_on_iface(iface, timeout);

A more complete allocator exists, where both the net_pkt and its buffer
can be allocated at once:

.. code-block:: c

    pkt = net_pkt_alloc_with_buffer(iface, size, family, proto, timeout);

See below how the buffer is allocated.


Buffer allocation
=================

The net_pkt object does not define its own buffer, but instead uses an
existing object for this: :c:struct:`net_buf`. (See
:ref:`net_buf_interface` for more information.) However, it mostly
hides the usage of such a buffer because net_pkt brings network
awareness to buffer allocation and, as we will see later, to its
operation too.

To allocate a buffer, a net_pkt needs to have at least its network
interface set. This works even if the family of the packet is unknown
at the time of buffer allocation. Then one could do:

.. code-block:: c

    net_pkt_alloc_buffer(pkt, size, proto, timeout);

Here, proto can be 0 if unknown (there is no IPPROTO_UNSPEC).

As seen previously, the net_pkt and its buffer can be allocated at
once via :c:func:`net_pkt_alloc_with_buffer`. It is actually the most
widely used allocator.

The network interface, the family, and the protocol of the packet are
used by the buffer allocation to determine if the requested size can
be allocated. The allocator uses the network interface to know the
MTU, and then the family and protocol to estimate the header space
(only if these two are specified). If the whole fits within the MTU,
the allocated space will be the requested size plus, where applicable,
the header space. If there is insufficient MTU space, the requested
size will be shrunk so that the header space and the new size fit
within the MTU.

For instance, on an Ethernet network interface with an MTU of 1500
bytes:

.. code-block:: c

    pkt = net_pkt_alloc_with_buffer(iface, 800, AF_INET, IPPROTO_UDP, K_FOREVER);

will successfully allocate 800 + 20 + 8 bytes of buffer for the new
net_pkt, whereas:

.. code-block:: c

    pkt = net_pkt_alloc_with_buffer(iface, 1600, AF_INET, IPPROTO_UDP, K_FOREVER);

will successfully allocate 1500 bytes, of which 20 + 8 bytes (IPv4 +
UDP headers) are not available for the payload.

On the receiving side, when the family and protocol are not known:

.. code-block:: c

    pkt = net_pkt_rx_alloc_with_buffer(iface, 800, AF_UNSPEC, 0, K_FOREVER);

will allocate 800 bytes and no extra header space, whereas:

.. code-block:: c

    pkt = net_pkt_rx_alloc_with_buffer(iface, 1600, AF_UNSPEC, 0, K_FOREVER);

will allocate 1514 bytes, the MTU + Ethernet header space.

One can increase the amount of buffer space allocated by calling
:c:func:`net_pkt_alloc_buffer`, as it takes the existing buffer into
account. It also accounts for the header space if the net_pkt's
family is a valid one, as well as for the proto parameter. In
that case, the newly allocated buffer space is appended to the
existing one, not inserted at the front. Note, however, that such a
use case is rather limited. Usually, one should know from the start
how much space to request.


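The sizing rules of this section can be summarised by a small
stand-alone model. The sketch below is not the allocator's code:
``model_alloc_size`` and ``model_payload_size`` are hypothetical
helpers, and the header sizes are simply the ones quoted in the
examples above (20 bytes for IPv4, 8 for UDP, 14 for Ethernet). It
only illustrates the arithmetic, assuming the header space is added to
the requested size and the result is capped by a limit derived from
the MTU.

.. code-block:: c

    #include <stddef.h>

    /* Header sizes quoted in the examples of this section. */
    #define IPV4_HDR_LEN 20U
    #define UDP_HDR_LEN   8U
    #define ETH_HDR_LEN  14U

    /*
     * Hypothetical model, not the stack's code: size of the buffer
     * obtained when `requested` payload bytes are asked for.
     *   hdr_len: header space implied by family/proto (0 if unspecified)
     *   limit:   upper bound for the whole buffer (the MTU, or the MTU
     *            plus the Ethernet header when the family is unspecified)
     */
    static inline size_t model_alloc_size(size_t requested, size_t hdr_len,
                                          size_t limit)
    {
        size_t wanted = requested + hdr_len;

        return (wanted < limit) ? wanted : limit;
    }

    /* Payload bytes left once the header space has been set aside. */
    static inline size_t model_payload_size(size_t allocated, size_t hdr_len)
    {
        return (allocated > hdr_len) ? allocated - hdr_len : 0;
    }

With an MTU of 1500 bytes,
``model_alloc_size(800, IPV4_HDR_LEN + UDP_HDR_LEN, 1500)`` yields 828
and ``model_alloc_size(1600, IPV4_HDR_LEN + UDP_HDR_LEN, 1500)`` yields
1500, of which ``model_payload_size`` reports 1472 bytes usable for
payload.
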
Deallocation
============

Each net_pkt is reference counted. At allocation, the reference count
is set to 1. The reference count can be incremented with
:c:func:`net_pkt_ref` or decremented with
:c:func:`net_pkt_unref`. When the count drops to zero, the buffer is
also un-referenced and the net_pkt is automatically placed back into
the free net_pkt slabs.

If the net_pkt's buffer is needed even after net_pkt deallocation, one
needs to take an additional reference on the whole chain of net_buf
before the last call to :c:func:`net_pkt_unref`. See
:ref:`net_buf_interface` for more information.


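The reference counting rule can be pictured with a minimal model. The
type ``model_refpkt`` and its helpers below are hypothetical and only
mirror the behaviour described in this section: one reference at
allocation, and release when the last reference is dropped. The
counter here is a plain integer for readability; code shared between
threads would need atomic operations instead.

.. code-block:: c

    #include <stdbool.h>

    /* Hypothetical stand-in for a reference-counted packet. */
    struct model_refpkt {
        int ref;      /* current reference count */
        bool in_pool; /* true once the object is back in its pool */
    };

    /* Allocation hands out the object with a single reference. */
    static inline void model_refpkt_init(struct model_refpkt *pkt)
    {
        pkt->ref = 1;
        pkt->in_pool = false;
    }

    /* Take one more reference. */
    static inline void model_refpkt_ref(struct model_refpkt *pkt)
    {
        pkt->ref++;
    }

    /* Drop a reference; dropping the last one releases the object. */
    static inline void model_refpkt_unref(struct model_refpkt *pkt)
    {
        if (pkt->ref > 0 && --pkt->ref == 0) {
            pkt->in_pool = true;
        }
    }

A holder that calls ``model_refpkt_ref`` keeps the object alive across
someone else's ``model_refpkt_unref``; only the final unref marks it as
returned to the pool.
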
Operations
**********

There are two ways to access the net_pkt buffer, explained in the
following sections: basic read/write access and data access, the
latter being the preferred way.

Read and Write access
=====================

As said earlier, though net_pkt uses net_buf for its buffer, it
provides its own API to access it. Indeed, a network packet might be
scattered over a chain of net_buf objects, and the functions provided
by net_buf are of limited use in such a case. Instead, net_pkt provides
functions which hide all the complexity of potentially non-contiguous
access.

Data movement into the buffer is made through a cursor maintained
within each net_pkt. All read/write operations affect this
cursor. Note as well that the read and write functions are strict about
their length parameter: if they cannot read or write the given length,
they fail. Length is not interpreted as an upper limit; it is instead
the exact amount of data that must be read or written.

As there are two paths, TX and RX, there are two access modes: write
and overwrite. This might sound a bit unusual, but it is in fact simple
and provides flexibility.

In write mode, whatever is written in the buffer affects the length of
actual data present in the buffer. Buffer length should not be
confused with the buffer size, which is a limit that no mode can
exceed. In overwrite mode, whatever is written must land on valid data
and does not affect the buffer length. By default, a newly allocated
net_pkt is in write mode, and its cursor points to the beginning of
its buffer.

Let's see now, step by step, the functions and how they behave
depending on the mode.

When freshly allocated with a buffer of 500 bytes, a net_pkt has 0
length, which means no valid data is in its buffer. One could verify
this by:

.. code-block:: c

    len = net_pkt_get_len(pkt);

Now, let's write 8 bytes:

.. code-block:: c

    net_pkt_write(pkt, data, 8);

The buffer length is now 8 bytes.
There are various helpers to write a single byte or a big-endian
uint16_t or uint32_t. They take the value itself, not a pointer to it:

.. code-block:: c

    net_pkt_write_u8(pkt, foo);
    net_pkt_write_be16(pkt, ba);
    net_pkt_write_be32(pkt, bar);

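To make "big endian" concrete, here is a stand-alone sketch of the
byte layout such helpers produce, with the most significant byte
first. The functions ``model_put_be16`` and ``model_put_be32`` are
hypothetical and operate on a plain byte array; they are not part of
the net_pkt API.

.. code-block:: c

    #include <stddef.h>
    #include <stdint.h>

    /* Store a 16-bit value, most significant byte first. */
    static inline size_t model_put_be16(uint8_t *dst, uint16_t val)
    {
        dst[0] = (uint8_t)(val >> 8);
        dst[1] = (uint8_t)(val & 0xff);
        return 2; /* bytes written */
    }

    /* Store a 32-bit value, most significant byte first. */
    static inline size_t model_put_be32(uint8_t *dst, uint32_t val)
    {
        dst[0] = (uint8_t)(val >> 24);
        dst[1] = (uint8_t)((val >> 16) & 0xff);
        dst[2] = (uint8_t)((val >> 8) & 0xff);
        dst[3] = (uint8_t)(val & 0xff);
        return 4; /* bytes written */
    }

The result is the same whatever the byte order of the CPU, which is
why the value is passed directly rather than copied from memory.
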
After these writes, the net_pkt's length is 15 bytes (8 + 1 + 2 + 4).
But if we try to read at this point, it will fail because there is
nothing to read at the current cursor position. It is possible, while
in write mode, to read what has already been written by resetting the
cursor of the net_pkt. For instance:

.. code-block:: c

    net_pkt_cursor_init(pkt);
    net_pkt_read(pkt, data, 15);

This resets the cursor of the pkt to the beginning of the buffer
and then reads the 15 bytes actually present. The cursor then
points again at the end of the valid data.

To set an area to the same byte value, a memset function is provided:

.. code-block:: c

    net_pkt_memset(pkt, 0, 5);

Our net_pkt now has a length of 20 bytes.

Switching between modes is done with the
:c:func:`net_pkt_set_overwrite` function. It is possible to switch
mode back and forth at any time. The following sets the net_pkt to
overwrite mode and resets its cursor:

.. code-block:: c

    net_pkt_set_overwrite(pkt, true);
    net_pkt_cursor_init(pkt);

Now the same functions can be used, but they are limited to the
existing data in the buffer, i.e. 20 bytes.

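The walk-through above can be replayed on a small stand-alone model.
The sketch below is a simplification under stated assumptions: it uses
one flat 500-byte array instead of a chain of net_buf, and
``model_pkt``, ``model_write`` and ``model_read`` are hypothetical
names, not the net_pkt API. It only captures the rules described in
this section: strict lengths, a length that grows in write mode, and
an overwrite mode confined to valid data. Corner cases of the real
functions may differ.

.. code-block:: c

    #include <stdbool.h>
    #include <stddef.h>
    #include <stdint.h>
    #include <string.h>

    #define MODEL_BUF_SIZE 500

    /* Hypothetical flat-buffer model; not the real struct net_pkt. */
    struct model_pkt {
        uint8_t buf[MODEL_BUF_SIZE]; /* buffer size: hard limit in any mode */
        size_t len;                  /* amount of valid data */
        size_t cursor;               /* current read/write position */
        bool overwrite;              /* false: write mode, true: overwrite */
    };

    /* Fresh packet: no valid data, write mode, cursor at the start. */
    static inline void model_init(struct model_pkt *pkt)
    {
        memset(pkt, 0, sizeof(*pkt));
    }

    static inline void model_cursor_init(struct model_pkt *pkt)
    {
        pkt->cursor = 0;
    }

    /* Strict write: all `n` bytes are written, or none and -1 is returned. */
    static inline int model_write(struct model_pkt *pkt, const void *data,
                                  size_t n)
    {
        size_t limit = pkt->overwrite ? pkt->len : MODEL_BUF_SIZE;

        if (n > limit - pkt->cursor) {
            return -1;
        }
        memcpy(&pkt->buf[pkt->cursor], data, n);
        pkt->cursor += n;
        if (!pkt->overwrite && pkt->cursor > pkt->len) {
            pkt->len = pkt->cursor; /* only write mode extends the length */
        }
        return 0;
    }

    /* Strict read: fails unless `n` valid bytes exist at the cursor. */
    static inline int model_read(struct model_pkt *pkt, void *data, size_t n)
    {
        if (n > pkt->len - pkt->cursor) {
            return -1;
        }
        memcpy(data, &pkt->buf[pkt->cursor], n);
        pkt->cursor += n;
        return 0;
    }

Writing 8, 1, 2 and 4 bytes into a fresh ``model_pkt`` brings ``len``
to 15; a read at that point fails until ``model_cursor_init`` is
called. Once ``overwrite`` is set and the cursor reset, writes succeed
only within the valid data and leave ``len`` unchanged.
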
If it is necessary to know how much space is available in the net_pkt,
call:

.. code-block:: c

    net_pkt_available_buffer(pkt);

Or, if header space needs to be accounted for, call:

.. code-block:: c

    net_pkt_available_payload_buffer(pkt, proto);

To place the cursor at a known position, use the function
:c:func:`net_pkt_skip`. For example, to go after the IP header, use:

.. code-block:: c

    net_pkt_cursor_init(pkt);
    net_pkt_skip(pkt, net_pkt_ip_hdr_len(pkt));


Data access
===========

Though the API shown previously is rather simple, it always involves
copying data to and from the net_pkt buffer. On many occasions, it
is more relevant to access the information stored in the buffer
directly as a contiguous area, especially with network packets which
embed headers.

These headers are, most of the time, a known fixed set of bytes. It is
then more natural to have a structure representing a certain type of
header. In addition, if it is known that the whole header sits
in a contiguous area of the buffer, it is far more efficient to
cast the current position in the buffer to the type of the header.
Whether reading or writing the fields of such a header, accessing it
directly saves memory.

net_pkt comes with a dedicated API for this, built on top of the
previously described API. It is able to handle both contiguous and
non-contiguous access transparently.

There are two macros used to define a data access descriptor:
:c:macro:`NET_PKT_DATA_ACCESS_DEFINE` when it is not possible to
tell if the data will be in a contiguous area, and
:c:macro:`NET_PKT_DATA_ACCESS_CONTIGUOUS_DEFINE` when
it is guaranteed the data is in a contiguous area.

Let's take the example of IP and UDP. Both IPv4 and IPv6 headers are
always found at the beginning of the packet and are small enough to
fit in a net_buf of 128 bytes (for instance, though 64 bytes could be
chosen).

.. code-block:: c

    NET_PKT_DATA_ACCESS_CONTIGUOUS_DEFINE(ipv4_access, struct net_ipv4_hdr);
    struct net_ipv4_hdr *ipv4_hdr;

    ipv4_hdr = (struct net_ipv4_hdr *)net_pkt_get_data(pkt, &ipv4_access);

It would be the same for struct net_ipv6_hdr. A UDP header, however,
is likely not to be in a contiguous area, in IPv6 for instance, so:

.. code-block:: c

    NET_PKT_DATA_ACCESS_DEFINE(udp_access, struct net_udp_hdr);
    struct net_udp_hdr *udp_hdr;

    udp_hdr = (struct net_udp_hdr *)net_pkt_get_data(pkt, &udp_access);

At this point, the cursor of the net_pkt points at the beginning of
the requested data. On the RX path, these headers are read but not
modified, so to proceed further the cursor needs to advance past the
data. There is a dedicated function for this:

.. code-block:: c

    net_pkt_acknowledge_data(pkt, &ipv4_access);

On the TX path, however, the header fields have been modified. In such
a case:

.. code-block:: c

    net_pkt_set_data(pkt, &ipv4_access);

If the data is in a contiguous area, this advances the cursor
accordingly. If not, it writes the data and updates the cursor.
Note that :c:func:`net_pkt_set_data` could be used in the RX
path as well, but it is slightly faster to use
:c:func:`net_pkt_acknowledge_data`, as the latter does not care about
contiguity at all: it just advances the cursor via
:c:func:`net_pkt_skip` directly.


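The get/set/acknowledge pattern can be illustrated with a stand-alone
model that follows the description above. Everything in the sketch is
hypothetical (``model_fragpkt``, ``model_access``, the 8-byte
fragments, and the helper names); it is not how the stack implements
these functions. It assumes a packet made of two fixed-size fragments:
data that fits in one fragment is accessed in place, while data that
straddles two fragments goes through the scratch area of the
descriptor.

.. code-block:: c

    #include <stddef.h>
    #include <stdint.h>
    #include <string.h>

    #define FRAG_SIZE  8
    #define FRAG_COUNT 2

    /* Hypothetical packet: data scattered over fixed-size fragments. */
    struct model_fragpkt {
        uint8_t frag[FRAG_COUNT][FRAG_SIZE];
        size_t cursor; /* linear offset over all fragments */
    };

    /* Hypothetical access descriptor: scratch storage plus data size. */
    struct model_access {
        void *data;  /* scratch area, used only for non-contiguous data */
        size_t size;
    };

    static inline int model_is_contiguous(const struct model_fragpkt *pkt,
                                          size_t size)
    {
        return (pkt->cursor / FRAG_SIZE) ==
               ((pkt->cursor + size - 1) / FRAG_SIZE);
    }

    /* Copy `n` bytes between the packet (at offset `off`) and `mem`. */
    static inline void model_copy(struct model_fragpkt *pkt, size_t off,
                                  uint8_t *mem, size_t n, int to_pkt)
    {
        for (size_t i = 0; i < n; i++, off++) {
            uint8_t *p = &pkt->frag[off / FRAG_SIZE][off % FRAG_SIZE];

            if (to_pkt) {
                *p = mem[i];
            } else {
                mem[i] = *p;
            }
        }
    }

    /* Return a pointer to the data at the cursor; the cursor stays put. */
    static inline void *model_get_data(struct model_fragpkt *pkt,
                                       struct model_access *a)
    {
        if (a->size == 0 || pkt->cursor + a->size > FRAG_COUNT * FRAG_SIZE) {
            return NULL;
        }
        if (model_is_contiguous(pkt, a->size)) {
            return &pkt->frag[pkt->cursor / FRAG_SIZE][pkt->cursor % FRAG_SIZE];
        }
        model_copy(pkt, pkt->cursor, a->data, a->size, 0);
        return a->data;
    }

    /* Commit after a successful get: write back the scratch copy if used. */
    static inline void model_set_data(struct model_fragpkt *pkt,
                                      struct model_access *a)
    {
        if (!model_is_contiguous(pkt, a->size)) {
            model_copy(pkt, pkt->cursor, a->data, a->size, 1);
        }
        pkt->cursor += a->size;
    }

    /* Read-only case: just step over the data. */
    static inline void model_acknowledge_data(struct model_fragpkt *pkt,
                                              struct model_access *a)
    {
        pkt->cursor += a->size;
    }

With a 4-byte descriptor, a get at offset 0 returns a pointer into the
first fragment, whereas a get at offset 6 returns the scratch area
because the data spans both fragments; in that second case
``model_set_data`` is what copies the modified bytes back into the
packet.
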
API Reference
*************

.. doxygengroup:: net_pkt