1.. _workqueues_v2:
2
3Workqueue Threads
4#################
5
6.. contents::
7    :local:
8    :depth: 1
9
10A :dfn:`workqueue` is a kernel object that uses a dedicated thread to process
11work items in a first in, first out manner. Each work item is processed by
12calling the function specified by the work item. A workqueue is typically
13used by an ISR or a high-priority thread to offload non-urgent processing
14to a lower-priority thread so it does not impact time-sensitive processing.
15
16Any number of workqueues can be defined (limited only by available RAM). Each
17workqueue is referenced by its memory address.
18
19A workqueue has the following key properties:
20
21* A **queue** of work items that have been added, but not yet processed.
22
23* A **thread** that processes the work items in the queue. The priority of the
24  thread is configurable, allowing it to be either cooperative or preemptive
25  as required.
26
Regardless of its priority, the workqueue thread yields between submitted
work items to prevent a cooperative workqueue from starving other threads.
30
31A workqueue must be initialized before it can be used. This sets its queue to
32empty and spawns the workqueue's thread.  The thread runs forever, but sleeps
33when no work items are available.
34
35.. note::
36   The behavior described here is changed from the Zephyr workqueue
37   implementation used prior to release 2.6.  Among the changes are:
38
39   * Precise tracking of the status of cancelled work items, so that the
40     caller need not be concerned that an item may be processing when the
41     cancellation returns.  Checking of return values on cancellation is still
42     required.
43   * Direct submission of delayable work items to the queue with
44     :c:macro:`K_NO_WAIT` rather than always going through the timeout API,
45     which could introduce delays.
46   * The ability to wait until a work item has completed or a queue has been
47     drained.
48   * Finer control of behavior when scheduling a delayable work item,
49     specifically allowing a previous deadline to remain unchanged when a work
50     item is scheduled again.
51   * Safe handling of work item resubmission when the item is being processed
52     on another workqueue.
53
54   Using the return values of :c:func:`k_work_busy_get()` or
55   :c:func:`k_work_is_pending()`, or measurements of remaining time until
56   delayable work is scheduled, should be avoided to prevent race conditions
57   of the type observed with the previous implementation.  See also `Workqueue
58   Best Practices`_.
59
60Work Item Lifecycle
61********************
62
63Any number of **work items** can be defined. Each work item is referenced
64by its memory address.
65
66A work item is assigned a **handler function**, which is the function
67executed by the workqueue's thread when the work item is processed. This
68function accepts a single argument, which is the address of the work item
69itself.  The work item also maintains information about its status.
70
71A work item must be initialized before it can be used. This records the work
72item's handler function and marks it as not pending.
73
74A work item may be **queued** (:c:enumerator:`K_WORK_QUEUED`) by submitting it to a
75workqueue by an ISR or a thread.  Submitting a work item appends the work item
76to the workqueue's queue.  Once the workqueue's thread has processed all of
77the preceding work items in its queue the thread will remove the next work
78item from the queue and invoke the work item's handler function. Depending on
79the scheduling priority of the workqueue's thread, and the work required by
80other items in the queue, a queued work item may be processed quickly or it
81may remain in the queue for an extended period of time.
82
83A delayable work item may be **scheduled** (:c:enumerator:`K_WORK_DELAYED`) to a
84workqueue; see `Delayable Work`_.
85
86A work item will be **running** (:c:enumerator:`K_WORK_RUNNING`) when it is running
87on a work queue, and may also be **canceling** (:c:enumerator:`K_WORK_CANCELING`)
88if it started running before a thread has requested that it be cancelled.
89
90A work item can be in multiple states; for example it can be:
91
92* running on a queue;
93* marked canceling (because a thread used :c:func:`k_work_cancel_sync()` to
94  wait until the work item completed);
95* queued to run again on the same queue;
96* scheduled to be submitted to a (possibly different) queue
97
98*all simultaneously*.  A work item that is in any of these states is **pending**
99(:c:func:`k_work_is_pending()`) or **busy** (:c:func:`k_work_busy_get()`).
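
When logging or debugging, the individual flags in the snapshot can be decoded
as in the following sketch.  The helper name ``log_work_state`` is
hypothetical, and the ``<zephyr/kernel.h>`` include path assumes a recent
Zephyr release.  The snapshot may already be stale when it is printed, so it
is suitable for diagnostics rather than for decisions.

.. code-block:: c

    #include <zephyr/kernel.h>
    #include <zephyr/sys/printk.h>

    /* Hypothetical helper: print a snapshot of a work item's state flags. */
    void log_work_state(const struct k_work *work)
    {
        int flags = k_work_busy_get(work);

        if (flags == 0) {
            printk("work item is idle\n");
            return;
        }

        /* Several flags may be set at the same time. */
        printk("work item:%s%s%s%s\n",
               (flags & K_WORK_RUNNING) ? " running" : "",
               (flags & K_WORK_CANCELING) ? " canceling" : "",
               (flags & K_WORK_QUEUED) ? " queued" : "",
               (flags & K_WORK_DELAYED) ? " delayed" : "");
    }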
100
101A handler function can use any kernel API available to threads. However,
102operations that are potentially blocking (e.g. taking a semaphore) must be
103used with care, since the workqueue cannot process subsequent work items in
104its queue until the handler function finishes executing.
105
106The single argument that is passed to a handler function can be ignored if it
107is not required. If the handler function requires additional information about
108the work it is to perform, the work item can be embedded in a larger data
109structure. The handler function can then use the argument value to compute the
110address of the enclosing data structure with :c:macro:`CONTAINER_OF`, and
111thereby obtain access to the additional information it needs.
112
113A work item is typically initialized once and then submitted to a specific
114workqueue whenever work needs to be performed. If an ISR or a thread attempts
115to submit a work item that is already queued the work item is not affected;
116the work item remains in its current place in the workqueue's queue, and
117the work is only performed once.
118
119A handler function is permitted to re-submit its work item argument
120to the workqueue, since the work item is no longer queued at that time.
121This allows the handler to execute work in stages, without unduly delaying
122the processing of other work items in the workqueue's queue.
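
A minimal sketch of staged processing follows.  The structure
``staged_job``, the handler name, and the ``TOTAL_STEPS`` constant are
hypothetical, and the item is assumed to be submitted to the system
workqueue.

.. code-block:: c

    #include <zephyr/kernel.h>

    #define TOTAL_STEPS 4 /* hypothetical number of stages */

    struct staged_job {
        struct k_work work;
        int next_step;
    };

    static void staged_handler(struct k_work *work)
    {
        struct staged_job *job = CONTAINER_OF(work, struct staged_job, work);

        /* Perform one stage of the job here (application specific). */
        job->next_step++;

        if (job->next_step < TOTAL_STEPS) {
            /* Requeue behind any items that are already waiting.
             * Submission fails if the item is being cancelled or
             * the queue is not accepting work; the remaining
             * stages are then intentionally abandoned.
             */
            (void)k_work_submit(work);
        }
    }

Each pass through the handler does a bounded amount of work, so other items
in the queue get a turn between stages.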
123
124.. important::
125    A pending work item *must not* be altered until the item has been processed
126    by the workqueue thread. This means a work item must not be re-initialized
127    while it is busy. Furthermore, any additional information the work item's
128    handler function needs to perform its work must not be altered until
129    the handler function has finished executing.
130
131.. _k_delayable_work:
132
133Delayable Work
134**************
135
136An ISR or a thread may need to schedule a work item that is to be processed
137only after a specified period of time, rather than immediately. This can be
138done by **scheduling** a **delayable work item** to be submitted to a
139workqueue at a future time.
140
141A delayable work item contains a standard work item but adds fields that
142record when and where the item should be submitted.
143
144A delayable work item is initialized and scheduled to a workqueue in a similar
145manner to a standard work item, although different kernel APIs are used.  When
146the schedule request is made the kernel initiates a timeout mechanism that is
147triggered after the specified delay has elapsed. Once the timeout occurs the
148kernel submits the work item to the specified workqueue, where it remains
149queued until it is processed in the standard manner.
150
Note that the work handler used for delayable work still receives a pointer
to the underlying non-delayable work structure, which is not publicly
accessible from :c:struct:`k_work_delayable`.  To get access to an object that
contains the delayable work object, use this idiom:
155
156.. code-block:: c
157
   static void work_handler(struct k_work *work)
   {
           struct k_work_delayable *dwork = k_work_delayable_from_work(work);
           struct work_context *ctx = CONTAINER_OF(dwork, struct work_context,
                                                   timed_work);
           ...
   }
164
165
166Triggered Work
167**************
168
The :c:func:`k_work_poll_submit` interface schedules a triggered work
item in response to a **poll event** (see :ref:`polling_v2`).  The work item's
user-defined handler function is called when a monitored resource becomes
available, a poll signal is raised, or a timeout occurs.
In contrast to :c:func:`k_poll`, triggered work does not require
a dedicated thread waiting or actively polling for a poll event.
175
176A triggered work item is a standard work item that has the following
177added properties:
178
* A pointer to an array of poll events that will trigger work item
  submissions to the workqueue.

* The size of the array containing poll events.
183
184A triggered work item is initialized and submitted to a workqueue in a similar
185manner to a standard work item, although dedicated kernel APIs are used.
When a submit request is made, the kernel begins observing the kernel objects
specified by the poll events. Once at least one of the observed kernel
objects changes state, the work item is submitted to the specified workqueue,
where it remains queued until it is processed in the standard manner.
190
191.. important::
192    The triggered work item as well as the referenced array of poll events
193    have to be valid and cannot be modified for a complete triggered work
194    item lifecycle, from submission to work item execution or cancellation.
195
An ISR or a thread may **cancel** a triggered work item it has submitted
as long as it is still waiting for a poll event. In that case, the kernel
stops waiting for the attached poll events and the specified work is not
executed. Otherwise the cancellation cannot be performed.
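
The following sketch shows one way to arm a triggered work item on a
semaphore.  The names ``data_ready``, ``triggered_handler`` and
``arm_trigger`` are hypothetical, the one-second timeout is arbitrary, and the
code assumes that poll support (:kconfig:option:`CONFIG_POLL`) is enabled and
that ``<zephyr/kernel.h>`` is the include path of the release in use.

.. code-block:: c

    #include <zephyr/kernel.h>

    /* Hypothetical semaphore given by a driver when data is ready. */
    K_SEM_DEFINE(data_ready, 0, 1);

    static struct k_work_poll triggered_work;

    /* Static storage: the event array must remain valid and unmodified
     * until the handler runs or the item is cancelled.
     */
    static struct k_poll_event trigger_events[1];

    static void triggered_handler(struct k_work *work)
    {
        ARG_UNUSED(work);

        /* The handler also runs when the timeout expires, so confirm
         * that the semaphore is really available.
         */
        if (k_sem_take(&data_ready, K_NO_WAIT) == 0) {
            /* consume the data (application specific) */
        }
    }

    /* Call once; returns 0 on success or a negative error code. */
    int arm_trigger(void)
    {
        k_work_poll_init(&triggered_work, triggered_handler);
        k_poll_event_init(&trigger_events[0], K_POLL_TYPE_SEM_AVAILABLE,
                          K_POLL_MODE_NOTIFY_ONLY, &data_ready);

        /* Submit to the system workqueue; stop waiting after 1 s. */
        return k_work_poll_submit(&triggered_work, trigger_events,
                                  ARRAY_SIZE(trigger_events), K_SECONDS(1));
    }

While the item is still waiting for its event it can be cancelled with
:c:func:`k_work_poll_cancel`, whose return value indicates whether the
cancellation took effect.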
200
201System Workqueue
202*****************
203
204The kernel defines a workqueue known as the *system workqueue*, which is
205available to any application or kernel code that requires workqueue support.
206The system workqueue is optional, and only exists if the application makes
207use of it.
208
209.. important::
210    Additional workqueues should only be defined when it is not possible
211    to submit new work items to the system workqueue, since each new workqueue
212    incurs a significant cost in memory footprint. A new workqueue can be
213    justified if it is not possible for its work items to co-exist with
214    existing system workqueue work items without an unacceptable impact;
215    for example, if the new work items perform blocking operations that
216    would delay other system workqueue processing to an unacceptable degree.
217
218How to Use Workqueues
219*********************
220
221Defining and Controlling a Workqueue
222====================================
223
224A workqueue is defined using a variable of type :c:struct:`k_work_q`.
225The workqueue is initialized by defining the stack area used by its
226thread, initializing the :c:struct:`k_work_q`, either zeroing its
227memory or calling :c:func:`k_work_queue_init`, and then calling
228:c:func:`k_work_queue_start`. The stack area must be defined using
229:c:macro:`K_THREAD_STACK_DEFINE` to ensure it is properly set up in
230memory.
231
232The following code defines and initializes a workqueue:
233
234.. code-block:: c
235
    #define MY_STACK_SIZE 512
    #define MY_PRIORITY 5

    K_THREAD_STACK_DEFINE(my_stack_area, MY_STACK_SIZE);

    struct k_work_q my_work_q;

    k_work_queue_init(&my_work_q);

    k_work_queue_start(&my_work_q, my_stack_area,
                       K_THREAD_STACK_SIZEOF(my_stack_area), MY_PRIORITY,
                       NULL);
248
In addition, the queue identity and certain behavior related to thread
rescheduling can be controlled by the optional final parameter; see
:c:func:`k_work_queue_start()` for details.
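
As an illustration, the sketch below passes a :c:struct:`k_work_queue_config`
as the final parameter to name the queue's thread.  The identifiers
``named_work_q``, ``named_stack_area`` and ``start_named_queue`` and the
stack size and priority values are hypothetical.

.. code-block:: c

    #include <zephyr/kernel.h>

    #define NAMED_STACK_SIZE 512
    #define NAMED_PRIORITY 5

    K_THREAD_STACK_DEFINE(named_stack_area, NAMED_STACK_SIZE);

    static struct k_work_q named_work_q;

    void start_named_queue(void)
    {
        /* Name the queue thread and keep the default behavior of
         * yielding between work items.
         */
        static const struct k_work_queue_config cfg = {
            .name = "named_work_q",
            .no_yield = false,
        };

        k_work_queue_init(&named_work_q);
        k_work_queue_start(&named_work_q, named_stack_area,
                           K_THREAD_STACK_SIZEOF(named_stack_area),
                           NAMED_PRIORITY, &cfg);
    }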
252
253The following API can be used to interact with a workqueue:
254
255* :c:func:`k_work_queue_drain()` can be used to block the caller until the
256  work queue has no items left.  Work items resubmitted from the workqueue
257  thread are accepted while a queue is draining, but work items from any other
258  thread or ISR are rejected.  The restriction on submitting more work can be
259  extended past the completion of the drain operation in order to allow the
260  blocking thread to perform additional work while the queue is "plugged".
261  Note that draining a queue has no effect on scheduling or processing
262  delayable items, but if the queue is plugged and the deadline expires the
263  item will silently fail to be submitted.
264* :c:func:`k_work_queue_unplug()` removes any previous block on submission to
265  the queue due to a previous drain operation.
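
A possible use of draining with a plug is sketched below.  The function name
``quiesce_queue`` is hypothetical and ``my_work_q`` is assumed to have been
started elsewhere.  The function must run in a thread other than the queue's
own thread, because draining blocks the caller.

.. code-block:: c

    #include <zephyr/kernel.h>

    extern struct k_work_q my_work_q; /* assumed to be started elsewhere */

    int quiesce_queue(void)
    {
        /* Block until the queue is empty, and leave it plugged so
         * that new submissions are rejected.
         */
        int rc = k_work_queue_drain(&my_work_q, true);

        if (rc < 0) {
            return rc;
        }

        /* ... perform maintenance while no work can be queued ... */

        /* Accept submissions again. */
        return k_work_queue_unplug(&my_work_q);
    }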
266
267Submitting a Work Item
268======================
269
270A work item is defined using a variable of type :c:struct:`k_work`.  It must
271be initialized by calling :c:func:`k_work_init`, unless it is defined using
272:c:macro:`K_WORK_DEFINE` in which case initialization is performed at
273compile-time.
274
275An initialized work item can be submitted to the system workqueue by
276calling :c:func:`k_work_submit`, or to a specified workqueue by
277calling :c:func:`k_work_submit_to_queue`.
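
For example, a statically defined work item could be submitted to an
application workqueue as sketched here.  The names ``blink_handler``,
``blink_work`` and ``request_blink`` are hypothetical, and ``my_work_q`` is
assumed to have been started elsewhere.

.. code-block:: c

    #include <zephyr/kernel.h>
    #include <zephyr/sys/printk.h>

    extern struct k_work_q my_work_q; /* assumed to be started elsewhere */

    static void blink_handler(struct k_work *work)
    {
        ARG_UNUSED(work);
        /* application-specific processing */
    }

    /* Initialized at compile time; k_work_init() is not needed. */
    K_WORK_DEFINE(blink_work, blink_handler);

    void request_blink(void)
    {
        int rc = k_work_submit_to_queue(&my_work_q, &blink_work);

        if (rc < 0) {
            printk("blink_work was not queued: %d\n", rc);
        }
    }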
278
279The following code demonstrates how an ISR can offload the printing
280of error messages to the system workqueue. Note that if the ISR attempts
281to resubmit the work item while it is still queued, the work item is left
282unchanged and the associated error message will not be printed.
283
284.. code-block:: c
285
    struct device_info {
        struct k_work work;
        char name[16];
    } my_device;

    void my_isr(void *arg)
    {
        ...
        if (error detected) {
            k_work_submit(&my_device.work);
        }
        ...
    }

    void print_error(struct k_work *item)
    {
        struct device_info *the_device =
            CONTAINER_OF(item, struct device_info, work);
        printk("Got error on device %s\n", the_device->name);
    }

    /* initialize name info for a device */
    strcpy(my_device.name, "FOO_dev");

    /* initialize work item for printing device's error messages */
    k_work_init(&my_device.work, print_error);

    /* install my_isr() as interrupt handler for the device (not shown) */
    ...
315
316
317The following API can be used to check the status of or synchronize with the
318work item:
319
320* :c:func:`k_work_busy_get()` returns a snapshot of flags indicating work item
321  state.  A zero value indicates the work is not scheduled, submitted, being
322  executed, or otherwise still being referenced by the workqueue
323  infrastructure.
324* :c:func:`k_work_is_pending()` is a helper that indicates ``true`` if and only
325  if the work is scheduled, queued, or running.
326* :c:func:`k_work_flush()` may be invoked from threads to block until the work
327  item has completed.  It returns immediately if the work is not pending.
328* :c:func:`k_work_cancel()` attempts to prevent the work item from being
329  executed.  This may or may not be successful. This is safe to invoke
330  from ISRs.
* :c:func:`k_work_cancel_sync()` may be invoked from threads to block until
  the work completes; it will return immediately if the cancellation was
  successful or not necessary (the work wasn't submitted or running).  This
  can be used after :c:func:`k_work_cancel()` is invoked (from an ISR) to
  confirm completion of an ISR-initiated cancellation.
336
337Scheduling a Delayable Work Item
338================================
339
340A delayable work item is defined using a variable of type
341:c:struct:`k_work_delayable`. It must be initialized by calling
342:c:func:`k_work_init_delayable`.
343
344For delayed work there are two common use cases, depending on whether a
345deadline should be extended if a new event occurs. An example is collecting
346data that comes in asynchronously, e.g. characters from a UART associated with
347a keyboard.  There are two APIs that submit work after a delay:
348
349* :c:func:`k_work_schedule()` (or :c:func:`k_work_schedule_for_queue()`)
350  schedules work to be executed at a specific time or after a delay.  Further
351  attempts to schedule the same item with this API before the delay completes
352  will not change the time at which the item will be submitted to its queue.
353  Use this if the policy is to keep collecting data until a specified delay
354  since the **first** unprocessed data was received;
355* :c:func:`k_work_reschedule()` (or :c:func:`k_work_reschedule_for_queue()`)
356  unconditionally sets the deadline for the work, replacing any previous
357  incomplete delay and changing the destination queue if necessary.  Use this
358  if the policy is to keep collecting data until a specified delay since the
359  **last** unprocessed data was received.
360
361If the work item is not scheduled both APIs behave the same.  If
362:c:macro:`K_NO_WAIT` is specified as the delay the behavior is as if the item
363was immediately submitted directly to the target queue, without waiting for a
364minimal timeout (unless :c:func:`k_work_schedule()` is used and a previous
365delay has not completed).
366
Both APIs also have ``_for_queue`` variants that allow control of the queue
used for submission.
369
370The helper function :c:func:`k_work_delayable_from_work()` can be used to get
371a pointer to the containing :c:struct:`k_work_delayable` from a pointer to
372:c:struct:`k_work` that is passed to a work handler function.
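
The two scheduling policies can be compared in the following sketch of the
character-collection use case.  The structure ``rx_batch``, the function
names, and the 50 ms window are hypothetical; the item is scheduled on the
system workqueue.

.. code-block:: c

    #include <zephyr/kernel.h>

    #define FLUSH_DELAY K_MSEC(50) /* hypothetical batching window */

    struct rx_batch {
        struct k_work_delayable flush_work;
        /* buffered characters would be stored here */
    };

    static struct rx_batch batch;

    static void flush_handler(struct k_work *work)
    {
        struct k_work_delayable *dwork = k_work_delayable_from_work(work);
        struct rx_batch *b = CONTAINER_OF(dwork, struct rx_batch, flush_work);

        ARG_UNUSED(b);
        /* process the buffered characters (application specific) */
    }

    void rx_batch_init(void)
    {
        k_work_init_delayable(&batch.flush_work, flush_handler);
    }

    /* Policy 1: flush 50 ms after the FIRST unprocessed character.
     * A later call leaves an existing deadline unchanged.
     */
    int on_char_first_policy(void)
    {
        return k_work_schedule(&batch.flush_work, FLUSH_DELAY);
    }

    /* Policy 2: flush 50 ms after the LAST character received.
     * Every call replaces the previous deadline.
     */
    int on_char_last_policy(void)
    {
        return k_work_reschedule(&batch.flush_work, FLUSH_DELAY);
    }

Both functions return the scheduling result so the caller can detect a
negative value, which indicates that the item could not be scheduled.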
373
374The following additional API can be used to check the status of or synchronize
375with the work item:
376
377* :c:func:`k_work_delayable_busy_get()` is the analog to :c:func:`k_work_busy_get()`
378  for delayable work.
379* :c:func:`k_work_delayable_is_pending()` is the analog to
380  :c:func:`k_work_is_pending()` for delayable work.
381* :c:func:`k_work_flush_delayable()` is the analog to :c:func:`k_work_flush()`
382  for delayable work.
383* :c:func:`k_work_cancel_delayable()` is the analog to
384  :c:func:`k_work_cancel()` for delayable work; similarly with
385  :c:func:`k_work_cancel_delayable_sync()`.
386
387Synchronizing with Work Items
388=============================
389
390While the state of both regular and delayable work items can be determined
391from any context using :c:func:`k_work_busy_get()` and
392:c:func:`k_work_delayable_busy_get()` some use cases require synchronizing
393with work items after they've been submitted.  :c:func:`k_work_flush()`,
394:c:func:`k_work_cancel_sync()`, and :c:func:`k_work_cancel_delayable_sync()`
395can be invoked from thread context to wait until the requested state has been
396reached.
397
398These APIs must be provided with a :c:struct:`k_work_sync` object that has no
399application-inspectable components but is needed to provide the
400synchronization objects.  These objects should not be allocated on a stack if
401the code is expected to work on architectures with
402:kconfig:option:`CONFIG_KERNEL_COHERENCE`.
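
A sketch of synchronous cancellation with a statically allocated
:c:struct:`k_work_sync` follows.  The names ``my_work``, ``my_work_sync`` and
``stop_my_work`` are hypothetical, and the sketch assumes that only one thread
calls ``stop_my_work`` at a time.

.. code-block:: c

    #include <zephyr/kernel.h>

    extern struct k_work my_work; /* assumed to be initialized elsewhere */

    /* Static storage rather than the stack, so that the object also
     * works when CONFIG_KERNEL_COHERENCE is enabled.
     */
    static struct k_work_sync my_work_sync;

    /* Must be called from a thread, not from an ISR. */
    bool stop_my_work(void)
    {
        /* Blocks until the handler is no longer running. The result
         * is true if the item was pending when the call was made.
         */
        bool was_pending = k_work_cancel_sync(&my_work, &my_work_sync);

        /* The handler is not running at this point. State shared with
         * it can be released, provided nothing submits the item again.
         */
        return was_pending;
    }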
403
404Workqueue Best Practices
405************************
406
407Avoid Race Conditions
408=====================
409
410Sometimes the data a work item must process is naturally thread-safe, for
411example when it's put into a :c:struct:`k_queue` by some thread and processed
412in the work thread. More often external synchronization is required to avoid
413data races: cases where the work thread might inspect or manipulate shared
414state that's being accessed by another thread or interrupt.  Such state might
415be a flag indicating that work needs to be done, or a shared object that is
416filled by an ISR or thread and read by the work handler.
417
For simple flags, :ref:`atomic_v2` may be sufficient.  In other cases spin
locks (:c:struct:`k_spinlock`) or thread-aware locks (:c:struct:`k_sem`,
:c:struct:`k_mutex`, ...) may be used to ensure data races don't occur.
421
422If the selected lock mechanism can :ref:`api_term_sleep` then allowing the
423work thread to sleep will starve other work queue items, which may need to
424make progress in order to get the lock released. Work handlers should try to
425take the lock with its no-wait path. For example:
426
427.. code-block:: c
428
   static void work_handler(struct k_work *work)
   {
           struct work_context *parent = CONTAINER_OF(work, struct work_context,
                                                      work_item);

           if (k_mutex_lock(&parent->lock, K_NO_WAIT) != 0) {
                   /* NB: Submit will fail if the work item is being cancelled. */
                   (void)k_work_submit(work);
                   return;
           }

           /* do stuff under lock */
           k_mutex_unlock(&parent->lock);
           /* do stuff without lock */
   }
444
Be aware that if the lock is held by a thread with a lower priority than the
work queue, the resubmission may starve the thread that would release the
lock, causing the application to fail.  Where the idiom above is required, a
delayable work item is preferred, and the work should be (re-)scheduled with a
non-zero delay to allow the thread holding the lock to make progress.
450
451Note that submitting from the work handler can fail if the work item had been
452cancelled.  Generally this is acceptable, since the cancellation will complete
453once the handler finishes.  If it is not, the code above must take other steps
454to notify the application that the work could not be performed.
455
456Work items in isolation are self-locking, so you don't need to hold an
457external lock just to submit or schedule them. Even if you use external state
458protected by such a lock to prevent further resubmission, it's safe to do the
459resubmit as long as you're sure that eventually the item will take its lock
and check that state to determine whether it should do anything.  Where a
delayable work item is being rescheduled in its handler because it could not
take the lock, some other self-locking state, such as an atomic flag set by
the application/driver when the cancel is initiated, is required to detect
the cancellation and to avoid the cancelled work item being submitted again
after the deadline.
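
One way to combine a delayed retry with such a flag is sketched below.  The
structure ``sampler``, its field names, the function names, and the 1 ms retry
delay are all hypothetical; the item is assumed to run on the system
workqueue.

.. code-block:: c

    #include <zephyr/kernel.h>
    #include <zephyr/sys/atomic.h>

    struct sampler {
        struct k_work_delayable sample_work;
        struct k_mutex lock;
        atomic_t stopping; /* set to 1 when a cancel is initiated */
    };

    static void sampler_handler(struct k_work *work)
    {
        struct k_work_delayable *dwork = k_work_delayable_from_work(work);
        struct sampler *s = CONTAINER_OF(dwork, struct sampler, sample_work);

        if (atomic_get(&s->stopping) != 0) {
            return; /* cancelled: do not schedule a retry */
        }

        if (k_mutex_lock(&s->lock, K_NO_WAIT) != 0) {
            /* Retry after a non-zero delay so that a lower-priority
             * lock holder can run. If scheduling fails the item is
             * being cancelled and no retry is wanted.
             */
            (void)k_work_schedule(dwork, K_MSEC(1));
            return;
        }

        /* do stuff under lock */
        k_mutex_unlock(&s->lock);
    }

    void sampler_stop(struct sampler *s)
    {
        /* Set the flag first so that a handler that is already
         * running sees it on its next pass and stops retrying.
         */
        (void)atomic_set(&s->stopping, 1);

        /* A non-zero result means the item is still busy; the handler
         * will observe the flag and exit without rescheduling.
         */
        (void)k_work_cancel_delayable(&s->sample_work);
    }

If the flag is set just after the handler has checked it, the handler may
schedule one more retry; that final pass sees the flag and returns without
doing any work.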
466
467Check Return Values
468===================
469
All work API functions return the status of the underlying operation, and in
many cases it is important to verify that the intended result was obtained.
472
473* Submitting a work item (:c:func:`k_work_submit_to_queue`) can fail if the
474  work is being cancelled or the queue is not accepting new items.  If this
475  happens the work will not be executed, which could cause a subsystem that is
476  animated by work handler activity to become non-responsive.
477* Asynchronous cancellation (:c:func:`k_work_cancel` or
478  :c:func:`k_work_cancel_delayable`) can complete while the work item is still
479  being run by a handler.  Proceeding to manipulate state shared with the work
480  handler will result in data races that can cause failures.
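
Both checks can be sketched as follows.  The wrapper names
``submit_checked`` and ``cancel_is_complete`` are hypothetical.

.. code-block:: c

    #include <stdbool.h>
    #include <zephyr/kernel.h>
    #include <zephyr/sys/printk.h>

    /* Returns the submission result; negative values mean the
     * handler will not run for this request.
     */
    int submit_checked(struct k_work_q *queue, struct k_work *work)
    {
        int rc = k_work_submit_to_queue(queue, work);

        if (rc < 0) {
            /* For example, the item is being cancelled or the
             * queue is draining or plugged.
             */
            printk("work submission failed: %d\n", rc);
        }

        return rc;
    }

    /* Returns true only if the item is idle after the cancel request. */
    bool cancel_is_complete(struct k_work *work)
    {
        /* The result is the set of busy flags that remain after the
         * asynchronous cancellation steps have been performed.
         */
        int busy = k_work_cancel(work);

        return busy == 0;
    }

When ``cancel_is_complete`` returns ``false`` the handler may still be
running, so state shared with it must not be modified without further
synchronization, for example :c:func:`k_work_cancel_sync` from a thread.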
481
482Many race conditions have been present in Zephyr code because the results of
483an operation were not checked.
484
485There may be good reason to believe that a return value indicating that the
486operation did not complete as expected is not a problem.  In those cases the
487code should clearly document this, by (1) casting the return value to ``void``
488to indicate that the result is intentionally ignored, and (2) documenting what
489happens in the unexpected case.  For example:
490
491.. code-block:: c
492
493   /* If this fails, the work handler will check pub->active and
494    * exit without transmitting.
495    */
496   (void)k_work_cancel_delayable(&pub->timer);
497
498However in such a case the following code must still avoid data races, as it
499cannot guarantee that the work thread is not accessing work-related state.
500
501Don't Optimize Prematurely
502==========================
503
504The workqueue API is designed to be safe when invoked from multiple threads
505and interrupts. Attempts to externally inspect a work item's state and make
506decisions based on the result are likely to create new problems.
507
508So when new work comes in, just submit it. Don't attempt to "optimize" by
509checking whether the work item is already submitted by inspecting snapshot
510state with :c:func:`k_work_is_pending` or :c:func:`k_work_busy_get`, or
511checking for a non-zero delay from
:c:func:`k_work_delayable_remaining_get()`. Those checks are fragile: a "busy"
indication can be obsolete by the time the check returns, and a "not-busy"
indication can also be wrong if work is submitted from multiple contexts, or
(for delayable work) if the deadline has expired but the work is still in the
queued or running state.
517
518A general best practice is to always maintain in shared state some condition
519that can be checked by the handler to confirm whether there is work to be
520done.  This way you can use the work handler as the standard cleanup path:
521rather than having to deal with cancellation and cleanup at points where items
522are submitted, you may be able to have everything done in the work handler
523itself.
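
A minimal sketch of this pattern, assuming a single atomic flag is enough to
describe the outstanding work, is shown below.  The names ``samples_ready``,
``sample_handler``, ``sample_work`` and ``samples_available`` are
hypothetical.

.. code-block:: c

    #include <zephyr/kernel.h>
    #include <zephyr/sys/atomic.h>

    static atomic_t samples_ready; /* hypothetical "work to do" flag */

    static void sample_handler(struct k_work *work)
    {
        ARG_UNUSED(work);

        /* Atomically consume the request. If nothing is outstanding
         * the handler returns, so redundant submissions are harmless.
         */
        if (!atomic_cas(&samples_ready, 1, 0)) {
            return;
        }

        /* process the samples (application specific) */
    }

    K_WORK_DEFINE(sample_work, sample_handler);

    /* May be called from an ISR or a thread. */
    void samples_available(void)
    {
        (void)atomic_set(&samples_ready, 1);

        /* Just submit, without inspecting the item's state first.
         * If the item is already queued this call changes nothing,
         * and the handler will still see the flag when it runs.
         */
        (void)k_work_submit(&sample_work);
    }

Because the flag is set before the item is submitted, a request that arrives
while the handler is running causes the item to be queued again, and the next
pass picks it up.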
524
525A rare case where you could safely use :c:func:`k_work_is_pending` is as a
526check to avoid invoking :c:func:`k_work_flush` or
527:c:func:`k_work_cancel_sync`, if you are *certain* that nothing else might
528submit the work while you're checking (generally because you're holding a lock
529that prevents access to state used for submission).
530
531Suggested Uses
532**************
533
534Use the system workqueue to defer complex interrupt-related processing from an
535ISR to a shared thread. This allows the interrupt-related processing to be
536done promptly without compromising the system's ability to respond to
537subsequent interrupts, and does not require the application to define and
538manage an additional thread to do the processing.
539
540Configuration Options
541**********************
542
543Related configuration options:
544
545* :kconfig:option:`CONFIG_SYSTEM_WORKQUEUE_STACK_SIZE`
546* :kconfig:option:`CONFIG_SYSTEM_WORKQUEUE_PRIORITY`
547* :kconfig:option:`CONFIG_SYSTEM_WORKQUEUE_NO_YIELD`
548
549API Reference
550**************
551
552.. doxygengroup:: workqueue_apis
553