.. _kernel_timing:

Kernel Timing
#############

Zephyr provides a robust and scalable timing framework to enable
reporting and tracking of timed events from hardware timing sources of
arbitrary precision.

Time Units
==========

Kernel time is tracked in several units which are used for different
purposes.

Real time values, typically specified in milliseconds or microseconds,
are the default presentation of time to application code.  They have
the advantages of being universally portable and pervasively
understood, though they may not match the precision of the underlying
hardware perfectly.
The kernel presents a "cycle" count via the :c:func:`k_cycle_get_32`
and :c:func:`k_cycle_get_64` APIs.  The intent is that this counter
represents the fastest cycle counter that the operating system is able
to present to the user (for example, a CPU cycle counter) and that the
read operation is very fast.  The expectation is that very sensitive
application code might use this in a polling manner to achieve maximal
precision.  The frequency of this counter is required to be steady
over time, and is available from
:c:func:`sys_clock_hw_cycles_per_sec` (which on almost all platforms
is a runtime constant that evaluates to
:kconfig:option:`CONFIG_SYS_CLOCK_HW_CYCLES_PER_SEC`).
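
A minimal sketch of this polling usage (``busy_work()`` is a
hypothetical stand-in for the code being measured):

.. code-block:: c

    #include <zephyr/kernel.h>

    uint64_t measure_busy_work_ns(void)
    {
        uint32_t start = k_cycle_get_32();

        busy_work(); /* hypothetical operation under measurement */

        uint32_t cycles = k_cycle_get_32() - start;

        /* Convert using the steady cycle frequency */
        return k_cyc_to_ns_floor64(cycles);
    }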

For asynchronous timekeeping, the kernel defines a "ticks" concept.  A
"tick" is the internal count in which the kernel does all its internal
uptime and timeout bookkeeping.  Interrupts are expected to be
delivered on tick boundaries to the extent practical, and no
fractional ticks are tracked.  The choice of tick rate is configurable
via :kconfig:option:`CONFIG_SYS_CLOCK_TICKS_PER_SEC`.  Defaults on most
hardware platforms (ones that support setting arbitrary interrupt
timeouts) are expected to be in the range of 10 kHz, with software
emulation platforms and legacy drivers using a more traditional 100 Hz
value.
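
For example, an application on such hardware might raise the tick rate
in its ``prj.conf`` (the value shown is illustrative):

.. code-block:: cfg

    CONFIG_SYS_CLOCK_TICKS_PER_SEC=10000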

Conversion
----------

Zephyr provides an extensively enumerated conversion library with
rounding control for all time units.  Any unit of "ms" (milliseconds),
"us" (microseconds), "tick", or "cyc" can be converted to any other.
Control of rounding is provided, and each conversion is available in
"floor" (round down to nearest output unit), "ceil" (round up) and
"near" (round to nearest).  Finally, the output precision can be
specified as either 32 or 64 bits.

For example: :c:func:`k_ms_to_ticks_ceil32` will convert a
millisecond input value to the next higher number of ticks, returning
a result truncated to 32 bits of precision; and
:c:func:`k_cyc_to_us_floor64` will convert a measured cycle count
to an elapsed number of microseconds in a full 64 bits of precision.
See the reference documentation for the full enumeration of conversion
routines.
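
The rounding semantics can be illustrated with plain integer
arithmetic.  The helpers below mirror what the "floor", "ceil" and
"near" variants compute for a hypothetical 100 Hz tick rate; they are
an illustration only, not Zephyr's actual (generated) implementation:

.. code-block:: c

    #include <stdint.h>

    #define TICKS_PER_SEC 100ULL   /* hypothetical tick rate */
    #define MS_PER_SEC    1000ULL

    /* Equivalent of k_ms_to_ticks_floor32(): round down */
    static uint32_t ms_to_ticks_floor(uint32_t ms)
    {
        return (uint32_t)(ms * TICKS_PER_SEC / MS_PER_SEC);
    }

    /* Equivalent of k_ms_to_ticks_ceil32(): round up */
    static uint32_t ms_to_ticks_ceil(uint32_t ms)
    {
        return (uint32_t)((ms * TICKS_PER_SEC + MS_PER_SEC - 1) / MS_PER_SEC);
    }

    /* Equivalent of k_ms_to_ticks_near32(): round to nearest */
    static uint32_t ms_to_ticks_near(uint32_t ms)
    {
        return (uint32_t)((ms * TICKS_PER_SEC + MS_PER_SEC / 2) / MS_PER_SEC);
    }

At 100 Hz a tick is 10 ms, so a 15 ms request is 1.5 ticks: the floor
variant yields 1, while the ceil and near variants both yield 2.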

On most platforms, where the various counter rates are integral
multiples of each other and where the output fits within a single
word, these conversions expand to a 2-4 operation sequence, carrying
full precision only where it is actually required and requested.

.. _kernel_timing_uptime:

Uptime
======

The kernel tracks a system uptime count on behalf of the application.
This is available at all times via :c:func:`k_uptime_get`, which
provides an uptime value in milliseconds since system boot.  This is
expected to be the utility used by most portable application code.

The internal tracking, however, is as a 64 bit integer count of ticks.
Apps with precise timing requirements (that are willing to do their
own conversions to portable real time units) may access this with
:c:func:`k_uptime_ticks`.
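
A short sketch of both accessors:

.. code-block:: c

    #include <zephyr/kernel.h>

    void report_uptime(void)
    {
        int64_t ms = k_uptime_get();        /* portable milliseconds */
        int64_t ticks = k_uptime_ticks();   /* raw 64 bit tick count */

        printk("up %lld ms (%lld ticks)\n", ms, ticks);
    }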

Timeouts
========

The Zephyr kernel provides many APIs with a "timeout" parameter.
Conceptually, this indicates the time at which an event will occur.
For example:

* Kernel blocking operations like :c:func:`k_sem_take` or
  :c:func:`k_queue_get` may provide a timeout after which the
  routine will return with an error code if no data is available.

* Kernel :c:struct:`k_timer` objects must specify delays for
  their duration and period.

* The kernel :c:struct:`k_work_delayable` API provides a timeout parameter
  indicating when a work queue item will be added to the system queue.

All these values are specified using a :c:type:`k_timeout_t` value.  This is
an opaque struct type that must be initialized using one of a family
of kernel timeout macros.  The most common, :c:macro:`K_MSEC`, defines
a time in milliseconds after the current time.
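
For example, a sketch of a blocking wait with a relative timeout
(``my_sem`` is a hypothetical semaphore):

.. code-block:: c

    #include <zephyr/kernel.h>

    K_SEM_DEFINE(my_sem, 0, 1);

    int wait_for_data(void)
    {
        /* Block for at most 100 ms from the current time;
         * k_sem_take() returns -EAGAIN if the timeout expires first. */
        return k_sem_take(&my_sem, K_MSEC(100));
    }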

What is meant by "current time" for relative timeouts depends on the context:

* When scheduling a relative timeout from within a timeout callback (e.g. from
  within the expiry function passed to :c:func:`k_timer_init` or the work handler
  passed to :c:func:`k_work_init_delayable`), "current time" is the exact time at
  which the currently firing timeout was originally scheduled, even if the "real
  time" will already have advanced.  This ensures that timers scheduled from
  within another timer's callback will always be calculated with a precise offset
  to the firing timer.  It is thereby possible to fire at regular intervals without
  introducing systematic clock drift over time.

* When scheduling a timeout from application context, "current time" means the
  value returned by :c:func:`k_uptime_ticks` at the time at which the kernel
  receives the timeout value.
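
The first case is what makes drift-free periodic work possible.  A
sketch using a delayable work item that reschedules itself (names are
illustrative):

.. code-block:: c

    #include <zephyr/kernel.h>

    static void sample_handler(struct k_work *work)
    {
        struct k_work_delayable *dwork = k_work_delayable_from_work(work);

        /* "Current time" here is the scheduled expiry of this invocation,
         * so the 100 ms period accumulates no systematic drift even if
         * the handler itself runs a little late. */
        k_work_schedule(dwork, K_MSEC(100));

        /* ... periodic processing ... */
    }

    static K_WORK_DELAYABLE_DEFINE(sample_work, sample_handler);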

Other options for timeout initialization follow the unit conventions
described above: :c:macro:`K_NSEC`, :c:macro:`K_USEC`, :c:macro:`K_TICKS` and
:c:macro:`K_CYC` specify timeout values that will expire after specified
numbers of nanoseconds, microseconds, ticks and cycles, respectively.

Precision of :c:type:`k_timeout_t` values is configurable, with the default
being 32 bits.  Large uptime counts in non-tick units will experience
complicated rollover semantics, so it is expected that
timing-sensitive applications with long uptimes will be configured to
use a 64 bit timeout type.

Finally, it is possible to specify timeouts as absolute times since
system boot.  A timeout initialized with :c:macro:`K_TIMEOUT_ABS_MS`
indicates a timeout that will expire after the system uptime reaches
the specified value.  There are likewise nanosecond, microsecond,
cycle and tick variants of this API.
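
For example, a sketch that arms a (previously initialized) timer to
fire when uptime reaches exactly one second, with no repeat period:

.. code-block:: c

    #include <zephyr/kernel.h>

    void fire_at_one_second(struct k_timer *timer)
    {
        k_timer_start(timer, K_TIMEOUT_ABS_MS(1000), K_NO_WAIT);
    }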

Timing Internals
================

Timeout Queue
-------------

All Zephyr :c:type:`k_timeout_t` events specified using the API above are
managed in a single, global queue of events.  Each event is stored in
a doubly-linked list, with an attendant delta count in ticks from the
previous event.  The action to take on an event is specified as a
callback function pointer provided by the subsystem requesting the
event, along with a :c:struct:`_timeout` tracking struct that is
expected to be embedded within subsystem-defined data structures (for
example: a :c:struct:`wait_q` struct, or a :c:type:`k_tid_t` thread struct).

Note that all variant units passed via a :c:type:`k_timeout_t` are converted
to ticks once on insertion into the list.  There are no
multiple-conversion steps internal to the kernel, so precision is
guaranteed at the tick level no matter how many events exist or how
long a timeout might be.

Note that the list structure means that the CPU work involved in
managing large numbers of timeouts is quadratic in the number of
active timeouts.  The API design of the timeout queue was intended to
permit a more scalable backend data structure, but no such
implementation exists currently.
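
The delta bookkeeping can be sketched in self-contained form.  The toy
list below illustrates the scheme described above; it is not the
kernel's actual :c:struct:`_timeout` implementation:

.. code-block:: c

    #include <stddef.h>
    #include <stdint.h>

    struct toy_timeout {
        struct toy_timeout *next;
        int32_t delta_ticks;               /* ticks after previous event */
        void (*fn)(struct toy_timeout *t); /* callback, may be NULL */
    };

    static struct toy_timeout *toy_list;

    /* Insert an event 'ticks' from now, storing only per-node deltas. */
    static void toy_add(struct toy_timeout *t, int32_t ticks)
    {
        struct toy_timeout **pp = &toy_list;

        while (*pp != NULL && (*pp)->delta_ticks <= ticks) {
            ticks -= (*pp)->delta_ticks;
            pp = &(*pp)->next;
        }
        t->delta_ticks = ticks;
        t->next = *pp;
        if (*pp != NULL) {
            (*pp)->delta_ticks -= ticks;
        }
        *pp = t;
    }

    /* Announce elapsed ticks, firing every expired event in order. */
    static void toy_announce(int32_t ticks)
    {
        while (toy_list != NULL && toy_list->delta_ticks <= ticks) {
            struct toy_timeout *t = toy_list;

            ticks -= t->delta_ticks;
            toy_list = t->next;
            if (t->fn != NULL) {
                t->fn(t);
            }
        }
        if (toy_list != NULL) {
            toy_list->delta_ticks -= ticks;
        }
    }

Because only the head node's delta is decremented on each
announcement, expiring one event never requires touching the rest of
the list.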

Timer Drivers
-------------

Kernel timing at the tick level is driven by a timer driver with a
comparatively simple API.

* The driver is expected to be able to "announce" new ticks to the
  kernel via the :c:func:`sys_clock_announce` call, which passes an integer
  number of ticks that have elapsed since the last announce call (or
  system boot).  These calls can occur at any time, but the driver is
  expected to attempt to ensure (to the extent practical given
  interrupt latency interactions) that they occur near tick boundaries
  (i.e. not "halfway through" a tick), and most importantly that they
  be correct over time and subject to minimal skew vs. other counters
  and real world time.

* The driver is expected to provide a :c:func:`sys_clock_set_timeout` call
  to the kernel which indicates how many ticks may elapse before the
  kernel must receive an announce call to trigger registered timeouts.
  It is legal to announce new ticks before that moment (though they
  must be correct), but delaying beyond it will cause events to be
  missed.  Note that the timeout value passed here is a delta from the
  current time, but that does not absolve the driver of the
  requirement to provide ticks at a steady rate over time.  Naive
  implementations of this function are subject to bugs where the
  fractional tick gets "reset" incorrectly and causes clock skew.

* The driver is expected to provide a :c:func:`sys_clock_elapsed` call which
  provides a current indication of how many ticks have elapsed (as
  compared to a real world clock) since the last call to
  :c:func:`sys_clock_announce`, which the kernel needs in order to test
  newly arriving timeouts for expiration.

Note that a natural implementation of this API results in a "tickless"
kernel, which receives and processes timer interrupts only for
registered events, relying on programmable hardware counters to
provide irregular interrupts.  But a traditional, "ticked" or "dumb"
counter driver can also be implemented trivially:

* The driver can receive interrupts at a regular rate corresponding to
  the OS tick rate, calling :c:func:`sys_clock_announce` with an argument of one
  each time.

* The driver can ignore calls to :c:func:`sys_clock_set_timeout`, as every
  tick will be announced regardless of timeout status.

* The driver can return zero for every call to :c:func:`sys_clock_elapsed`,
  as no more than one tick can be detected as having elapsed (because
  otherwise an interrupt would have been received).
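
A sketch of such a driver's hooks (the interrupt wiring is hardware
specific and omitted):

.. code-block:: c

    #include <zephyr/kernel.h>
    #include <zephyr/drivers/timer/system_timer.h>

    /* ISR attached to a hardware timer firing at the OS tick rate. */
    static void ticked_timer_isr(const void *arg)
    {
        ARG_UNUSED(arg);
        sys_clock_announce(1);  /* exactly one tick has elapsed */
    }

    /* Every tick is announced anyway, so the request can be ignored. */
    void sys_clock_set_timeout(int32_t ticks, bool idle)
    {
        ARG_UNUSED(ticks);
        ARG_UNUSED(idle);
    }

    /* Never more than one tick can pass without an interrupt. */
    uint32_t sys_clock_elapsed(void)
    {
        return 0;
    }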


SMP Details
-----------

In general, the timer API described above does not change when run in
a multiprocessor context.  The kernel will internally synchronize all
access appropriately, and ensure that all critical sections are small
and minimal.  But some notes are important to detail:

* Zephyr is agnostic about which CPU services timer interrupts.  It is
  not illegal (though probably undesirable in some circumstances) to
  have every timer interrupt handled on a single processor.  Existing
  SMP architectures implement symmetric timer drivers.

* The :c:func:`sys_clock_announce` call is expected to be globally
  synchronized at the driver level.  The kernel does not do any
  per-CPU tracking, and expects that if two timer interrupts fire near
  simultaneously, only one will provide the current tick count to
  the timing subsystem.  The other may legally provide a tick count of
  zero if no ticks have elapsed.  It should not "skip" the announce
  call because of timeslicing requirements (see below).

* Some SMP hardware uses a single, global timer device, others use a
  per-CPU counter.  The complexity here (for example: ensuring counter
  synchronization between CPUs) is expected to be managed by the
  driver, not the kernel.

* The next timeout value passed back to the driver via
  :c:func:`sys_clock_set_timeout` is done identically for every CPU.
  So by default, every CPU will see simultaneous timer interrupts for
  every event, even though by definition only one of them should see a
  non-zero ticks argument to :c:func:`sys_clock_announce`.  This is probably
  a correct default for timing sensitive applications (because it
  minimizes the chance that an errant ISR or interrupt lock will delay
  a timeout), but may be a performance problem in some cases.  The
  current design expects that any such optimization is the
  responsibility of the timer driver.

Time Slicing
------------

An auxiliary job of the timing subsystem is to provide tick counters
to the scheduler that allow implementation of time slicing of threads.
A thread time-slice cannot be a timeout value, as it does not reflect
a global expiration but instead a per-CPU value that needs to be
tracked independently on each CPU in an SMP context.

Because there may be no other hardware available to drive timeslicing,
Zephyr multiplexes the existing timer driver.  This means that the
value passed to :c:func:`sys_clock_set_timeout` may be clamped to a
smaller value than the current next timeout when a time sliced thread
is currently scheduled.

Subsystems that keep millisecond APIs
-------------------------------------

In general, code like this will port just like application code will.
Millisecond values from the user may be treated any way the subsystem
likes, and then converted into kernel timeouts using
:c:macro:`K_MSEC()` at the point where they are presented to the
kernel.

Obviously this comes at the cost of not being able to use new
features, like the higher precision timeout constructors or absolute
timeouts.  But for many subsystems with simple needs, this may be
acceptable.

One complexity is :c:macro:`K_FOREVER`.  Subsystems that might have
been able to accept this value to their millisecond API in the past no
longer can, because it is no longer an integral type.  Such code
will need to use a different, integer-valued token to represent
"forever".  :c:macro:`K_NO_WAIT` has the same type safety concern,
of course, but as it is (and has always been) simply a numerical zero,
it has a natural porting path.
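
For example, such a subsystem might define its own sentinel (the names
here are illustrative, not an existing Zephyr API):

.. code-block:: c

    #include <zephyr/kernel.h>

    #define MY_SUBSYS_WAIT_FOREVER (-1)  /* integer-valued "forever" token */

    int my_subsys_wait(struct k_sem *sem, int32_t timeout_ms)
    {
        k_timeout_t timeout = (timeout_ms == MY_SUBSYS_WAIT_FOREVER)
                                  ? K_FOREVER
                                  : K_MSEC(timeout_ms);

        return k_sem_take(sem, timeout);
    }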

Subsystems using ``k_timeout_t``
--------------------------------

Ideally, code that takes a "timeout" parameter specifying a time to
wait should use the kernel native abstraction where possible.
But :c:type:`k_timeout_t` is opaque, and needs to be converted before
it can be inspected by an application.

Some conversions are simple.  Code that needs to test for
:c:macro:`K_FOREVER` can simply use the :c:macro:`K_TIMEOUT_EQ()`
macro to test the opaque struct for equality and take special action.
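
For example, given a ``k_timeout_t timeout`` parameter:

.. code-block:: c

    if (K_TIMEOUT_EQ(timeout, K_FOREVER)) {
        /* Blocking call: skip any deadline bookkeeping */
    }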

The more complicated case is when the subsystem needs to take a
timeout and loop, waiting for it to finish while doing some processing
that may require multiple blocking operations on underlying kernel
code.  For example, consider this design:

.. code-block:: c

    void my_wait_for_event(struct my_subsys *obj, int32_t timeout_in_ms)
    {
        while (true) {
            uint32_t start = k_uptime_get_32();

            if (is_event_complete(obj)) {
                return;
            }

            /* Wait for notification of state change */
            k_sem_take(obj->sem, timeout_in_ms);

            /* Subtract elapsed time */
            timeout_in_ms -= (k_uptime_get_32() - start);
        }
    }

This code requires that the timeout value be inspected, which is no
longer possible.  For situations like this, the new API provides the
internal :c:func:`sys_timepoint_calc` and :c:func:`sys_timepoint_timeout`
routines, which convert an arbitrary timeout to and from a timepoint
value based on the uptime tick at which it will expire.  So such a
loop might look like:

.. code-block:: c

    void my_wait_for_event(struct my_subsys *obj, k_timeout_t timeout)
    {
        /* Compute the end time from the timeout */
        k_timepoint_t end = sys_timepoint_calc(timeout);

        do {
            if (is_event_complete(obj)) {
                return;
            }

            /* Update timeout with remaining time */
            timeout = sys_timepoint_timeout(end);

            /* Wait for notification of state change */
            k_sem_take(obj->sem, timeout);
        } while (!K_TIMEOUT_EQ(timeout, K_NO_WAIT));
    }

Note that :c:func:`sys_timepoint_calc` accepts special values :c:macro:`K_FOREVER`
and :c:macro:`K_NO_WAIT`, and works identically for absolute timeouts as well
as conventional ones.  Conversely, :c:func:`sys_timepoint_timeout` may return
:c:macro:`K_FOREVER` or :c:macro:`K_NO_WAIT` if those were used to create
the timepoint, the latter also being returned if the timepoint is now in the
past.  For simple cases, :c:func:`sys_timepoint_expired` can be used as well.
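
For instance, a polling variant of the loop above (reusing the
hypothetical ``my_subsys``/``is_event_complete`` names from the
earlier example) might use :c:func:`sys_timepoint_expired` directly:

.. code-block:: c

    bool my_poll_for_event(struct my_subsys *obj, k_timeout_t timeout)
    {
        k_timepoint_t end = sys_timepoint_calc(timeout);

        while (!sys_timepoint_expired(end)) {
            if (is_event_complete(obj)) {
                return true;
            }
            k_yield();
        }
        return false;
    }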

But some care is still required for subsystems that use those.  Note that
delta timeouts need to be interpreted relative to a "current time",
and obviously that time is the time of the call to
:c:func:`sys_timepoint_calc`.  But the user expects that the time is
the time they passed the timeout to you.  Care must be taken to call
this function just once, as synchronously as possible to the timeout
creation in user code.  It should not be used on a "stored" timeout
value, and should never be called iteratively in a loop.


API Reference
*************

.. doxygengroup:: clock_apis