.. _fatal:

Fatal Errors
############

Software Errors Triggered in Source Code
****************************************

Zephyr provides several methods for inducing fatal error conditions:
build-time checks, conditionally compiled assertions, and deliberately
invoked panic or oops conditions.

Runtime Assertions
==================

Zephyr provides macros to perform runtime assertions which may be
conditionally compiled. Their definitions may be found in
:zephyr_file:`include/zephyr/sys/__assert.h`.

Assertions are enabled by setting the ``__ASSERT_ON`` preprocessor symbol to a
non-zero value. There are two ways to do this:

- Use the :kconfig:option:`CONFIG_ASSERT` and :kconfig:option:`CONFIG_ASSERT_LEVEL` Kconfig
  options.
- Add ``-D__ASSERT_ON=<level>`` to the project's CFLAGS, either on the
  build command line or in a CMakeLists.txt.

The ``__ASSERT_ON`` method takes precedence over the Kconfig options if both are
used.

Specifying an assertion level of 1 causes the compiler to issue warnings that
the kernel contains debug-type ``__ASSERT()`` statements; this reminder is
issued since assertion code is not normally present in a final product.
Specifying assertion level 2 suppresses these warnings.

Assertions are enabled by default when running Zephyr test cases, as
configured by the :kconfig:option:`CONFIG_TEST` option.

The policy for what to do when encountering a failed assertion is controlled
by the implementation of :c:func:`assert_post_action`. Zephyr provides
a default implementation with weak linkage which invokes a kernel oops if
the thread that failed the assertion was running in user mode, and a kernel
panic otherwise.

__ASSERT()
----------

The ``__ASSERT()`` macro can be used inside kernel and application code to
perform optional runtime checks which will induce a fatal error if the
check does not pass. The macro takes a string message which will be printed
to provide context to the assertion. In addition, the kernel will print
a text representation of the expression code that was evaluated, and the
file and line number where the assertion can be found.

For example:

.. code-block:: c

  __ASSERT(foo == 0xF0CACC1A, "Invalid value of foo, got 0x%x", foo);

If at runtime ``foo`` had some unexpected value, the error produced may
look like the following:

.. code-block:: none

  ASSERTION FAIL [foo == 0xF0CACC1A] @ ZEPHYR_BASE/tests/kernel/fatal/src/main.c:367
          Invalid value of foo, got 0xdeadbeef
  [00:00:00.000,000] <err> os: r0/a1:  0x00000004  r1/a2:  0x0000016f  r2/a3:  0x00000000
  [00:00:00.000,000] <err> os: r3/a4:  0x00000000 r12/ip:  0x00000000 r14/lr:  0x00000a6d
  [00:00:00.000,000] <err> os:  xpsr:  0x61000000
  [00:00:00.000,000] <err> os: Faulting instruction address (r15/pc): 0x00009fe4
  [00:00:00.000,000] <err> os: >>> ZEPHYR FATAL ERROR 4: Kernel panic
  [00:00:00.000,000] <err> os: Current thread: 0x20000414 (main)
  [00:00:00.000,000] <err> os: Halting system

__ASSERT_EVAL()
---------------

The ``__ASSERT_EVAL()`` macro can also be used inside kernel and application
code, with special semantics for the evaluation of its arguments.

It makes use of the ``__ASSERT()`` macro, but has some extra flexibility. It
allows the developer to specify different actions depending on whether the
``__ASSERT()`` macro is enabled or not. This can be particularly useful to
prevent the compiler from generating diagnostics (errors, warnings or remarks)
about variables that are assigned a value and used only by ``__ASSERT()``,
and are therefore unused when the ``__ASSERT()`` macro is disabled.

Consider the following example:

.. code-block:: c

  int x;
  x = foo();
  __ASSERT(x != 0, "foo() returned zero!");

If ``__ASSERT()`` is disabled, then ``x`` is assigned a value, but never used.
This type of situation can be resolved using the ``__ASSERT_EVAL()`` macro.

.. code-block:: c

  __ASSERT_EVAL((void) foo(),
                int x = foo(),
                x != 0,
                "foo() returned zero!");

The first parameter tells ``__ASSERT_EVAL()`` what to do if ``__ASSERT()`` is
disabled. The second parameter tells ``__ASSERT_EVAL()`` what to do if
``__ASSERT()`` is enabled. The third and fourth parameters are the parameters
it passes to ``__ASSERT()``.
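
To make the role of each parameter concrete, the following is a minimal
host-side model of the pattern in plain C. It is a sketch, not the Zephyr
implementation: ``SKETCH_ASSERT_ON``, ``SKETCH_ASSERT()``,
``SKETCH_ASSERT_EVAL()`` and ``sketch_failed_asserts`` are hypothetical names
introduced for illustration, and ``foo()`` is a stand-in that counts its calls.
Unlike a real failed assertion, the model only prints and counts failures.

.. code-block:: c

  #include <stdio.h>

  /* Hypothetical switch standing in for __ASSERT_ON; 0 disables the checks. */
  #ifndef SKETCH_ASSERT_ON
  #define SKETCH_ASSERT_ON 1
  #endif

  /* Number of failed checks; this model only counts and prints failures. */
  static int sketch_failed_asserts;

  #if SKETCH_ASSERT_ON
  #define SKETCH_ASSERT(test, msg) \
      do { \
          if (!(test)) { \
              printf("ASSERTION FAIL [%s] @ %s:%d\n\t%s\n", \
                     #test, __FILE__, __LINE__, (msg)); \
              sketch_failed_asserts++; \
          } \
      } while (0)
  /* Enabled: run expr_on, then check the test. expr_off is discarded. */
  #define SKETCH_ASSERT_EVAL(expr_off, expr_on, test, msg) \
      do { \
          expr_on; \
          SKETCH_ASSERT(test, msg); \
      } while (0)
  #else
  #define SKETCH_ASSERT(test, msg) do { } while (0)
  /* Disabled: only expr_off remains, so no unused variable is declared. */
  #define SKETCH_ASSERT_EVAL(expr_off, expr_on, test, msg) expr_off
  #endif

  static int foo_calls;

  /* Stand-in for a function whose result is only needed by the check. */
  static int foo(void)
  {
      foo_calls++;
      return 7;
  }

  /* foo() runs exactly once whether or not the checks are compiled in. */
  int call_foo_checked(void)
  {
      SKETCH_ASSERT_EVAL((void)foo(),
                         int x = foo(),
                         x != 0,
                         "foo() returned zero!");
      return foo_calls;
  }

Building the model with ``-DSKETCH_ASSERT_ON=0`` leaves only the
``(void)foo()`` call behind, so the side effect of calling ``foo()`` is
preserved while ``x`` is never declared.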

__ASSERT_NO_MSG()
-----------------

The ``__ASSERT_NO_MSG()`` macro can be used to perform an assertion that
reports the failed test and its location, but omits the additional debugging
message that would help the user diagnose the problem; its use is
discouraged.

Build Assertions
================

Zephyr provides the ``BUILD_ASSERT()`` macro for performing build-time
assertion checks. These checks are evaluated completely at compile time, and
are always checked.

BUILD_ASSERT()
--------------

This has the same semantics as C's ``_Static_assert`` or C++'s
``static_assert``. If the evaluation fails, a build error will be generated by
the compiler. If the compiler supports it, the provided message will be printed
to provide further context.

Unlike ``__ASSERT()``, the message must be a static string, without
:c:func:`printf()`-like format codes or extra arguments.

For example, suppose this check fails:

.. code-block:: c

  BUILD_ASSERT(FOO == 2000, "Invalid value of FOO");

With GCC, the output resembles:

.. code-block:: none

  tests/kernel/fatal/src/main.c: In function 'test_main':
  include/toolchain/gcc.h:28:37: error: static assertion failed: "Invalid value of FOO"
   #define BUILD_ASSERT(EXPR, MSG) _Static_assert(EXPR, "" MSG)
                                   ^~~~~~~~~~~~~~
  tests/kernel/fatal/src/main.c:370:2: note: in expansion of macro 'BUILD_ASSERT'
    BUILD_ASSERT(FOO == 2000,
    ^~~~~~~~~~~~~~~~
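
A common use is to pin down layout or configuration assumptions next to the
code that relies on them. The sketch below is illustrative only:
``struct sketch_packet_header`` and ``sketch_header_size()`` are hypothetical,
and the fallback definition of ``BUILD_ASSERT()``, copied from the macro
expansion shown in the GCC output above, is there only so the fragment also
compiles outside a Zephyr build. In a Zephyr application the macro would
normally come from the toolchain headers instead.

.. code-block:: c

  #include <stdint.h>

  /* Fallback for compiling this fragment outside Zephyr (assumption). */
  #ifndef BUILD_ASSERT
  #define BUILD_ASSERT(EXPR, MSG) _Static_assert(EXPR, "" MSG)
  #endif

  /* Hypothetical wire-format header that must stay exactly 4 bytes long. */
  struct sketch_packet_header {
      uint8_t type;
      uint8_t flags;
      uint16_t length;
  };

  /* Fails the build, not the running system, if the layout ever changes. */
  BUILD_ASSERT(sizeof(struct sketch_packet_header) == 4,
               "packet header must be exactly 4 bytes");

  /* Returns the size that the build-time check has already guaranteed. */
  unsigned int sketch_header_size(void)
  {
      return (unsigned int)sizeof(struct sketch_packet_header);
  }

Because the check is resolved by the compiler, it costs nothing at runtime and
cannot be disabled by the assertion level.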

Kernel Oops
===========

A kernel oops is a software-triggered fatal error invoked by
:c:func:`k_oops()`. This should be used to indicate an unrecoverable condition
in application logic.

The fatal error reason code generated will be ``K_ERR_KERNEL_OOPS``.

Kernel Panic
============

A kernel panic is a software-triggered fatal error invoked by
:c:func:`k_panic()`. This should be used to indicate that the Zephyr kernel is
in an unrecoverable state. Implementations of
:c:func:`k_sys_fatal_error_handler()` should not return if the kernel
encounters a panic condition, as the entire system needs to be reset.

Threads running in user mode are not permitted to invoke :c:func:`k_panic()`,
and doing so will generate a kernel oops instead. Otherwise, the fatal error
reason code generated will be ``K_ERR_KERNEL_PANIC``.

Exceptions
**********

Spurious Interrupts
===================

If the CPU receives a hardware interrupt on an interrupt line that has not had
a handler installed with ``IRQ_CONNECT()`` or :c:func:`irq_connect_dynamic()`,
then the kernel will generate a fatal error with the reason code
``K_ERR_SPURIOUS_IRQ``.

Stack Overflows
===============

In the event that a thread pushes more data onto its execution stack than its
stack buffer provides, the kernel may be able to detect this situation and
generate a fatal error with a reason code of ``K_ERR_STACK_CHK_FAIL``.

If a thread is running in user mode, then stack overflows are always caught,
as the thread will simply not have permission to write to adjacent memory
addresses outside of the stack buffer. Because this is enforced by the
memory protection hardware, there is no risk of data corruption to memory
that the thread would not otherwise be able to write to.

If a thread is running in supervisor mode, or if :kconfig:option:`CONFIG_USERSPACE` is
not enabled, stack overflows may or may not be caught, depending on
configuration. :kconfig:option:`CONFIG_HW_STACK_PROTECTION` is supported on some
architectures and will catch stack overflows in supervisor mode, including
when handling a system call on behalf of a user thread. Typically this is
implemented via dedicated CPU features, or read-only MMU/MPU guard regions
placed immediately adjacent to the stack buffer. This mechanism detects the
overflow but cannot guarantee against data corruption, so such an overflow
should be treated as a very serious condition impacting the health of the
entire system.

On platforms that lack memory management hardware support,
:kconfig:option:`CONFIG_STACK_SENTINEL` provides a software-only stack overflow
detection feature which periodically checks whether a sentinel value at the end
of the stack buffer has been corrupted. It does not require hardware support,
but provides no protection against data corruption. Since the checks are
typically done at interrupt exit, the overflow may be detected a nontrivial
amount of time after the stack actually overflowed.
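
The sentinel mechanism can be modeled in a few lines of plain C. The sketch
below is a host-side illustration only, not the kernel's implementation: the
``sketch_*`` names, the buffer size, and the sentinel value are all
assumptions, and the model assumes a stack that grows toward lower addresses,
so that ``words[0]`` is the last word an overflowing thread would reach.

.. code-block:: c

  #include <stdbool.h>
  #include <stddef.h>
  #include <stdint.h>

  #define SKETCH_STACK_WORDS 64
  #define SKETCH_SENTINEL 0xF0F0F0F0u /* arbitrary, unlikely data pattern */

  /* Model of a stack buffer that grows downward toward words[0]. */
  struct sketch_stack {
      uint32_t words[SKETCH_STACK_WORDS];
  };

  /* Plant the sentinel at the far end of the buffer before first use. */
  void sketch_stack_init(struct sketch_stack *stack)
  {
      for (size_t i = 0; i < SKETCH_STACK_WORDS; i++) {
          stack->words[i] = 0;
      }
      stack->words[0] = SKETCH_SENTINEL;
  }

  /*
   * Periodic check, for example at interrupt exit. A changed sentinel means
   * the thread wrote to the end of its stack at some earlier point; memory
   * beyond the buffer may already have been corrupted by then.
   */
  bool sketch_stack_overflowed(const struct sketch_stack *stack)
  {
      return stack->words[0] != SKETCH_SENTINEL;
  }

The check compares a single word after the fact, which is why this style of
detection can report an overflow late and cannot undo any damage already done.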

Finally, Zephyr supports GCC compiler stack canaries via
:kconfig:option:`CONFIG_STACK_CANARIES`. If enabled, the compiler will insert a canary
value randomly generated at boot into function stack frames, checking that the
canary has not been overwritten at function exit. If the check fails, the
compiler invokes :c:func:`__stack_chk_fail()`, whose Zephyr implementation
invokes a fatal stack overflow error. An error in this case does not indicate
that the entire stack buffer has overflowed, but instead that the current
function stack frame has been corrupted. See the compiler documentation for
more details.

Other Exceptions
================

Any other type of unhandled CPU exception will generate a fatal error with a
reason code of ``K_ERR_CPU_EXCEPTION``.

Fatal Error Handling
********************

The policy for what to do when encountering a fatal error is determined by the
implementation of the :c:func:`k_sys_fatal_error_handler()` function. This
function has a default implementation with weak linkage that calls
``LOG_PANIC()`` to dump all pending logging messages and then unconditionally
halts the system with :c:func:`k_fatal_halt()`.

Applications are free to implement their own error handling policy by
overriding the implementation of :c:func:`k_sys_fatal_error_handler()`.
If the implementation returns, the faulting thread will be aborted and
the system will otherwise continue to function. See the documentation for
this function for additional details and constraints.
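
As an illustration of such a policy, the sketch below lets a kernel oops abort
only the faulting thread and halts on every other reason. It is a minimal
sketch under stated assumptions, not a recommended policy:
``app_fault_is_thread_local()`` is a hypothetical helper; the handler
signature using ``struct arch_esf`` is assumed from recent Zephyr releases
(older releases declare the second parameter as ``const z_arch_esf_t *``);
``__ZEPHYR__`` is assumed to be defined by the Zephyr build system; and the
``#else`` branch holds host-side stand-ins for the reason codes so that the
helper can be compiled and tested outside a Zephyr build.

.. code-block:: c

  #include <stdbool.h>

  #ifdef __ZEPHYR__
  #include <zephyr/kernel.h>
  #include <zephyr/fatal.h>
  #else
  /* Host-side stand-ins for the reason codes named in this document. */
  enum k_fatal_error_reason {
      K_ERR_CPU_EXCEPTION,
      K_ERR_SPURIOUS_IRQ,
      K_ERR_STACK_CHK_FAIL,
      K_ERR_KERNEL_OOPS,
      K_ERR_KERNEL_PANIC,
  };
  #endif

  /* Hypothetical policy: only an oops is treated as confined to one thread. */
  bool app_fault_is_thread_local(unsigned int reason)
  {
      return reason == K_ERR_KERNEL_OOPS;
  }

  #ifdef __ZEPHYR__
  /* Overrides the weak default; the signature is an assumption (see above). */
  void k_sys_fatal_error_handler(unsigned int reason, const struct arch_esf *esf)
  {
      (void)esf;

      if (app_fault_is_thread_local(reason)) {
          /* Returning aborts the faulting thread; the system keeps running. */
          return;
      }

      /* This sketch halts on everything else, including kernel panics. */
      k_fatal_halt(reason);
  }
  #endif

Keeping the decision in a small pure function makes the policy easy to unit
test on a host, while the handler itself stays a thin wrapper around it.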

API Reference
*************

.. doxygengroup:: fatal_apis
