.. _cache_guide:

Caching Basics
##############

This section covers the basics of cache coherency and the situations in which a
user needs to deal with caching explicitly. For more detail on Zephyr's
caching tools, see :ref:`cache_config` for the Kconfig options or
:ref:`cache_api` for the API reference. This section focuses primarily on the
data cache, although systems with cache support typically also have an
instruction cache.

.. note::

  The information here assumes that the architecture-specific MPU support is
  enabled. See the architecture-specific documentation for details.

.. note::

  While cache coherence can be a concern for data shared between SMP cores, Zephyr
  in general ensures that memory will be seen in a coherent state from multiple
  cores. Most applications will only need to use the cache APIs for interaction
  with external hardware like DMA controllers or foreign CPUs running a
  different OS image. For more information on cache coherence between SMP cores,
  see :kconfig:option:`CONFIG_KERNEL_COHERENCE`.

When dealing with memory shared between a processor core and other bus masters,
cache coherency needs to be considered. Processor caches typically sit as close
to each processor core as possible to maximize the performance gain. Because of
this, data that a DMA engine writes to memory may not be reflected in the
processor's cache, and data that the processor has written may still sit in the
cache rather than in memory. Either side can then see stale data that looks
corrupt. If you are moving data using DMA and the processor or the peripheral
doesn't see the data you expect, cache coherency may be the issue.

There are multiple approaches to ensuring that the data seen by the processor
core and peripherals is coherent. The simplest is to disable caching entirely,
but this defeats the purpose of having a hardware cache in the first place and
results in a significant performance hit. Many architectures provide methods for
disabling caching for only a portion of memory. This can be useful when cache
coherence is more important than performance, such as when using DMA with SPI.
Finally, the cache can be flushed or invalidated for regions of memory at
runtime.

Globally Disabling the Data Cache
---------------------------------

As mentioned above, globally disabling data caching can have a significant
performance impact but can be useful for debugging.

Requirements:

* :kconfig:option:`CONFIG_DCACHE`: DCACHE control enabled in Zephyr.

* :kconfig:option:`CONFIG_CACHE_MANAGEMENT`: cache API enabled.

* Call :c:func:`sys_cache_data_disable()` to globally disable the data cache.

Disabling Caching for a Memory Region
-------------------------------------

Disabling caching for only a portion of memory can be a good compromise if
access speed to the uncached memory is not critical to the application. This is
a good option if the application requires many small unrelated buffers that are
smaller than a cache line.

Requirements:

* :kconfig:option:`CONFIG_DCACHE`: DCACHE control enabled in Zephyr.

* :kconfig:option:`CONFIG_MEM_ATTR`: enable the ``mem-attr`` library for
  handling memory attributes in the device tree.

* Annotate your device tree according to :ref:`mem_mgmt_api`.

Assuming the MPU driver is enabled, it configures the specified regions
according to their memory attributes during kernel initialization. When using a
dedicated uncached region of memory, the linker needs to be instructed to place
buffers into that region. This can be accomplished by specifying the memory
region explicitly using ``Z_GENERIC_SECTION``:

.. code-block:: c

  /* SRAM4 marked as uncached in device tree */
  uint8_t buffer[BUF_SIZE] Z_GENERIC_SECTION("SRAM4");

.. note::

  Configuring a distinct memory region with separate caching rules requires the
  use of an MPU region, which may be a limited resource on some architectures.
  MPU regions may also be needed by other memory protection features such as
  :ref:`userspace <mpu_userspace>`, :ref:`stack protection <mpu_stack_objects>`,
  or :ref:`memory domains <memory_domain>`.

Automatically Disabling Caching by Variable
-------------------------------------------

Zephyr can automatically define an uncached region in memory and allocate
variables to it using ``__nocache``. Any variable marked with this attribute is
placed in a special ``nocache`` linker region in memory. This region is
configured as uncached by the MPU driver during initialization. This is a
simpler option than explicitly declaring a region of memory uncached, but it
provides less control over the placement of these variables, as the linker may
allocate this region anywhere in RAM.

Requirements:

* :kconfig:option:`CONFIG_DCACHE`: DCACHE control enabled in Zephyr.

* :kconfig:option:`CONFIG_NOCACHE_MEMORY`: enable allocation of the ``nocache``
  linker region and configure it as uncached.

* Add the ``__nocache`` attribute at the end of any uncached buffer definition:

.. code-block:: c

  uint8_t buffer[BUF_SIZE] __nocache;

.. note::

  See the note above regarding possible limitations on MPU regions. The
  ``nocache`` region is still a distinct MPU region even though it is created
  automatically by Zephyr instead of being explicitly defined by the user.

Runtime Cache Control
---------------------

The most performant but most complex option is to control data caching at
runtime. The two most relevant cache operations in this case are **flushing**
and **invalidating**. Both work on the smallest unit of cacheable memory, the
cache line. Data cache lines are typically 16 to 128 bytes. See
:kconfig:option:`CONFIG_DCACHE_LINE_SIZE`. Cache line sizes are typically fixed
in hardware and not configurable, but Zephyr does need to know the size of
cache lines in order to manage the cache correctly and efficiently. If the
buffers in question are smaller than the data cache line size, it may be more
efficient to place them in an uncached region, as unrelated data packed into
the same cache line may be destroyed when invalidating.

Flushing the cache involves writing all modified cache lines in a specified
region back to shared memory. Flush the cache associated with a buffer after the
processor has written to it and before a remote bus master reads from that
region.
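
The ordering described above can be sketched as follows. This is a minimal
illustration rather than a complete driver: ``TX_LEN``, ``tx_buf``,
``send_frame()`` and ``start_dma_tx()`` are hypothetical names standing in for
the application's buffer and for whatever call starts a transfer on the
target's DMA controller.

.. code-block:: c

  #include <errno.h>
  #include <stddef.h>
  #include <stdint.h>
  #include <string.h>
  #include <zephyr/cache.h>
  #include <zephyr/kernel.h>

  #define TX_LEN 64 /* hypothetical; assumed to be a multiple of the line size */

  /* Hypothetical: makes the DMA controller read len bytes starting at buf. */
  extern int start_dma_tx(const uint8_t *buf, size_t len);

  static uint8_t tx_buf[TX_LEN] __aligned(CONFIG_DCACHE_LINE_SIZE);

  int send_frame(const uint8_t *payload, size_t len)
  {
      int ret;

      if (len > sizeof(tx_buf)) {
          return -EINVAL;
      }

      /* 1. The CPU fills the buffer; the new bytes may exist only in the cache. */
      memcpy(tx_buf, payload, len);

      /* 2. Write modified lines back so memory holds what the DMA will read. */
      ret = sys_cache_data_flush_range(tx_buf, sizeof(tx_buf));
      if (ret < 0) {
          return ret;
      }

      /* 3. Only now hand the buffer to the other bus master. */
      return start_dma_tx(tx_buf, len);
  }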

.. note::

  Some architectures support a cache configuration called **write-through**
  caching in which data writes from the processor core propagate through to
  shared memory. While this solves the cache coherence problem for CPU writes,
  it also results in more traffic to main memory, which may degrade
  performance.

Invalidating the cache works similarly but in the other direction. It marks
cache lines in the specified region as stale, ensuring that the cache line will
be refreshed from main memory when the processor next reads from the specified
region. After a peripheral has written to a buffer, invalidate the data cache
for that buffer before the processor reads from it.
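
A receive path following this rule might look like the sketch below. The names
``RX_LEN``, ``rx_buf``, ``receive_frame()`` and ``wait_for_dma_rx()`` are
hypothetical, and the sketch assumes that the cache holds no modified lines for
``rx_buf`` when the transfer starts.

.. code-block:: c

  #include <stddef.h>
  #include <stdint.h>
  #include <zephyr/cache.h>
  #include <zephyr/kernel.h>

  #define RX_LEN 64 /* hypothetical; assumed to be a multiple of the line size */

  /* Hypothetical: returns 0 once the DMA controller has finished writing buf. */
  extern int wait_for_dma_rx(uint8_t *buf, size_t len);

  /* Assumption: the CPU has no modified (dirty) lines cached for this buffer. */
  static uint8_t rx_buf[RX_LEN] __aligned(CONFIG_DCACHE_LINE_SIZE);

  int receive_frame(uint8_t *first_byte)
  {
      int ret;

      /* 1. The peripheral writes rx_buf in memory; the cache is not updated. */
      ret = wait_for_dma_rx(rx_buf, sizeof(rx_buf));
      if (ret < 0) {
          return ret;
      }

      /* 2. Mark cached copies stale so the next read is served from memory. */
      ret = sys_cache_data_invd_range(rx_buf, sizeof(rx_buf));
      if (ret < 0) {
          return ret;
      }

      /* 3. This read refills the line from memory and sees the DMA data. */
      *first_byte = rx_buf[0];
      return 0;
  }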

In some cases, the same buffer may be reused for both DMA reads and DMA writes.
In that case it is possible to first flush the cache associated with the buffer
and then invalidate it, ensuring that the cache will be refreshed the next time
the processor reads from the buffer.

Requirements:

* :kconfig:option:`CONFIG_DCACHE`: DCACHE control enabled in Zephyr.

* :kconfig:option:`CONFIG_CACHE_MANAGEMENT`: cache API enabled.

* Call :c:func:`sys_cache_data_flush_range()` to flush a memory region.

* Call :c:func:`sys_cache_data_invd_range()` to invalidate a memory region.

* Call :c:func:`sys_cache_data_flush_and_invd_range()` to flush and invalidate.

Alignment
---------

As noted in the documentation for :c:func:`sys_cache_data_invd_range()` and
associated functions, buffers should be aligned to the cache line size. This
can be accomplished by using ``__aligned``:

.. code-block:: c

  uint8_t buffer[BUF_SIZE] __aligned(CONFIG_DCACHE_LINE_SIZE);
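
Aligning the start of a buffer does not stop unrelated data from sharing its
last cache line when the buffer size is not a multiple of the line size. The
following host-side sketch shows the arithmetic in standard C. ``LINE_SIZE``,
``PADDED()`` and ``owns_whole_lines()`` are illustrative names, and the value 32
is an assumed line size standing in for ``CONFIG_DCACHE_LINE_SIZE``.

.. code-block:: c

  #include <stdalign.h>
  #include <stdbool.h>
  #include <stddef.h>
  #include <stdint.h>

  #define LINE_SIZE 32u /* assumed; stands in for CONFIG_DCACHE_LINE_SIZE */
  #define BUF_SIZE  100u

  /* Round n up to the next multiple of the cache line size. */
  #define PADDED(n) ((((n) + LINE_SIZE - 1u) / LINE_SIZE) * LINE_SIZE)

  /* Starts on a line boundary and is padded to fill its last line. */
  static alignas(LINE_SIZE) uint8_t buffer[PADDED(BUF_SIZE)];

  /* True if [addr, addr + size) covers only complete cache lines. */
  static bool owns_whole_lines(const void *addr, size_t size)
  {
      return ((uintptr_t)addr % LINE_SIZE == 0u) && (size % LINE_SIZE == 0u);
  }

In a Zephyr build, ``__aligned(CONFIG_DCACHE_LINE_SIZE)`` and the ``ROUND_UP()``
macro from ``<zephyr/sys/util.h>`` can play the roles of ``alignas`` and
``PADDED()`` here.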