.. _cache_guide:

Caching Basics
##############

This section discusses the basics of cache coherency and under what situations a
user needs to explicitly deal with caching. For more detailed info on Zephyr's
caching tools, see :ref:`cache_config` for Zephyr Kconfig options or
:ref:`cache_api` for the API reference. This section primarily focuses on the
data cache though there is typically also an instruction cache for systems with
cache support.

.. note::

   The information here assumes that the architecture-specific MPU support is
   enabled. See the architecture-specific documentation for details.

.. note::

   While cache coherence can be a concern for data shared between SMP cores, Zephyr
   in general ensures that memory will be seen in a coherent state from multiple
   cores. Most applications will only need to use the cache APIs for interaction
   with external hardware like DMA controllers or foreign CPUs running a
   different OS image. For more information on cache coherence between SMP cores,
   see :kconfig:option:`CONFIG_KERNEL_COHERENCE`.

When dealing with memory shared between a processor core and other bus masters,
cache coherency needs to be considered. Typically processor caches exist as
close to each processor core as possible to maximize performance gain. Because
of this, data moved into and out of memory by DMA engines will be stale in the
processor's cache, resulting in what appears to be corrupt data. If you are
moving data using DMA and the processor doesn't see the data you expect, cache
coherency may be the issue.

There are multiple approaches to ensuring that the data seen by the processor
core and peripherals is coherent. The simplest is just to disable caching, but
this defeats the purpose of having a hardware cache in the first place and
results in a significant performance hit. Many architectures provide methods for
disabling caching for only a portion of memory.
This can be useful when cache
coherence is more important than performance, such as when using DMA with SPI.
Finally, there is the option to flush or invalidate the cache for regions of
memory at runtime.

Globally Disabling the Data Cache
---------------------------------

As mentioned above, globally disabling data caching can have a significant
performance impact but can be useful for debugging.

Requirements:

* :kconfig:option:`CONFIG_DCACHE`: DCACHE control enabled in Zephyr.

* :kconfig:option:`CONFIG_CACHE_MANAGEMENT`: cache API enabled.

* Call :c:func:`sys_cache_data_disable()` to globally disable the data cache.

Disabling Caching for a Memory Region
-------------------------------------

Disabling caching for only a portion of memory can be a good performance
compromise if performance on the uncached memory is not critical to the
application. This is a good option if the application requires many small
unrelated buffers that are smaller than a cache line.

Requirements:

* :kconfig:option:`CONFIG_DCACHE`: DCACHE control enabled in Zephyr.

* :kconfig:option:`CONFIG_MEM_ATTR`: enable the ``mem-attr`` library for
  handling memory attributes in the device tree.

* Annotate your device tree according to :ref:`mem_mgmt_api`.

Assuming the MPU driver is enabled, it will configure the specified regions
according to the memory attributes specified during kernel initialization. When
using a dedicated uncached region of memory, the linker needs to be instructed
to place buffers into that region. This can be accomplished by specifying the
memory region explicitly using ``Z_GENERIC_SECTION``:

.. code-block:: c

   /* SRAM4 marked as uncached in device tree */
   uint8_t buffer[BUF_SIZE] Z_GENERIC_SECTION("SRAM4");

.. note::

   Configuring a distinct memory region with separate caching rules requires the
   use of an MPU region which may be a limited resource on some architectures.
   MPU regions may be needed by other memory protection features such as
   :ref:`userspace <mpu_userspace>`, :ref:`stack protection <mpu_stack_objects>`,
   or :ref:`memory domains <memory_domain>`.

Automatically Disabling Caching by Variable
-------------------------------------------

Zephyr has the ability to automatically define an uncached region in memory and
allocate variables to it using ``__nocache``. Any variables marked with this
attribute will be placed in a special ``nocache`` linker region in memory. This
region will be configured as uncached by the MPU driver during initialization.
This is a simpler option than explicitly declaring a region of memory uncached
but provides less control over the placement of these variables, as the linker
may allocate this region anywhere in RAM.

Requirements:

* :kconfig:option:`CONFIG_DCACHE`: DCACHE control enabled in Zephyr.

* :kconfig:option:`CONFIG_NOCACHE_MEMORY`: enable allocation of the ``nocache``
  linker region and configure it as uncached.

* Add the ``__nocache`` attribute at the end of any uncached buffer definition:

.. code-block:: c

   uint8_t buffer[BUF_SIZE] __nocache;

.. note::

   See note above regarding possible limitations on MPU regions. The ``nocache``
   region is still a distinct MPU region even though it is automatically created
   by Zephyr instead of being explicitly defined by the user.

Runtime Cache Control
---------------------

The most performant but most complex option is to control data caching at
runtime. The two most relevant cache operations in this case are **flushing**
and **invalidating**. Both of these operations operate on the smallest unit of
cacheable memory, the cache line.
Data cache lines are typically 16 to 128
bytes. See :kconfig:option:`CONFIG_DCACHE_LINE_SIZE`. Cache line sizes are
typically fixed in hardware and not configurable, but Zephyr does need to know
the size of cache lines in order to correctly and efficiently manage the cache.
If the buffers in question are smaller than the data cache line size, it may be
more efficient to place them in an uncached region, as unrelated data packed
into the same cache line may be destroyed when invalidating.

Flushing the cache involves writing all modified cache lines in a specified
region back to shared memory. Flush the cache associated with a buffer after the
processor has written to it and before a remote bus master reads from that
region.

.. note::

   Some architectures support a cache configuration called **write-through**
   caching in which data writes from the processor core propagate through to
   shared memory. While this solves the cache coherence problem for CPU writes,
   it also results in more traffic to main memory which may result in performance
   degradation.

Invalidating the cache works similarly but in the other direction. It marks
cache lines in the specified region as stale, ensuring that the cache line will
be refreshed from main memory when the processor next reads from the specified
region. Invalidate the data cache of a buffer that a peripheral has written to
before reading from that region.

In some cases, the same buffer may be reused for e.g. DMA reads and DMA writes.
In that case it is possible to first flush the cache associated with a buffer
and then invalidate it, ensuring that the cache will be refreshed the next time
the processor reads from the buffer.

Requirements:

* :kconfig:option:`CONFIG_DCACHE`: DCACHE control enabled in Zephyr.

* :kconfig:option:`CONFIG_CACHE_MANAGEMENT`: cache API enabled.

* Call :c:func:`sys_cache_data_flush_range()` to flush a memory region.

* Call :c:func:`sys_cache_data_invd_range()` to invalidate a memory region.

* Call :c:func:`sys_cache_data_flush_and_invd_range()` to flush and invalidate.

Alignment
---------

As mentioned in :c:func:`sys_cache_data_invd_range()` and associated functions,
buffers should be aligned to the cache line size. This can be accomplished by
using ``__aligned``:

.. code-block:: c

   uint8_t buffer[BUF_SIZE] __aligned(CONFIG_DCACHE_LINE_SIZE);
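
The reason for the alignment requirement can be seen with plain address
arithmetic. The following is a minimal, host-runnable sketch and not part of the
Zephyr API: ``LINE_SIZE`` is an assumed value standing in for
``CONFIG_DCACHE_LINE_SIZE``, and the helpers ``line_floor()``, ``line_ceil()``
and ``bytes_shared_with_neighbors()`` are hypothetical names introduced only for
illustration. It computes how many bytes outside a buffer share cache lines with
it, which are the bytes that a flush or invalidate on that buffer would also
touch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed line size for illustration only; in a Zephyr build this value
 * would come from CONFIG_DCACHE_LINE_SIZE.
 */
#define LINE_SIZE 32u

/* Hypothetical helper: start address of the cache line containing addr. */
static uintptr_t line_floor(uintptr_t addr)
{
	/* The mask trick below only works for power-of-two line sizes. */
	assert((LINE_SIZE & (LINE_SIZE - 1u)) == 0u);

	return addr & ~((uintptr_t)LINE_SIZE - 1u);
}

/* Hypothetical helper: first cache line boundary at or after addr. */
static uintptr_t line_ceil(uintptr_t addr)
{
	return line_floor(addr + LINE_SIZE - 1u);
}

/* Hypothetical helper: number of bytes outside [addr, addr + size) that
 * live in the same cache lines as the buffer. A result of zero means a
 * cache operation on the buffer cannot affect unrelated data.
 */
static size_t bytes_shared_with_neighbors(uintptr_t addr, size_t size)
{
	uintptr_t start = line_floor(addr);
	uintptr_t end = line_ceil(addr + size);

	return (size_t)(end - start) - size;
}
```

Under these assumptions, a buffer that starts on a line boundary and whose size
is a multiple of the line size shares no bytes with its neighbors, while a
16-byte buffer placed 4 bytes into a 32-byte line shares the remaining 16 bytes
of that line with whatever else the linker placed there.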