As with other elements of the architecture, paged virtual memory
management on Xtensa is somewhat unique, and there is similarly little
introductory material available.  This document is an attempt to
introduce the architecture at an overview/tutorial level, and to
describe Zephyr's specific implementation choices.

The Xtensa MMU operates on top of a fairly conventional TLB cache.
The TLB stores virtual to physical translation for individual pages of
memory.  It is partitioned into an automatically managed set of
refillable 4k-page entries, plus a handful of "special" ways under
direct OS control: some hold mappings for larger page sizes, and a few
exist for bootstrap and initialization, as discussed below.

Like the L1 cache, the TLB is split into separate instruction and data
entries.  Zephyr manages both as needed, but symmetrically.  The
architecture technically supports separate page tables for instruction
and data spaces, but the hardware page table refill mechanism (see
below) does not, so Zephyr keeps them identical.

The TLB may be loaded with permissions and attributes controlling
cacheability, access control based on ring (i.e. the contents of the
RING field of the PS register), and toggleable write and execute
access.

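For concreteness, here is a sketch of the PTE fields those attributes
live in, assuming the conventional Xtensa MMU layout (physical page
number in the top 20 bits, ring in bits 5:4, cache attribute in bits
3:0); the exact encodings for a specific core configuration should be
checked against its ISA reference:

```c
#include <stdint.h>

/* Assumed PTE layout; verify against the core's ISA reference. */
#define PTE_PPN_MASK   0xfffff000u  /* physical page number, bits 31:12 */
#define PTE_SW_MASK    0x00000fc0u  /* software-defined bits, 11:6 */
#define PTE_RING_SHIFT 4            /* ring (privilege) field, bits 5:4 */
#define PTE_CA_MASK    0x0000000fu  /* cache attribute, bits 3:0 */

static inline uint32_t pte_assemble(uint32_t paddr, uint32_t ring,
				    uint32_t ca)
{
	return (paddr & PTE_PPN_MASK) | (ring << PTE_RING_SHIFT) |
	       (ca & PTE_CA_MASK);
}
```
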
Entries also carry an 8-bit "ASID" (address space ID) tag, derived
from the ring field of the PTE that loaded them via a simple
translation specified in the RASID special register.  The intent is
that each address space gets a distinct ASID tagging its entries, such
that you can switch between them without a TLB flush.  The ASID value
of zero is reserved and never matches (it effectively marks an entry
invalid), and ring zero (kernel) mappings are always tagged with
ASID 1.

Xtensa has a unique (and, to someone exposed for the first time,
extremely confusing) "page table" format.  The simplest way to begin
explaining it is to describe the (quite simple) hardware refill
behavior:

On a TLB miss, the hardware immediately does a single fetch (at ring 0
privilege) from RAM by adding the "desired address right shifted by
10 bits with the bottom two bits set to zero" (i.e. the page frame
number in units of 4 bytes) to the value in the PTEVADDR special
register.  If this load succeeds, then the word is treated as a PTE
with which to fill the TLB and use for a (restarted) memory access.
This is dead simple in hardware (a memory load is one thing the
hardware can already do), and quite fast (only one memory fetch
vs. e.g. the 2-5 fetches required to walk a page table on most other
architectures).

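Expressed as arithmetic (purely illustrative; the hardware does this
itself on every miss):

```c
#include <stdint.h>

/* The address the hardware fetches from on a TLB miss for "vaddr":
 * PTEVADDR plus the 4k page number scaled by the 4-byte PTE size. */
static inline uint32_t pte_fetch_addr(uint32_t ptevaddr, uint32_t vaddr)
{
	return ptevaddr + ((vaddr >> 10) & ~3u); /* == ptevaddr + 4 * (vaddr >> 12) */
}
```
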
But that PTE fetch from PTEVADDR is itself a virtual memory access,
meaning it too uses the TLB to translate from a virtual to a physical
address.  Which means that the page tables occupy a 4M region of
virtual, not physical, address space, in the same memory space
occupied by the running code.  The 1024 pages in that range (not all
of which need be backed by physical memory) are the page table pages,
each holding 1024 4-byte PTEs that map 4M of address space.  One page
among them is special: it contains the 1024 PTE entries for the 4M
page table region itself (the region pointed to by PTEVADDR).

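That self-referential page (the "L1" page, as the text below calls it)
sits at a computable spot inside the region; a quick sketch of the
arithmetic:

```c
#include <stdint.h>

/* Apply the refill rule to PTEVADDR itself: this is the page holding
 * the PTEs that map the 4M page table region.  Assumes a 4M-aligned
 * PTEVADDR. */
static inline uint32_t l1_page_vaddr(uint32_t ptevaddr)
{
	return ptevaddr + (ptevaddr >> 10);
}
```
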
Obviously, the page table memory being virtual means that the fetch of
a PTE during refill can itself miss in the TLB.  A table covering a
full 4G address space needs 1024 pages of PTEs, and the ~16 entry TLB
clearly won't contain entries mapping all of them.  If we are missing
a TLB entry for the page translation we want (NOT for the original
requested address; we already know we're missing that TLB entry), the
hardware has exactly one recourse: it abandons the refill and raises a
TLB miss exception for the OS to handle.

The job of that exception handler is simply to ensure that the TLB has
an entry for the page table page we want.  And the simplest way to do
that is to just load from the faulting PTE's address, which will then
go through the same refill process described above.  This second TLB
fetch in the exception handler may result in an invalid/inapplicable
mapping within the 4M page table region.  This is a typical/expected
runtime fault, and simply indicates unmapped memory.  The result is a
TLB miss exception from within the TLB miss exception handler (i.e.
while the EXCM bit is set, which vectors to Xtensa's separate "double
exception" handler), and is handled by the OS identically to a general
Kernel/User data access failure, i.e. a page fault.

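The core idea of the handler, in pseudo-C (the real handler is a few
instructions of assembly; the address arithmetic repeats the
pte_fetch_addr() sketch above, and reading the faulting address from
EXCVADDR is standard Xtensa exception handling):

```c
#include <stdint.h>

/* Touch the faulting PTE's own address so the hardware refill loads a
 * mapping for the page table page containing it. */
void tlb_miss_handler(void)
{
	uint32_t vaddr, ptevaddr;

	__asm__ volatile("rsr %0, excvaddr" : "=r"(vaddr));
	__asm__ volatile("rsr %0, ptevaddr" : "=r"(ptevaddr));

	/* This load triggers (and satisfies) the nested refill */
	(void)*(volatile uint32_t *)(ptevaddr + ((vaddr >> 10) & ~3u));
}
```
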
After the TLB refill exception returns, the original faulting
instruction is restarted, which retries the refill process, which now
succeeds in fetching a new TLB entry, which is then used to service
the original memory access.  (Unless, of course, it turns out that the
TLB entry doesn't permit the access requested, in which case a
protection fault is raised for the OS to handle as an access error.)

The page-tables-specified-in-virtual-memory trick works very well in
practice.  But it does have a chicken/egg problem with the initial
state.  Because everything depends on state in the TLB, something
needs to tell the hardware how to find a physical address using the
TLB to begin the process.  Here we exploit the separate instruction
and data TLBs.

First, note that the refill process to load a PTE requires that the 4M
space of PTE entries be resolvable by the TLB directly, without a
nested refill.  Any such lookup resolves through the one top-level
page of PTE entries (which itself lives in the 4M page table
region!).  This page must always be in the TLB.

Thankfully, for the data TLB Xtensa provides 3 special/non-refillable
ways, each able to hold a single 4k mapping under OS control.  We use
one of these to "pin" the top-level page table entry in place,
guaranteeing that data-side refills can always complete.

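Pinning is done by writing the entry directly with the wdtlb
instruction.  A hedged sketch (the way numbering and the exact entry
encoding are configuration-dependent assumptions; check the core's ISA
reference):

```c
#include <stdint.h>

/* Write a 4k mapping into a wired DTLB way.  "pte" is a PTE value as
 * sketched earlier; "entry" packs the virtual page and the way index. */
static inline void dtlb_pin(uint32_t vaddr, uint32_t pte, uint32_t way)
{
	uint32_t entry = (vaddr & ~0xfffu) | way;

	__asm__ volatile("wdtlb %0, %1\n\tdsync" :: "r"(pte), "r"(entry));
}
```
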
But now note that the load from that PTE address during refill is done
by an exception handler, and executing that handler itself requires an
instruction fetch via the instruction TLB.  And that obviously means
that the page(s) containing the exception handler must never fail to
resolve in the ITLB.

Ideally we would just pin the vector/handler page in the ITLB the same
way we pin the L1 page table page in the DTLB.  But Xtensa does not
provide 4k "pinnable" ways in the instruction TLB (frankly, this seems
like an oversight in the architecture).

Instead, we load ITLB entries for the vector handlers via the refill
mechanism (whose PTE fetch goes through the data TLB), and so need the
refill mechanism for the vector page to succeed always.  The way to do
this is to similarly pin the page table page containing the (single)
PTE for the vector page in the data TLB, such that instruction fetches
always find the refill data they need.

Finally, note that the Xtensa MMU architecture defines no "disabled"
state for the MMU.  Virtual address translation through the TLB is
active at all times.  There therefore needs to be a mechanism for the
CPU to execute code before the OS is able to initialize a refillable
page table.

The way Xtensa resolves this (on the hardware Zephyr supports; see the
note below about other variants) is that way 6 of each TLB powers up
holding an identity mapping of all of memory: eight 512M entries
covering the whole 4G address space, tagged with ASID 1
(i.e. the fixed ring zero / kernel ASID), writable, executable, and
uncached.  So at boot the CPU relies on these TLB entries to provide a
consistent view of memory.

Those entries must eventually be replaced by real mappings, with some
care, as the CPU will throw an exception ("multi hit") if a memory
access matches more than one live entry in the TLB.  The
initialization sequence is thus (a compressed code sketch follows the
list):

0. Start with a fully-initialized page table layout, including the
   top-level "L1" page containing the mappings for the page table
   pages themselves.

1. Ensure that the initialization routine does not cross a page
   boundary (to avoid a mid-sequence fetch from an unmapped page) and
   lives in a separate 4k page from the exception vectors (which we
   must be able to map separately).

2. Pin the L1 page table PTE into the data TLB.  This creates a double
   mapping with the still-live way 6 entry covering the same address,
   which must not be touched until that way 6 entry is disabled.

3. Pin the page table page containing the PTE for the TLB miss
   exception handler into the data TLB.  This will likewise not be
   accessed until the double map condition is resolved.

4. Set PTEVADDR appropriately.  The CPU state needed to handle refill
   exceptions is now complete, but cannot be used until we resolve the
   double mappings created above.

5. Disable the initial/way6 data TLB entries first, by setting them to
   an ASID of zero.  This is safe as the code being executed is not
   doing data accesses yet (including refills), and it resolves the
   data-side double mappings.

6. Disable the initial/way6 instruction TLB entries second.  The very
   next instruction fetch following the invalidation of the
   currently-executing page will then miss, and the refill proceeds
   normally because we just resolved the final double-map condition.
   (Pedantic note: if the vector page and the currently-executing page
   are in different 512M way6 pages, disable the mapping for the
   exception handlers first so the trap from our current code can be
   serviced.)

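Below is a compressed, illustrative rendering of steps 2-6.  It is a
sketch only: the real sequence is hand-written assembly, dtlb_pin() is
the hypothetical helper from earlier, the way numbers and entry
encodings are assumptions, and it uses the idtlb/iitlb invalidate
instructions as a stand-in for the "write an ASID of zero" disabling
described above:

```c
#include <stdint.h>

#define WAY6 6u

static inline void wsr_ptevaddr(uint32_t v)
{
	__asm__ volatile("wsr %0, ptevaddr\n\tisync" :: "r"(v));
}

static inline void idtlb(uint32_t entry) /* drop one DTLB entry */
{
	__asm__ volatile("idtlb %0\n\tdsync" :: "r"(entry));
}

static inline void iitlb(uint32_t entry) /* drop one ITLB entry */
{
	__asm__ volatile("iitlb %0\n\tisync" :: "r"(entry));
}

void mmu_bootstrap(uint32_t ptevaddr, uint32_t l1_pte,
		   uint32_t vec_pt_page, uint32_t vec_pt_pte)
{
	/* Step 2: pin the L1 page table page (double-mapped until step 5) */
	dtlb_pin(ptevaddr + (ptevaddr >> 10), l1_pte, 7);

	/* Step 3: pin the page table page holding the vectors' PTE */
	dtlb_pin(vec_pt_page, vec_pt_pte, 8);

	/* Step 4: refill state is now complete, but not yet usable */
	wsr_ptevaddr(ptevaddr);

	/* Steps 5/6: retire the eight 512M identity entries, data side
	 * first; the instruction side goes last so this code can keep
	 * fetching until the final entry disappears. */
	for (uint32_t i = 0; i < 8; i++) {
		idtlb((i << 29) | WAY6);
	}
	for (uint32_t i = 0; i < 8; i++) {
		iitlb((i << 29) | WAY6);
	}
}
```
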
Note: there is a different variant of the Xtensa MMU architecture
where the way 5/6 pages are immutable, and specify a set of
unchangeable mappings from the final 384M of virtual memory to the
bottom and top of physical memory.  The intent here would (presumably)
be that these would be used by the kernel for all physical memory, and
that the remaining virtual space would be left for user mappings.
Zephyr does not support this variant.

The ASID mechanism in Xtensa works as it does on other architectures,
and is intended to be used similarly.  The intent of the design is
that at context switch time, you can simply change RASID and the page
table data, and leave any existing mappings in place in the TLB under
the old ASID value(s).  So in the common case where you switch back,
nothing needs to be flushed or refilled.

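A sketch of that intended switch path, assuming the conventional RASID
byte layout (one ASID per ring, ring 0 in the low byte; verify against
the ISA reference):

```c
#include <stdint.h>

/* Load a new address space: kernel (ring 0) stays at ASID 1, and the
 * remaining rings get the incoming space's ASID. */
static inline void switch_asid(uint8_t asid)
{
	uint32_t rasid = 1u | ((uint32_t)asid << 8) |
			 ((uint32_t)asid << 16) | ((uint32_t)asid << 24);

	__asm__ volatile("wsr %0, rasid\n\tisync" :: "r"(rasid));
}
```
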
Unfortunately this runs afoul of the virtual mapping of the page table
refill: data TLB entries storing the 4M page table mapping space are
stored at ASID 1 (ring 0), so they can't change when the page tables
change!  Clearing them out on every context switch is tantamount to
flushing the entire TLB regardless (the TLB is much smaller than the
1024-page PTE array).

The resolution in Zephyr is to give each ASID its own PTEVADDR mapping
in virtual space, such that the page tables don't overlap.  This is
done by devoting a separate 4M slot to each of the 256 ASIDs (actually
254, as ASIDs 0 and 1 are never used for user mappings), and we save a
bit of work by deriving a unique sequential ASID from the hardware
address of the statically allocated array of L1 page table pages.

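The arithmetic: 256 ASIDs at 4M apiece is 1G of virtual space.  A
sketch, where the base address and the table-array name are
placeholder assumptions:

```c
#include <stdint.h>

#define PT_REGION_BASE 0x40000000u /* hypothetical, suitably aligned */

/* One 4M page-table space per ASID */
static inline uint32_t ptevaddr_for_asid(uint8_t asid)
{
	return PT_REGION_BASE + ((uint32_t)asid << 22);
}

/* Hypothetical static array of L1 tables, one per address space; a
 * table's index yields its ASID, skipping the reserved values 0/1. */
extern uint32_t l1_tables[][1024];

static inline uint8_t asid_for_table(uint32_t (*l1)[1024])
{
	return (uint8_t)(2 + (l1 - &l1_tables[0]));
}
```
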
Note, obviously, that any change to the mappings within a live ASID
still requires that stale TLB entries under that ASID be invalidated.

A final important note is that the hardware PTE refill fetch works
like any other data load, honoring the cacheability attributes of the
TLB entry through which it was loaded.  This means that if the page
table entries are marked cacheable, then the hardware TLB refill
process will be downstream of the L1 data cache on the CPU.  If the
physical memory storing page tables has been accessed recently by the
CPU (for a refill of another page mapped within the same cache line,
or to change the tables), then the refill will be served from the data
cache and not main memory.

In some sense this is a performance benefit, as it lets the L1 data
cache act as an "L2 TLB" for applications with a lot of access
variability.  But it also means that the TLB entries end up being
stored twice in the same CPU, wasting transistors that could
presumably be put to better use elsewhere.

But it is also important to note that the L1 data cache on Xtensa is
incoherent!  The cache being used for refill reflects the last access
made on the current CPU only, not the current state of the underlying
memory being mapped.  Page table changes sitting in the data cache of
one CPU will be invisible to the data cache of another, and there is
no simple way of keeping them in sync short of explicit flushing.

The result is that, when SMP is enabled, Zephyr must ensure that all
page table mappings in the system are set uncached.  The OS makes no
attempt to keep cached page table data coherent across cores.
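
Tying this back to the PTE sketch at the top: the cache attribute
lives in the PTE's low bits, so forcing a mapping uncached is a simple
field rewrite.  The "bypass" encoding value below is an assumption to
be checked against the core's configuration:

```c
#include <stdint.h>

#define PTE_CA_MASK   0x0000000fu /* cache attribute field, per earlier sketch */
#define PTE_CA_BYPASS 0x2u        /* assumed encoding for uncached/bypass */

static inline uint32_t pte_make_uncached(uint32_t pte)
{
	return (pte & ~PTE_CA_MASK) | PTE_CA_BYPASS;
}
```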