1Latency Measurements 2#################### 3 4This benchmark measures the average latency of selected kernel capabilities, 5including: 6 7* Context switch time between preemptive threads using k_yield 8* Context switch time between cooperative threads using k_yield 9* Time to switch from ISR back to interrupted thread 10* Time from ISR to executing a different thread (rescheduled) 11* Time to signal a semaphore then test that semaphore 12* Time to signal a semaphore then test that semaphore with a context switch 13* Times to lock a mutex then unlock that mutex 14* Time it takes to create a new thread (without starting it) 15* Time it takes to start a newly created thread 16* Time it takes to suspend a thread 17* Time it takes to resume a suspended thread 18* Time it takes to abort a thread 19* Time it takes to add data to a fifo.LIFO 20* Time it takes to retrieve data from a fifo.LIFO 21* Time it takes to wait on a fifo.lifo.(and context switch) 22* Time it takes to wake and switch to a thread waiting on a fifo.LIFO 23* Time it takes to send and receive events 24* Time it takes to wait for events (and context switch) 25* Time it takes to wake and switch to a thread waiting for events 26* Time it takes to push and pop to/from a k_stack 27* Measure average time to alloc memory from heap then free that memory 28 29When userspace is enabled, this benchmark will where possible, also test the 30above capabilities using various configurations involving user threads: 31 32* Kernel thread to kernel thread 33* Kernel thread to user thread 34* User thread to kernel thread 35* User thread to user thread 36 37The default configuration builds only for the kernel. However, additional 38configurations can be enabled via the use of EXTRA_CONF_FILE. 39 40For example, the following will build this project with userspace support: 41 42 EXTRA_CONF_FILE="prj.userspace.conf" west build -p -b <board> <path to project> 43 44The following table summarizes the purposes of the different extra 45configuration files that are available to be used with this benchmark. 46A tester may mix and match them allowing them different scenarios to 47be easily compared the default. Benchmark output can be saved and subsequently 48exported to third party tools to compare and chart performance differences 49both between configurations as well as across Zephyr versions. 50 51+-----------------------------+------------------------------------+ 52| prj.canaries.conf | Enable stack canaries | 53+-----------------------------+------------------------------------+ 54| prj.objcore.conf | Enable object cores and statistics | 55+-----------------------------+------------------------------------+ 56| prj.userspace.conf | Enable userspace support | 57+-----------------------------+------------------------------------+ 58 59Sample output of the benchmark using the defaults:: 60 61 thread.yield.preemptive.ctx.k_to_k - Context switch via k_yield : 315 cycles , 2625 ns : 62 thread.yield.cooperative.ctx.k_to_k - Context switch via k_yield : 315 cycles , 2625 ns : 63 isr.resume.interrupted.thread.kernel - Return from ISR to interrupted thread : 289 cycles , 2416 ns : 64 isr.resume.different.thread.kernel - Return from ISR to another thread : 374 cycles , 3124 ns : 65 thread.create.kernel.from.kernel - Create thread : 382 cycles , 3191 ns : 66 thread.start.kernel.from.kernel - Start thread : 394 cycles , 3291 ns : 67 thread.suspend.kernel.from.kernel - Suspend thread : 289 cycles , 2416 ns : 68 thread.resume.kernel.from.kernel - Resume thread : 339 cycles , 2833 ns : 69 thread.abort.kernel.from.kernel - Abort thread : 339 cycles , 2833 ns : 70 fifo.put.immediate.kernel - Add data to FIFO (no ctx switch) : 214 cycles , 1791 ns : 71 fifo.get.immediate.kernel - Get data from FIFO (no ctx switch) : 134 cycles , 1124 ns : 72 fifo.put.alloc.immediate.kernel - Allocate to add data to FIFO (no ctx switch) : 834 cycles , 6950 ns : 73 fifo.get.free.immediate.kernel - Free when getting data from FIFO (no ctx switch) : 560 cycles , 4666 ns : 74 fifo.get.blocking.k_to_k - Get data from FIFO (w/ ctx switch) : 510 cycles , 4257 ns : 75 fifo.put.wake+ctx.k_to_k - Add data to FIFO (w/ ctx switch) : 590 cycles , 4923 ns : 76 fifo.get.free.blocking.k_to_k - Free when getting data from FIFO (w/ ctx siwtch) : 510 cycles , 4250 ns : 77 fifo.put.alloc.wake+ctx.k_to_k - Allocate to add data to FIFO (w/ ctx switch) : 585 cycles , 4875 ns : 78 lifo.put.immediate.kernel - Add data to LIFO (no ctx switch) : 214 cycles , 1791 ns : 79 lifo.get.immediate.kernel - Get data from LIFO (no ctx switch) : 120 cycles , 1008 ns : 80 lifo.put.alloc.immediate.kernel - Allocate to add data to LIFO (no ctx switch) : 831 cycles , 6925 ns : 81 lifo.get.free.immediate.kernel - Free when getting data from LIFO (no ctx switch) : 555 cycles , 4625 ns : 82 lifo.get.blocking.k_to_k - Get data from LIFO (w/ ctx switch) : 502 cycles , 4191 ns : 83 lifo.put.wake+ctx.k_to_k - Add data to LIFO (w/ ctx switch) : 585 cycles , 4875 ns : 84 lifo.get.free.blocking.k_to_k - Free when getting data from LIFO (w/ ctx switch) : 513 cycles , 4275 ns : 85 lifo.put.alloc.wake+ctx.k_to_k - Allocate to add data to LIFO (w/ ctx siwtch) : 585 cycles , 4881 ns : 86 events.post.immediate.kernel - Post events (nothing wakes) : 225 cycles , 1875 ns : 87 events.set.immediate.kernel - Set events (nothing wakes) : 230 cycles , 1923 ns : 88 events.wait.immediate.kernel - Wait for any events (no ctx switch) : 120 cycles , 1000 ns : 89 events.wait_all.immediate.kernel - Wait for all events (no ctx switch) : 110 cycles , 917 ns : 90 events.wait.blocking.k_to_k - Wait for any events (w/ ctx switch) : 514 cycles , 4291 ns : 91 events.set.wake+ctx.k_to_k - Set events (w/ ctx switch) : 754 cycles , 6291 ns : 92 events.wait_all.blocking.k_to_k - Wait for all events (w/ ctx switch) : 528 cycles , 4400 ns : 93 events.post.wake+ctx.k_to_k - Post events (w/ ctx switch) : 765 cycles , 6375 ns : 94 semaphore.give.immediate.kernel - Give a semaphore (no waiters) : 59 cycles , 492 ns : 95 semaphore.take.immediate.kernel - Take a semaphore (no blocking) : 69 cycles , 575 ns : 96 semaphore.take.blocking.k_to_k - Take a semaphore (context switch) : 450 cycles , 3756 ns : 97 semaphore.give.wake+ctx.k_to_k - Give a semaphore (context switch) : 509 cycles , 4249 ns : 98 condvar.wait.blocking.k_to_k - Wait for a condvar (context switch) : 578 cycles , 4817 ns : 99 condvar.signal.wake+ctx.k_to_k - Signal a condvar (context switch) : 630 cycles , 5250 ns : 100 stack.push.immediate.kernel - Add data to k_stack (no ctx switch) : 107 cycles , 899 ns : 101 stack.pop.immediate.kernel - Get data from k_stack (no ctx switch) : 80 cycles , 674 ns : 102 stack.pop.blocking.k_to_k - Get data from k_stack (w/ ctx switch) : 467 cycles , 3899 ns : 103 stack.push.wake+ctx.k_to_k - Add data to k_stack (w/ ctx switch) : 550 cycles , 4583 ns : 104 mutex.lock.immediate.recursive.kernel - Lock a mutex : 83 cycles , 692 ns : 105 mutex.unlock.immediate.recursive.kernel - Unlock a mutex : 44 cycles , 367 ns : 106 heap.malloc.immediate - Average time for heap malloc : 610 cycles , 5083 ns : 107 heap.free.immediate - Average time for heap free : 425 cycles , 3541 ns : 108 =================================================================== 109 PROJECT EXECUTION SUCCESSFUL 110 111 112Sample output of the benchmark with stack canaries enabled:: 113 114 thread.yield.preemptive.ctx.k_to_k - Context switch via k_yield : 485 cycles , 4042 ns : 115 thread.yield.cooperative.ctx.k_to_k - Context switch via k_yield : 485 cycles , 4042 ns : 116 isr.resume.interrupted.thread.kernel - Return from ISR to interrupted thread : 545 cycles , 4549 ns : 117 isr.resume.different.thread.kernel - Return from ISR to another thread : 590 cycles , 4924 ns : 118 thread.create.kernel.from.kernel - Create thread : 585 cycles , 4883 ns : 119 thread.start.kernel.from.kernel - Start thread : 685 cycles , 5716 ns : 120 thread.suspend.kernel.from.kernel - Suspend thread : 490 cycles , 4091 ns : 121 thread.resume.kernel.from.kernel - Resume thread : 569 cycles , 4749 ns : 122 thread.abort.kernel.from.kernel - Abort thread : 629 cycles , 5249 ns : 123 fifo.put.immediate.kernel - Add data to FIFO (no ctx switch) : 439 cycles , 3666 ns : 124 fifo.get.immediate.kernel - Get data from FIFO (no ctx switch) : 320 cycles , 2674 ns : 125 fifo.put.alloc.immediate.kernel - Allocate to add data to FIFO (no ctx switch) : 1499 cycles , 12491 ns : 126 fifo.get.free.immediate.kernel - Free when getting data from FIFO (no ctx switch) : 1230 cycles , 10250 ns : 127 fifo.get.blocking.k_to_k - Get data from FIFO (w/ ctx switch) : 868 cycles , 7241 ns : 128 fifo.put.wake+ctx.k_to_k - Add data to FIFO (w/ ctx switch) : 991 cycles , 8259 ns : 129 fifo.get.free.blocking.k_to_k - Free when getting data from FIFO (w/ ctx siwtch) : 879 cycles , 7325 ns : 130 fifo.put.alloc.wake+ctx.k_to_k - Allocate to add data to FIFO (w/ ctx switch) : 990 cycles , 8250 ns : 131 lifo.put.immediate.kernel - Add data to LIFO (no ctx switch) : 429 cycles , 3582 ns : 132 lifo.get.immediate.kernel - Get data from LIFO (no ctx switch) : 320 cycles , 2674 ns : 133 lifo.put.alloc.immediate.kernel - Allocate to add data to LIFO (no ctx switch) : 1499 cycles , 12491 ns : 134 lifo.get.free.immediate.kernel - Free when getting data from LIFO (no ctx switch) : 1220 cycles , 10166 ns : 135 lifo.get.blocking.k_to_k - Get data from LIFO (w/ ctx switch) : 863 cycles , 7199 ns : 136 lifo.put.wake+ctx.k_to_k - Add data to LIFO (w/ ctx switch) : 985 cycles , 8208 ns : 137 lifo.get.free.blocking.k_to_k - Free when getting data from LIFO (w/ ctx switch) : 879 cycles , 7325 ns : 138 lifo.put.alloc.wake+ctx.k_to_k - Allocate to add data to LIFO (w/ ctx siwtch) : 985 cycles , 8208 ns : 139 events.post.immediate.kernel - Post events (nothing wakes) : 420 cycles , 3501 ns : 140 events.set.immediate.kernel - Set events (nothing wakes) : 420 cycles , 3501 ns : 141 events.wait.immediate.kernel - Wait for any events (no ctx switch) : 280 cycles , 2334 ns : 142 events.wait_all.immediate.kernel - Wait for all events (no ctx switch) : 270 cycles , 2251 ns : 143 events.wait.blocking.k_to_k - Wait for any events (w/ ctx switch) : 919 cycles , 7665 ns : 144 events.set.wake+ctx.k_to_k - Set events (w/ ctx switch) : 1310 cycles , 10924 ns : 145 events.wait_all.blocking.k_to_k - Wait for all events (w/ ctx switch) : 954 cycles , 7950 ns : 146 events.post.wake+ctx.k_to_k - Post events (w/ ctx switch) : 1340 cycles , 11166 ns : 147 semaphore.give.immediate.kernel - Give a semaphore (no waiters) : 110 cycles , 917 ns : 148 semaphore.take.immediate.kernel - Take a semaphore (no blocking) : 180 cycles , 1500 ns : 149 semaphore.take.blocking.k_to_k - Take a semaphore (context switch) : 755 cycles , 6292 ns : 150 semaphore.give.wake+ctx.k_to_k - Give a semaphore (context switch) : 812 cycles , 6767 ns : 151 condvar.wait.blocking.k_to_k - Wait for a condvar (context switch) : 1027 cycles , 8558 ns : 152 condvar.signal.wake+ctx.k_to_k - Signal a condvar (context switch) : 1040 cycles , 8666 ns : 153 stack.push.immediate.kernel - Add data to k_stack (no ctx switch) : 220 cycles , 1841 ns : 154 stack.pop.immediate.kernel - Get data from k_stack (no ctx switch) : 205 cycles , 1716 ns : 155 stack.pop.blocking.k_to_k - Get data from k_stack (w/ ctx switch) : 791 cycles , 6599 ns : 156 stack.push.wake+ctx.k_to_k - Add data to k_stack (w/ ctx switch) : 870 cycles , 7250 ns : 157 mutex.lock.immediate.recursive.kernel - Lock a mutex : 175 cycles , 1459 ns : 158 mutex.unlock.immediate.recursive.kernel - Unlock a mutex : 61 cycles , 510 ns : 159 heap.malloc.immediate - Average time for heap malloc : 1060 cycles , 8833 ns : 160 heap.free.immediate - Average time for heap free : 899 cycles , 7491 ns : 161 =================================================================== 162 PROJECT EXECUTION SUCCESSFUL 163 164The sample output above (stack canaries are enabled) shows longer times than 165for the default build. Not only does each stack frame in the call tree have 166its own stack canary check, but enabling this feature impacts how the compiler 167chooses to inline or not inline routines. 168 169Sample output of the benchmark with object core enabled:: 170 171 thread.yield.preemptive.ctx.k_to_k - Context switch via k_yield : 740 cycles , 6167 ns : 172 thread.yield.cooperative.ctx.k_to_k - Context switch via k_yield : 740 cycles , 6167 ns : 173 isr.resume.interrupted.thread.kernel - Return from ISR to interrupted thread : 284 cycles , 2374 ns : 174 isr.resume.different.thread.kernel - Return from ISR to another thread : 784 cycles , 6541 ns : 175 thread.create.kernel.from.kernel - Create thread : 714 cycles , 5958 ns : 176 thread.start.kernel.from.kernel - Start thread : 819 cycles , 6833 ns : 177 thread.suspend.kernel.from.kernel - Suspend thread : 704 cycles , 5874 ns : 178 thread.resume.kernel.from.kernel - Resume thread : 761 cycles , 6349 ns : 179 thread.abort.kernel.from.kernel - Abort thread : 544 cycles , 4541 ns : 180 fifo.put.immediate.kernel - Add data to FIFO (no ctx switch) : 211 cycles , 1766 ns : 181 fifo.get.immediate.kernel - Get data from FIFO (no ctx switch) : 132 cycles , 1108 ns : 182 fifo.put.alloc.immediate.kernel - Allocate to add data to FIFO (no ctx switch) : 850 cycles , 7091 ns : 183 fifo.get.free.immediate.kernel - Free when getting data from FIFO (no ctx switch) : 565 cycles , 4708 ns : 184 fifo.get.blocking.k_to_k - Get data from FIFO (w/ ctx switch) : 947 cycles , 7899 ns : 185 fifo.put.wake+ctx.k_to_k - Add data to FIFO (w/ ctx switch) : 1015 cycles , 8458 ns : 186 fifo.get.free.blocking.k_to_k - Free when getting data from FIFO (w/ ctx siwtch) : 950 cycles , 7923 ns : 187 fifo.put.alloc.wake+ctx.k_to_k - Allocate to add data to FIFO (w/ ctx switch) : 1010 cycles , 8416 ns : 188 lifo.put.immediate.kernel - Add data to LIFO (no ctx switch) : 226 cycles , 1891 ns : 189 lifo.get.immediate.kernel - Get data from LIFO (no ctx switch) : 123 cycles , 1033 ns : 190 lifo.put.alloc.immediate.kernel - Allocate to add data to LIFO (no ctx switch) : 848 cycles , 7066 ns : 191 lifo.get.free.immediate.kernel - Free when getting data from LIFO (no ctx switch) : 565 cycles , 4708 ns : 192 lifo.get.blocking.k_to_k - Get data from LIFO (w/ ctx switch) : 951 cycles , 7932 ns : 193 lifo.put.wake+ctx.k_to_k - Add data to LIFO (w/ ctx switch) : 1010 cycles , 8416 ns : 194 lifo.get.free.blocking.k_to_k - Free when getting data from LIFO (w/ ctx switch) : 959 cycles , 7991 ns : 195 lifo.put.alloc.wake+ctx.k_to_k - Allocate to add data to LIFO (w/ ctx siwtch) : 1010 cycles , 8422 ns : 196 events.post.immediate.kernel - Post events (nothing wakes) : 210 cycles , 1750 ns : 197 events.set.immediate.kernel - Set events (nothing wakes) : 230 cycles , 1917 ns : 198 events.wait.immediate.kernel - Wait for any events (no ctx switch) : 120 cycles , 1000 ns : 199 events.wait_all.immediate.kernel - Wait for all events (no ctx switch) : 150 cycles , 1250 ns : 200 events.wait.blocking.k_to_k - Wait for any events (w/ ctx switch) : 951 cycles , 7932 ns : 201 events.set.wake+ctx.k_to_k - Set events (w/ ctx switch) : 1179 cycles , 9833 ns : 202 events.wait_all.blocking.k_to_k - Wait for all events (w/ ctx switch) : 976 cycles , 8133 ns : 203 events.post.wake+ctx.k_to_k - Post events (w/ ctx switch) : 1190 cycles , 9922 ns : 204 semaphore.give.immediate.kernel - Give a semaphore (no waiters) : 59 cycles , 492 ns : 205 semaphore.take.immediate.kernel - Take a semaphore (no blocking) : 69 cycles , 575 ns : 206 semaphore.take.blocking.k_to_k - Take a semaphore (context switch) : 870 cycles , 7250 ns : 207 semaphore.give.wake+ctx.k_to_k - Give a semaphore (context switch) : 929 cycles , 7749 ns : 208 condvar.wait.blocking.k_to_k - Wait for a condvar (context switch) : 1010 cycles , 8417 ns : 209 condvar.signal.wake+ctx.k_to_k - Signal a condvar (context switch) : 1060 cycles , 8833 ns : 210 stack.push.immediate.kernel - Add data to k_stack (no ctx switch) : 90 cycles , 758 ns : 211 stack.pop.immediate.kernel - Get data from k_stack (no ctx switch) : 86 cycles , 724 ns : 212 stack.pop.blocking.k_to_k - Get data from k_stack (w/ ctx switch) : 910 cycles , 7589 ns : 213 stack.push.wake+ctx.k_to_k - Add data to k_stack (w/ ctx switch) : 975 cycles , 8125 ns : 214 mutex.lock.immediate.recursive.kernel - Lock a mutex : 105 cycles , 875 ns : 215 mutex.unlock.immediate.recursive.kernel - Unlock a mutex : 44 cycles , 367 ns : 216 heap.malloc.immediate - Average time for heap malloc : 621 cycles , 5183 ns : 217 heap.free.immediate - Average time for heap free : 422 cycles , 3516 ns : 218 =================================================================== 219 PROJECT EXECUTION SUCCESSFUL 220 221The sample output above (object core and statistics enabled) shows longer 222times than for the default build when context switching is involved. A blanket 223enabling of the object cores as was done here results in the additional 224gathering of thread statistics when a thread is switched in/out. The 225gathering of these statistics can be controlled at both at the time of 226project configuration as well as at runtime. 227 228Sample output of the benchmark with userspace enabled:: 229 230 thread.yield.preemptive.ctx.k_to_k - Context switch via k_yield : 975 cycles , 8125 ns : 231 thread.yield.preemptive.ctx.u_to_u - Context switch via k_yield : 1303 cycles , 10860 ns : 232 thread.yield.preemptive.ctx.k_to_u - Context switch via k_yield : 1180 cycles , 9834 ns : 233 thread.yield.preemptive.ctx.u_to_k - Context switch via k_yield : 1097 cycles , 9144 ns : 234 thread.yield.cooperative.ctx.k_to_k - Context switch via k_yield : 975 cycles , 8125 ns : 235 thread.yield.cooperative.ctx.u_to_u - Context switch via k_yield : 1302 cycles , 10854 ns : 236 thread.yield.cooperative.ctx.k_to_u - Context switch via k_yield : 1180 cycles , 9834 ns : 237 thread.yield.cooperative.ctx.u_to_k - Context switch via k_yield : 1097 cycles , 9144 ns : 238 isr.resume.interrupted.thread.kernel - Return from ISR to interrupted thread : 329 cycles , 2749 ns : 239 isr.resume.different.thread.kernel - Return from ISR to another thread : 1014 cycles , 8457 ns : 240 isr.resume.different.thread.user - Return from ISR to another thread : 1223 cycles , 10192 ns : 241 thread.create.kernel.from.kernel - Create thread : 970 cycles , 8089 ns : 242 thread.start.kernel.from.kernel - Start thread : 1074 cycles , 8957 ns : 243 thread.suspend.kernel.from.kernel - Suspend thread : 949 cycles , 7916 ns : 244 thread.resume.kernel.from.kernel - Resume thread : 1004 cycles , 8374 ns : 245 thread.abort.kernel.from.kernel - Abort thread : 2734 cycles , 22791 ns : 246 thread.create.user.from.kernel - Create thread : 832 cycles , 6935 ns : 247 thread.start.user.from.kernel - Start thread : 9023 cycles , 75192 ns : 248 thread.suspend.user.from.kernel - Suspend thread : 1312 cycles , 10935 ns : 249 thread.resume.user.from.kernel - Resume thread : 1187 cycles , 9894 ns : 250 thread.abort.user.from.kernel - Abort thread : 2597 cycles , 21644 ns : 251 thread.create.user.from.user - Create thread : 2144 cycles , 17872 ns : 252 thread.start.user.from.user - Start thread : 9399 cycles , 78330 ns : 253 thread.suspend.user.from.user - Suspend thread : 1504 cycles , 12539 ns : 254 thread.resume.user.from.user - Resume thread : 1574 cycles , 13122 ns : 255 thread.abort.user.from.user - Abort thread : 3237 cycles , 26981 ns : 256 thread.start.kernel.from.user - Start thread : 1452 cycles , 12102 ns : 257 thread.suspend.kernel.from.user - Suspend thread : 1143 cycles , 9525 ns : 258 thread.resume.kernel.from.user - Resume thread : 1392 cycles , 11602 ns : 259 thread.abort.kernel.from.user - Abort thread : 3372 cycles , 28102 ns : 260 fifo.put.immediate.kernel - Add data to FIFO (no ctx switch) : 239 cycles , 1999 ns : 261 fifo.get.immediate.kernel - Get data from FIFO (no ctx switch) : 184 cycles , 1541 ns : 262 fifo.put.alloc.immediate.kernel - Allocate to add data to FIFO (no ctx switch) : 920 cycles , 7666 ns : 263 fifo.get.free.immediate.kernel - Free when getting data from FIFO (no ctx switch) : 650 cycles , 5416 ns : 264 fifo.put.alloc.immediate.user - Allocate to add data to FIFO (no ctx switch) : 1710 cycles , 14256 ns : 265 fifo.get.free.immediate.user - Free when getting data from FIFO (no ctx switch) : 1440 cycles , 12000 ns : 266 fifo.get.blocking.k_to_k - Get data from FIFO (w/ ctx switch) : 1209 cycles , 10082 ns : 267 fifo.put.wake+ctx.k_to_k - Add data to FIFO (w/ ctx switch) : 1230 cycles , 10250 ns : 268 fifo.get.free.blocking.k_to_k - Free when getting data from FIFO (w/ ctx siwtch) : 1210 cycles , 10083 ns : 269 fifo.put.alloc.wake+ctx.k_to_k - Allocate to add data to FIFO (w/ ctx switch) : 1260 cycles , 10500 ns : 270 fifo.get.free.blocking.u_to_k - Free when getting data from FIFO (w/ ctx siwtch) : 1745 cycles , 14547 ns : 271 fifo.put.alloc.wake+ctx.k_to_u - Allocate to add data to FIFO (w/ ctx switch) : 1600 cycles , 13333 ns : 272 fifo.get.free.blocking.k_to_u - Free when getting data from FIFO (w/ ctx siwtch) : 1550 cycles , 12922 ns : 273 fifo.put.alloc.wake+ctx.u_to_k - Allocate to add data to FIFO (w/ ctx switch) : 1795 cycles , 14958 ns : 274 fifo.get.free.blocking.u_to_u - Free when getting data from FIFO (w/ ctx siwtch) : 2084 cycles , 17374 ns : 275 fifo.put.alloc.wake+ctx.u_to_u - Allocate to add data to FIFO (w/ ctx switch) : 2135 cycles , 17791 ns : 276 lifo.put.immediate.kernel - Add data to LIFO (no ctx switch) : 234 cycles , 1957 ns : 277 lifo.get.immediate.kernel - Get data from LIFO (no ctx switch) : 189 cycles , 1582 ns : 278 lifo.put.alloc.immediate.kernel - Allocate to add data to LIFO (no ctx switch) : 935 cycles , 7791 ns : 279 lifo.get.free.immediate.kernel - Free when getting data from LIFO (no ctx switch) : 650 cycles , 5416 ns : 280 lifo.put.alloc.immediate.user - Allocate to add data to LIFO (no ctx switch) : 1715 cycles , 14291 ns : 281 lifo.get.free.immediate.user - Free when getting data from LIFO (no ctx switch) : 1445 cycles , 12041 ns : 282 lifo.get.blocking.k_to_k - Get data from LIFO (w/ ctx switch) : 1219 cycles , 10166 ns : 283 lifo.put.wake+ctx.k_to_k - Add data to LIFO (w/ ctx switch) : 1230 cycles , 10250 ns : 284 lifo.get.free.blocking.k_to_k - Free when getting data from LIFO (w/ ctx switch) : 1210 cycles , 10083 ns : 285 lifo.put.alloc.wake+ctx.k_to_k - Allocate to add data to LIFO (w/ ctx siwtch) : 1260 cycles , 10500 ns : 286 lifo.get.free.blocking.u_to_k - Free when getting data from LIFO (w/ ctx switch) : 1744 cycles , 14541 ns : 287 lifo.put.alloc.wake+ctx.k_to_u - Allocate to add data to LIFO (w/ ctx siwtch) : 1595 cycles , 13291 ns : 288 lifo.get.free.blocking.k_to_u - Free when getting data from LIFO (w/ ctx switch) : 1544 cycles , 12874 ns : 289 lifo.put.alloc.wake+ctx.u_to_k - Allocate to add data to LIFO (w/ ctx siwtch) : 1795 cycles , 14958 ns : 290 lifo.get.free.blocking.u_to_u - Free when getting data from LIFO (w/ ctx switch) : 2080 cycles , 17339 ns : 291 lifo.put.alloc.wake+ctx.u_to_u - Allocate to add data to LIFO (w/ ctx siwtch) : 2130 cycles , 17750 ns : 292 events.post.immediate.kernel - Post events (nothing wakes) : 285 cycles , 2375 ns : 293 events.set.immediate.kernel - Set events (nothing wakes) : 290 cycles , 2417 ns : 294 events.wait.immediate.kernel - Wait for any events (no ctx switch) : 235 cycles , 1958 ns : 295 events.wait_all.immediate.kernel - Wait for all events (no ctx switch) : 245 cycles , 2042 ns : 296 events.post.immediate.user - Post events (nothing wakes) : 800 cycles , 6669 ns : 297 events.set.immediate.user - Set events (nothing wakes) : 811 cycles , 6759 ns : 298 events.wait.immediate.user - Wait for any events (no ctx switch) : 780 cycles , 6502 ns : 299 events.wait_all.immediate.user - Wait for all events (no ctx switch) : 770 cycles , 6419 ns : 300 events.wait.blocking.k_to_k - Wait for any events (w/ ctx switch) : 1210 cycles , 10089 ns : 301 events.set.wake+ctx.k_to_k - Set events (w/ ctx switch) : 1449 cycles , 12082 ns : 302 events.wait_all.blocking.k_to_k - Wait for all events (w/ ctx switch) : 1250 cycles , 10416 ns : 303 events.post.wake+ctx.k_to_k - Post events (w/ ctx switch) : 1475 cycles , 12291 ns : 304 events.wait.blocking.u_to_k - Wait for any events (w/ ctx switch) : 1612 cycles , 13435 ns : 305 events.set.wake+ctx.k_to_u - Set events (w/ ctx switch) : 1627 cycles , 13560 ns : 306 events.wait_all.blocking.u_to_k - Wait for all events (w/ ctx switch) : 1785 cycles , 14875 ns : 307 events.post.wake+ctx.k_to_u - Post events (w/ ctx switch) : 1790 cycles , 14923 ns : 308 events.wait.blocking.k_to_u - Wait for any events (w/ ctx switch) : 1407 cycles , 11727 ns : 309 events.set.wake+ctx.u_to_k - Set events (w/ ctx switch) : 1828 cycles , 15234 ns : 310 events.wait_all.blocking.k_to_u - Wait for all events (w/ ctx switch) : 1585 cycles , 13208 ns : 311 events.post.wake+ctx.u_to_k - Post events (w/ ctx switch) : 2000 cycles , 16666 ns : 312 events.wait.blocking.u_to_u - Wait for any events (w/ ctx switch) : 1810 cycles , 15087 ns : 313 events.set.wake+ctx.u_to_u - Set events (w/ ctx switch) : 2004 cycles , 16705 ns : 314 events.wait_all.blocking.u_to_u - Wait for all events (w/ ctx switch) : 2120 cycles , 17666 ns : 315 events.post.wake+ctx.u_to_u - Post events (w/ ctx switch) : 2315 cycles , 19291 ns : 316 semaphore.give.immediate.kernel - Give a semaphore (no waiters) : 125 cycles , 1042 ns : 317 semaphore.take.immediate.kernel - Take a semaphore (no blocking) : 125 cycles , 1042 ns : 318 semaphore.give.immediate.user - Give a semaphore (no waiters) : 645 cycles , 5377 ns : 319 semaphore.take.immediate.user - Take a semaphore (no blocking) : 680 cycles , 5669 ns : 320 semaphore.take.blocking.k_to_k - Take a semaphore (context switch) : 1140 cycles , 9500 ns : 321 semaphore.give.wake+ctx.k_to_k - Give a semaphore (context switch) : 1174 cycles , 9791 ns : 322 semaphore.take.blocking.k_to_u - Take a semaphore (context switch) : 1350 cycles , 11251 ns : 323 semaphore.give.wake+ctx.u_to_k - Give a semaphore (context switch) : 1542 cycles , 12852 ns : 324 semaphore.take.blocking.u_to_k - Take a semaphore (context switch) : 1512 cycles , 12603 ns : 325 semaphore.give.wake+ctx.k_to_u - Give a semaphore (context switch) : 1382 cycles , 11519 ns : 326 semaphore.take.blocking.u_to_u - Take a semaphore (context switch) : 1723 cycles , 14360 ns : 327 semaphore.give.wake+ctx.u_to_u - Give a semaphore (context switch) : 1749 cycles , 14580 ns : 328 condvar.wait.blocking.k_to_k - Wait for a condvar (context switch) : 1285 cycles , 10708 ns : 329 condvar.signal.wake+ctx.k_to_k - Signal a condvar (context switch) : 1315 cycles , 10964 ns : 330 condvar.wait.blocking.k_to_u - Wait for a condvar (context switch) : 1547 cycles , 12898 ns : 331 condvar.signal.wake+ctx.u_to_k - Signal a condvar (context switch) : 1855 cycles , 15458 ns : 332 condvar.wait.blocking.u_to_k - Wait for a condvar (context switch) : 1990 cycles , 16583 ns : 333 condvar.signal.wake+ctx.k_to_u - Signal a condvar (context switch) : 1640 cycles , 13666 ns : 334 condvar.wait.blocking.u_to_u - Wait for a condvar (context switch) : 2313 cycles , 19280 ns : 335 condvar.signal.wake+ctx.u_to_u - Signal a condvar (context switch) : 2170 cycles , 18083 ns : 336 stack.push.immediate.kernel - Add data to k_stack (no ctx switch) : 189 cycles , 1582 ns : 337 stack.pop.immediate.kernel - Get data from k_stack (no ctx switch) : 194 cycles , 1624 ns : 338 stack.push.immediate.user - Add data to k_stack (no ctx switch) : 679 cycles , 5664 ns : 339 stack.pop.immediate.user - Get data from k_stack (no ctx switch) : 1014 cycles , 8455 ns : 340 stack.pop.blocking.k_to_k - Get data from k_stack (w/ ctx switch) : 1209 cycles , 10083 ns : 341 stack.push.wake+ctx.k_to_k - Add data to k_stack (w/ ctx switch) : 1235 cycles , 10291 ns : 342 stack.pop.blocking.u_to_k - Get data from k_stack (w/ ctx switch) : 2050 cycles , 17089 ns : 343 stack.push.wake+ctx.k_to_u - Add data to k_stack (w/ ctx switch) : 1575 cycles , 13125 ns : 344 stack.pop.blocking.k_to_u - Get data from k_stack (w/ ctx switch) : 1549 cycles , 12916 ns : 345 stack.push.wake+ctx.u_to_k - Add data to k_stack (w/ ctx switch) : 1755 cycles , 14625 ns : 346 stack.pop.blocking.u_to_u - Get data from k_stack (w/ ctx switch) : 2389 cycles , 19916 ns : 347 stack.push.wake+ctx.u_to_u - Add data to k_stack (w/ ctx switch) : 2095 cycles , 17458 ns : 348 mutex.lock.immediate.recursive.kernel - Lock a mutex : 165 cycles , 1375 ns : 349 mutex.unlock.immediate.recursive.kernel - Unlock a mutex : 80 cycles , 668 ns : 350 mutex.lock.immediate.recursive.user - Lock a mutex : 685 cycles , 5711 ns : 351 mutex.unlock.immediate.recursive.user - Unlock a mutex : 615 cycles , 5128 ns : 352 heap.malloc.immediate - Average time for heap malloc : 626 cycles , 5224 ns : 353 heap.free.immediate - Average time for heap free : 427 cycles , 3558 ns : 354 =================================================================== 355 PROJECT EXECUTION SUCCESSFUL 356 357The sample output above (userspace enabled) shows longer times than for 358the default build scenario. Enabling userspace results in additional 359runtime overhead on each call to a kernel object to determine whether the 360caller is in user or kernel space and consequently whether a system call 361is needed or not. 362