# Arm(R) Ethos(TM)-U core driver

This repository contains a device driver for the Arm(R) Ethos(TM)-U NPU.

## Building

The source code comes with a CMake-based build system. The driver is expected to
be cross-compiled for any of the supported Arm Cortex(R)-M CPUs, which requires
the user to configure the build to match their system configuration.

One such requirement is to define the target CPU, normally by setting
`CMAKE_SYSTEM_PROCESSOR`. **Note** that when using the toolchain files provided
in [core_platform](https://git.mlplatform.org/ml/ethos-u/ethos-u-core-platform.git),
the variable `TARGET_CPU` must be used instead of `CMAKE_SYSTEM_PROCESSOR`.

The target CPU is specified in the form `cortex-m<nr><features>`, for example
`cortex-m55+nodsp+nofp`.

Similarly, the target NPU configuration is controlled by setting
`ETHOSU_TARGET_NPU_CONFIG`, for example `ethos-u55-128`.

The build configuration can be defined either in the toolchain file or
by passing options on the command line.

```[bash]
$ cmake -B build \
    -DCMAKE_TOOLCHAIN_FILE=<toolchain> \
    -DCMAKE_SYSTEM_PROCESSOR=cortex-m<nr><features> \
    -DETHOSU_TARGET_NPU_CONFIG=ethos-u<nr>-<macs>
$ cmake --build build
```

Or, when using toolchain files from [core_platform](https://git.mlplatform.org/ml/ethos-u/ethos-u-core-platform.git):

```[bash]
$ cmake -B build \
    -DCMAKE_TOOLCHAIN_FILE=<core_platform_toolchain> \
    -DTARGET_CPU=cortex-m<nr><features> \
    -DETHOSU_TARGET_NPU_CONFIG=ethos-u<nr>-<macs>
$ cmake --build build
```

## Driver APIs

The driver APIs are defined in `include/ethosu_driver.h` and the related types
in `include/ethosu_types.h`. Inferences can be invoked in two ways:
synchronously or asynchronously. The two types of invocation can be freely mixed
in a single application.

### Synchronous invocation

A typical synchronous use of the driver looks like this:

```[C]
// reserve a driver to be used (this call could block until a driver is available)
struct ethosu_driver *drv = ethosu_reserve_driver();
...
// run one or more inferences
int result = ethosu_invoke(drv,
                           custom_data_ptr,
                           custom_data_size,
                           base_addr,
                           base_addr_size,
                           num_base_addr);
...
// release the driver for others to use
ethosu_release_driver(drv);
```

### Asynchronous invocation

A typical asynchronous use of the driver looks like this:

```[C]
// reserve a driver to be used (this call could block until a driver is available)
struct ethosu_driver *drv = ethosu_reserve_driver();
...
// run one or more inferences
int result = ethosu_invoke_async(drv,
                                 custom_data_ptr,
                                 custom_data_size,
                                 base_addr,
                                 base_addr_size,
                                 num_base_addr,
                                 user_arg);
...
// do some other work
...
int ret;
do {
    // true = blocking, false = non-blocking
    // ret > 0 means inference not completed (only for non-blocking mode)
    ret = ethosu_wait(drv, <true|false>);
} while (ret > 0);
...
// release the driver for others to use
ethosu_release_driver(drv);
```

Note that if `ethosu_wait` is invoked from a different thread and concurrently
with `ethosu_invoke_async`, the user is responsible for guaranteeing that
`ethosu_wait` is called only after `ethosu_invoke_async` has completed
successfully. Otherwise, `ethosu_wait` might fail and not actually wait for the
inference to complete.

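
The ordering requirement above can be enforced with a simple handshake between
the two threads. The following is a minimal host-side sketch using POSIX
threads, not the driver's own mechanism. `struct fake_driver`,
`fake_invoke_async` and `fake_wait` are hypothetical stand-ins for the driver
handle, `ethosu_invoke_async` and `ethosu_wait`, so that the example compiles
without the driver. The convention that 0 means success is an assumption of the
stubs.

```c
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical host-side stand-ins so the sketch compiles without the driver.
 * They are NOT the driver API; a real build would call ethosu_invoke_async()
 * and ethosu_wait() instead. */
struct fake_driver {
    bool inference_started;
};

static int fake_invoke_async(struct fake_driver *drv) {
    drv->inference_started = true;
    return 0; /* assumption for this sketch: 0 means the submission succeeded */
}

static int fake_wait(struct fake_driver *drv, bool block) {
    (void)block;
    /* Fails if nothing was submitted, mimicking a wait that came too early. */
    return drv->inference_started ? 0 : -1;
}

/* One asynchronous job shared between the submitting and the waiting thread. */
struct async_job {
    struct fake_driver *drv;
    pthread_mutex_t lock;
    pthread_cond_t submitted_cond;
    bool submitted;    /* set once the async invoke has returned */
    int submit_result; /* return value of the async invoke */
    int wait_result;   /* return value of the wait, -1 if it never ran */
};

static void *waiter_thread(void *arg) {
    struct async_job *job = arg;

    /* Block until the submitting thread reports that the invoke returned. */
    pthread_mutex_lock(&job->lock);
    while (!job->submitted) {
        pthread_cond_wait(&job->submitted_cond, &job->lock);
    }
    int submit_result = job->submit_result;
    pthread_mutex_unlock(&job->lock);

    /* Only wait for completion if the submission actually succeeded. */
    if (submit_result == 0) {
        job->wait_result = fake_wait(job->drv, true);
    }
    return NULL;
}

/* Submits on the calling thread, waits on a second thread, and returns the
 * result of the wait. */
static int run_async_job(struct async_job *job) {
    pthread_t waiter;

    job->submitted = false;
    job->wait_result = -1;
    pthread_mutex_init(&job->lock, NULL);
    pthread_cond_init(&job->submitted_cond, NULL);

    if (pthread_create(&waiter, NULL, waiter_thread, job) == 0) {
        int result = fake_invoke_async(job->drv);

        /* Publish the result, then release the waiting thread. */
        pthread_mutex_lock(&job->lock);
        job->submit_result = result;
        job->submitted = true;
        pthread_cond_signal(&job->submitted_cond);
        pthread_mutex_unlock(&job->lock);

        pthread_join(waiter, NULL);
    }

    pthread_cond_destroy(&job->submitted_cond);
    pthread_mutex_destroy(&job->lock);
    return job->wait_result;
}
```

The waiting thread never reaches the wait call before the submitting thread has
recorded the result of the invoke, and it skips the wait entirely if the
submission failed. On an RTOS the same handshake would typically use the
primitives that the RTOS provides.
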
### Driver initialization

Before a driver can be used, it must be initialized by calling the `init`
function, which also registers the handle in the list of available drivers.
A driver can be torn down with the `deinit` function, which also removes the
driver from the list.

The correct mapping is one driver per NPU device. All NPUs must share the same
configuration, because the driver supports only one NPU configuration, which is
defined at compile time.

## Implementation design

The driver is structured in two main parts: the driver part, which is
responsible for providing a unified API to the user, and the device part, which
deals with the details at the hardware level.

To do its task, the driver needs a device implementation. There can be multiple
device implementations for different hardware models and/or configurations.
Note that the driver can be compiled to target only one NPU configuration, by
specializing the device part at compile time.

## Data caching

When running the driver on Arm CPUs that are configured with a data cache, the
cache maintenance functions in the driver are exported as weakly linked
symbols that should be overridden. An example implementation using the CMSIS
primitives found in `cachel1_armv7.h` could look like this:

```[C++]
#include <cstddef>
#include <cstdint>
// The CMSIS device header for the target must also be included;
// it provides the SCB_* cache maintenance functions used below.

extern "C" {
// Clean (write back) the given range, or the whole data cache if p is null
void ethosu_flush_dcache(uint32_t *p, size_t bytes) {
    if (p)
        SCB_CleanDCache_by_Addr(p, bytes);
    else
        SCB_CleanDCache();
}

// Invalidate the given range, or the whole data cache if p is null
void ethosu_invalidate_dcache(uint32_t *p, size_t bytes) {
    if (p)
        SCB_InvalidateDCache_by_Addr(p, bytes);
    else
        SCB_InvalidateDCache();
}
}
```

## Mutexes and semaphores

To ensure the correct functionality of the driver, mutexes and semaphores are
used internally. The default implementations of mutexes and semaphores are
designed for a single-threaded bare-metal environment. Hence, for integration
in environments where multi-threading is possible, e.g., an RTOS, the user is
responsible for providing the implementations of mutexes and semaphores to be
used by the driver.

The mutex and semaphore APIs are defined as weakly linked functions that can be
overridden by the user. The APIs follow the usual conventions and are described
below:

```[C]
// create a mutex and return a handle to it
void *ethosu_mutex_create(void);
// lock the given mutex
void ethosu_mutex_lock(void *mutex);
// unlock the given mutex
void ethosu_mutex_unlock(void *mutex);

// create a (binary) semaphore and return a handle to it
void *ethosu_semaphore_create(void);
// take the given semaphore
void ethosu_semaphore_take(void *sem);
// give the given semaphore
void ethosu_semaphore_give(void *sem);
```

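
As an illustration of what an override might look like in a multi-threaded
environment, the following is a minimal sketch that backs the six functions
declared above with POSIX threads. It is a host-side example, not the driver's
default implementation or an RTOS port. `struct binary_sem` is a hypothetical
helper type, the assumption that a new semaphore starts out empty is not stated
in this document, and error handling is kept to a minimum.

```c
#include <pthread.h>
#include <stdbool.h>
#include <stdlib.h>

/* Mutex: a heap-allocated pthread mutex returned as an opaque handle. */
void *ethosu_mutex_create(void) {
    pthread_mutex_t *mutex = malloc(sizeof(*mutex));
    if (mutex != NULL && pthread_mutex_init(mutex, NULL) != 0) {
        free(mutex);
        mutex = NULL;
    }
    return mutex;
}

void ethosu_mutex_lock(void *mutex) {
    pthread_mutex_lock((pthread_mutex_t *)mutex);
}

void ethosu_mutex_unlock(void *mutex) {
    pthread_mutex_unlock((pthread_mutex_t *)mutex);
}

/* Hypothetical binary semaphore built from a mutex, a condition variable
 * and a flag. */
struct binary_sem {
    pthread_mutex_t lock;
    pthread_cond_t cond;
    bool available;
};

void *ethosu_semaphore_create(void) {
    struct binary_sem *sem = malloc(sizeof(*sem));
    if (sem == NULL) {
        return NULL;
    }
    pthread_mutex_init(&sem->lock, NULL);
    pthread_cond_init(&sem->cond, NULL);
    sem->available = false; /* assumption: the semaphore is created empty */
    return sem;
}

/* Blocks until the semaphore has been given, then consumes it. */
void ethosu_semaphore_take(void *sem) {
    struct binary_sem *s = sem;
    pthread_mutex_lock(&s->lock);
    while (!s->available) {
        pthread_cond_wait(&s->cond, &s->lock);
    }
    s->available = false;
    pthread_mutex_unlock(&s->lock);
}

/* Marks the semaphore as available and wakes one waiting thread. */
void ethosu_semaphore_give(void *sem) {
    struct binary_sem *s = sem;
    pthread_mutex_lock(&s->lock);
    s->available = true;
    pthread_cond_signal(&s->cond);
    pthread_mutex_unlock(&s->lock);
}
```

Because the semaphore is binary, giving it several times before a take still
lets only one take proceed. An RTOS integration would follow the same shape,
with each function forwarding to the corresponding RTOS primitive.
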
## Begin/End inference callbacks

The driver provides weakly linked functions as hooks to receive callbacks
whenever an inference begins and ends. The user can override these functions
when needed. To avoid memory leaks, any allocation done in
`ethosu_inference_begin()` must be balanced by a corresponding free of the
memory in the `ethosu_inference_end()` callback.

```[C]
void ethosu_inference_begin(struct ethosu_driver *drv, void *user_arg);
void ethosu_inference_end(struct ethosu_driver *drv, void *user_arg);
```

Note that the `void *user_arg` pointer passed to the invoke function is the same
pointer that is passed to the `ethosu_inference_begin()` and
`ethosu_inference_end()` callbacks. For example:

```[C]
void my_function() {
    ...
    struct my_data data = {...};
    int result = ethosu_invoke_v3(drv,
                                  custom_data_ptr,
                                  custom_data_size,
                                  base_addr,
                                  base_addr_size,
                                  num_base_addr,
                                  (void *)&data);
    ...
}

void ethosu_inference_begin(struct ethosu_driver *drv, void *user_arg) {
    struct my_data *data = (struct my_data *)user_arg;
    // use drv and data here
}

void ethosu_inference_end(struct ethosu_driver *drv, void *user_arg) {
    struct my_data *data = (struct my_data *)user_arg;
    // use drv and data here
}
```

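
One possible use of these hooks is to collect per-inference statistics through
`user_arg`. The sketch below is only an illustration under stated assumptions:
`struct inference_stats` is a hypothetical type, `clock()` from the C standard
library stands in for whatever timer or cycle counter the target provides, and
`struct ethosu_driver` is only forward-declared so that the example compiles
without the driver headers.

```c
#include <stddef.h>
#include <time.h>

/* Opaque here; the real definition comes from ethosu_driver.h. */
struct ethosu_driver;

/* Hypothetical statistics record owned by the caller of the invoke function. */
struct inference_stats {
    unsigned begun;      /* number of begin callbacks seen */
    unsigned ended;      /* number of end callbacks seen */
    clock_t started_at;  /* timestamp taken in the begin callback */
    clock_t total_ticks; /* accumulated time between begin and end */
};

void ethosu_inference_begin(struct ethosu_driver *drv, void *user_arg) {
    (void)drv;
    struct inference_stats *stats = user_arg;
    if (stats == NULL) {
        return; /* the inference was invoked without a user argument */
    }
    stats->begun++;
    stats->started_at = clock();
}

void ethosu_inference_end(struct ethosu_driver *drv, void *user_arg) {
    (void)drv;
    struct inference_stats *stats = user_arg;
    if (stats == NULL) {
        return;
    }
    stats->ended++;
    stats->total_ticks += clock() - stats->started_at;
}
```

Because the record is owned by the caller and only updated in place, neither
callback allocates memory, so there is nothing to free in the end callback.
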
## License

The Arm Ethos-U core driver is provided under an Apache-2.0 license. Please see
[LICENSE.txt](LICENSE.txt) for more information.

## Contributions

The Arm Ethos-U project welcomes contributions under the Apache-2.0 license.

Before we can accept your contribution, you need to certify its origin and give
us your permission. For this process we use the Developer Certificate of Origin
(DCO) V1.1 (https://developercertificate.org).

To indicate that you agree to the terms of the DCO, you "sign off" your
contribution by adding a line with your name and e-mail address to every git
commit message. You must use your real name; no pseudonyms or anonymous
contributions are accepted. If there is more than one contributor, everyone
adds their name and e-mail to the commit message.

```[]
Author: John Doe <john.doe@example.org>
Date:   Mon Feb 29 12:12:12 2016 +0000

Title of the commit

Short description of the change.

Signed-off-by: John Doe <john.doe@example.org>
Signed-off-by: Foo Bar <foo.bar@example.org>
```

Contributions are code reviewed by Arm before they can be accepted into the
repository.

## Security

Please see [Security](SECURITY.md).

## Trademark notice

Arm, Cortex and Ethos are registered trademarks of Arm Limited (or its
subsidiaries) in the US and/or elsewhere.