# Overview {#mainpage}

## Introduction

This user manual describes the CMSIS DSP software library, a suite of common compute processing functions for use on Cortex-M and Cortex-A processor based devices.

The library is divided into a number of function groups, each covering a specific category:

 - \ref groupMath "Basic math functions"
 - \ref groupFastMath "Fast math functions"
 - \ref groupCmplxMath "Complex math functions"
 - \ref groupFilters "Filtering functions"
 - \ref groupMatrix "Matrix functions"
 - \ref groupTransforms "Transform functions"
 - \ref groupController "Motor control functions"
 - \ref groupStats "Statistical functions"
 - \ref groupSupport "Support functions"
 - \ref groupInterpolation "Interpolation functions"
 - \ref groupSVM "Support Vector Machine functions (SVM)"
 - \ref groupBayes "Bayes classifier functions"
 - \ref groupDistance "Distance functions"
 - \ref groupQuaternionMath "Quaternion functions"
 - \ref groupWindow "Window functions"

The library generally has separate functions for operating on 8-bit integers, 16-bit integers, 32-bit integers, 32-bit floating-point values and 64-bit floating-point values.

The library provides vectorized versions of most algorithms for Helium and of most f32 algorithms for Neon.

When using a vectorized version, provide a small amount of padding (3 words) after the end of each buffer, because the vectorized code may read slightly past the end of a buffer. You don't have to modify your buffers; just ensure that the end of the buffer plus the padding does not fall outside a valid memory region.

## Related projects

### Python wrapper

A Python wrapper is also available with a Python API as close as possible to the C one. It can be used to start developing and testing an algorithm with NumPy and SciPy before writing the C version. It is available on [PyPI.org](https://pypi.org/project/cmsisdsp/). It can be installed with: `pip install cmsisdsp`.

### Experimental C++ template extension

This extension is a set of C++ headers. They just need to be included to start using the features.

Those headers are not yet part of the pack and you need to get them from the [GitHub repository](https://github.com/ARM-software/CMSIS-DSP/tree/main/dsppp/Include).

See @ref dsppp_main "DSP++" for more documentation about the extension.

## Using the CMSIS-DSP Library {#using}

The library is released in source form. It is strongly advised to compile the library using `-Ofast` optimization to get the best performance.

The following options should be avoided:

* `-fno-builtin`
* `-ffreestanding` because it enables the previous option

The library does some [type punning](https://en.wikipedia.org/wiki/Type_punning) to process a 32-bit word from memory as a pair of `q15` or a quadruple of `q7` values. Those type manipulations are done through `memcpy` calls. Most compilers should be able to optimize out those function calls when the length to copy is small (4 bytes).

This optimization will **not** occur when `-fno-builtin` is used and it will have a **very bad** impact on performance.

The library functions are declared in the public file `Include/arm_math.h`. Simply include this file to use the CMSIS-DSP library. If you don't want to include everything, you can also rely on individual header files from the `Include/dsp/` folder and include only those that are needed in the project.

## Examples {#example}

The library ships with a number of examples which demonstrate how to use the library functions. Please refer to \ref groupExamples.

## Toolchain Support {#toolchain}

The library is now tested on Fast Models, building with CMake. The Cortex-M0, M4, M7, M33 and M55 cores are tested.
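
As an illustration of the `memcpy`-based type punning described under "Using the CMSIS-DSP Library", here is a minimal sketch in standard C. The helper names `load_q15_pair` and `store_q15_pair` and the `sketch_q15_t` typedef are hypothetical, introduced only for this example; they are not the library's API, and the library's actual helpers may differ.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Local stand-in for a 16-bit fixed-point sample (assumption for this sketch). */
typedef int16_t sketch_q15_t;

/* The punning below only makes sense if two samples fill one 32-bit word. */
static_assert(sizeof(uint32_t) == 2 * sizeof(sketch_q15_t),
              "two q15 samples must fill one 32-bit word");

/* Hypothetical helper: load two adjacent 16-bit samples as one 32-bit word.
   memcpy avoids strict-aliasing and alignment problems; with builtins
   enabled, compilers typically reduce this 4-byte copy to a single load. */
static inline uint32_t load_q15_pair(const sketch_q15_t *src)
{
    uint32_t word;
    memcpy(&word, src, sizeof word);
    return word;
}

/* Hypothetical helper: write a 32-bit word back as two 16-bit samples. */
static inline void store_q15_pair(sketch_q15_t *dst, uint32_t word)
{
    memcpy(dst, &word, sizeof word);
}
```

When builtins are disabled (for example with `-fno-builtin`), each of these 4-byte copies is likely to remain a real call to `memcpy`, which is the performance cost the section on using the library warns about.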

## Access to CMSIS-DSP {#pack}

CMSIS-DSP is actively maintained in the [**CMSIS-DSP GitHub repository**](https://github.com/ARM-software/CMSIS-DSP) and is released as a standalone [**CMSIS-DSP pack**](https://www.keil.arm.com/packs/cmsis-dsp-arm/versions/) in the [CMSIS-Pack format](https://open-cmsis-pack.github.io/Open-CMSIS-Pack-Spec/main/html/index.html).

The table below explains the contents of the **ARM::CMSIS-DSP** pack.

 Directory                             | Description
:--------------------------------------|:------------------------------------------------------
 ComputeLibrary                        | Small Neon kernels when building on Cortex-A
 Documentation                         | Folder with this CMSIS-DSP documentation
 Example                               | Example projects demonstrating the usage of the library functions
 Include                               | Include files for using and building the lib
 PrivateInclude                        | Private include files for building the lib
 Source                                | Source files
 ARM.CMSIS-DSP.pdsc                    | CMSIS-Pack description file
 LICENSE                               | License Agreement (Apache 2.0)

See [CMSIS Documentation](https://arm-software.github.io/CMSIS_6/) for an overview of CMSIS software components, tools and specifications.

## Preprocessor Macros {#preprocessor}

Each library project has different preprocessor macros.

 - `ARM_MATH_BIG_ENDIAN`:
   - Define macro ARM_MATH_BIG_ENDIAN to build the library for big endian targets. By default, the library builds for little endian targets.

 - `ARM_MATH_MATRIX_CHECK`:
   - Define macro ARM_MATH_MATRIX_CHECK to check the input and output sizes of matrices.

 - `ARM_MATH_ROUNDING`:
   - Define macro ARM_MATH_ROUNDING to enable rounding in support functions.

 - `ARM_MATH_LOOPUNROLL`:
   - Define macro ARM_MATH_LOOPUNROLL to enable manual loop unrolling in DSP functions.

 - `ARM_MATH_NEON`:
   - Define macro ARM_MATH_NEON to enable Neon versions of the DSP functions.
     It is not enabled by default when Neon is available because performance depends on the compiler and target architecture.

 - `ARM_MATH_NEON_EXPERIMENTAL`:
   - Define macro ARM_MATH_NEON_EXPERIMENTAL to enable experimental Neon versions of some DSP functions. Experimental Neon versions currently do not have better performance than the scalar versions.

 - `ARM_MATH_HELIUM`:
   - Implies the flags ARM_MATH_MVEF, ARM_MATH_MVEI and ARM_MATH_MVE_FLOAT16.

 - `ARM_MATH_HELIUM_EXPERIMENTAL`:
   - Only taken into account when ARM_MATH_MVEF, ARM_MATH_MVEI or ARM_MATH_MVE_FLOAT16 is defined. Enables some vector versions which may perform worse than the scalar versions depending on the core / compiler configuration.

 - `ARM_MATH_MVEF`:
   - Select Helium versions of the f32 algorithms. It implies ARM_MATH_FLOAT16 and ARM_MATH_MVEI.

 - `ARM_MATH_MVEI`:
   - Select Helium versions of the int and fixed point algorithms.

 - `ARM_MATH_MVE_FLOAT16`:
   - MVE Float16 implementations of some algorithms (requires the MVE extension).

 - `DISABLEFLOAT16`:
   - Disable float16 algorithms when `__fp16` is not supported for a specific compiler / core configuration. This is only valid for scalar code. When the vector architecture supports f16, it can't be disabled.

 - `ARM_MATH_AUTOVECTORIZE`:
   - With Helium or Neon, disable the use of vectorized code with C intrinsics and use pure C instead. The vectorization is then done by the compiler.

 - `ARM_DSP_ATTRIBUTE`: Can be set to define CMSIS-DSP functions as weak functions. This can either be set on the command line when building or in a new `arm_dsp_config.h` header (see below).

 - `ARM_DSP_TABLE_ATTRIBUTE`: Can be set to define in which section constant tables must be mapped. This can either be set on the command line when building or in a new `arm_dsp_config.h` header (see below).
   Another way to set those sections is to modify the linker scripts, since the constant tables are defined only in a restricted set of source files.

 - `ARM_DSP_CUSTOM_CONFIG`: When set, the file `arm_dsp_config.h` is included by the `arm_math_types.h` header. You can use this file to define any of the above compilation symbols.

## Code size

Previous versions used many compilation flags to control code size, enabled with `ARM_DSP_CONFIG_TABLES`. This became too complex and has been removed. Code size optimization now relies on the linker.

You no longer need to use any compilation flags like `ARM_TABLE_TWIDDLECOEF_F32_2048`, `ARM_FFT_ALLOW_TABLES`, etc. They have been removed.

Constant tables can use a lot of read-only memory, but the linker can remove the unused functions and constant tables if it can deduce that those tables or functions are not used.

For this you need to use the right initialization functions in the library and the right options for the linker (they are compiler dependent).

For all transform functions (CFFT, RFFT ...), instead of using a generic initialization function that works for all lengths (like `arm_cfft_init_f32`), use a dedicated initialization function for a specific size (like `arm_cfft_init_1024_f32`).

By using the right initialization function, you're telling the linker what is really used.

If you use a generic function, the linker cannot deduce the used lengths and thus will keep all the constant tables required for each length.

Then you need to use the right options for the compiler so that the unused tables and functions are removed. These options are compiler dependent, but they are generally named like `-ffunction-sections`, `-fdata-sections`, `--gc-sections` ...

## Variations between the architectures

Some algorithms may give slightly different results on different architectures (like M0 or M4/M7 or M55).
It is a tradeoff made for speed reasons and to make the best use of the different instruction sets.

All algorithms are compared with a double precision reference, and the different versions (for different architectures) have the same characteristics when compared to the double precision reference (SNR bound, maximum bound for sample error ...).

As a consequence, the small differences that may exist between the different architecture implementations should be too small to have any practical consequences.

## License {#license}

CMSIS-DSP is provided free of charge under the [Apache 2.0 License](https://raw.githubusercontent.com/ARM-software/CMSIS-DSP/main/LICENSE).