Axon NPU Architecture
Nordic's in-house ultra-low-power neural processing unit — 128 MHz, 3–8 GOPS, up to 15× faster than CPU TensorFlow Lite execution.
The Axon NPU is a dedicated hardware accelerator for neural-network inference, designed specifically for the always-on, battery-powered constraints of the nRF54L Series. It is built into the silicon alongside the Cortex-M33 application core rather than as a separate companion chip, so it shares SRAM and is woken and fed by the main CPU through low-overhead drivers.
Key numbers
| Parameter | Value |
|---|---|
| Origin | Atlazo (San Diego; acquired by Nordic in 2023) |
| Clock | 128 MHz |
| Throughput | 3–8 GOPS, workload-dependent |
| Speedup vs Cortex-M33 | up to 15× on TensorFlow Lite / LiteRT models |
| Vs competing edge AI parts | up to 7× higher performance, up to 8× better energy efficiency |
| First host SoC | nRF54LM20B |
| Future hosts | nRF92 Series (cellular IoT), additional wireless SoCs |
"Up to 15×" and "up to 7×/8×" are Nordic-reported, workload-dependent figures: they assume a model that fits the natively accelerated op set (see below) and uses INT8 quantisation, and the competitive comparison depends on the reference part Nordic chose. Models with layers that fall back to CPU execution will see a smaller speedup. Always benchmark your own model with the Axon Compiler's per-layer report before sizing a power budget around these numbers.
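As a sanity check before committing to a power budget, the arithmetic is simple enough to sketch in a few lines. Every figure below (per-inference energy, inference rate, sleep power, cell capacity) is a placeholder assumption for illustration, not an Axon measurement; substitute numbers from your own Axon Compiler report and bench measurements.

```python
# Back-of-the-envelope battery-life estimate for an always-on model.
# All constants are assumed placeholders -- replace with measured values.

ENERGY_PER_INFERENCE_UJ = 50.0   # assumed microjoules per inference
INFERENCES_PER_SECOND = 2.0      # assumed always-on duty cycle
BASELINE_SLEEP_UW = 5.0          # assumed system sleep power, microwatts

CELL_CAPACITY_MAH = 220.0        # typical CR2032 coin cell
CELL_VOLTAGE_V = 3.0

def battery_life_days(energy_uj, rate_hz, sleep_uw, capacity_mah, voltage_v):
    """Estimate runtime in days from average power draw."""
    inference_uw = energy_uj * rate_hz               # uJ/s equals uW
    avg_uw = inference_uw + sleep_uw
    capacity_uwh = capacity_mah * voltage_v * 1000.0  # mAh * V = mWh -> uWh
    return capacity_uwh / avg_uw / 24.0

days = battery_life_days(ENERGY_PER_INFERENCE_UJ, INFERENCES_PER_SECOND,
                         BASELINE_SLEEP_UW, CELL_CAPACITY_MAH, CELL_VOLTAGE_V)
print(f"Estimated battery life: {days:.0f} days")
```

With these assumed numbers the estimate comes out around 262 days; a model that falls back to the CPU would raise the per-inference energy and shrink this quickly.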
Natively accelerated operations
The Axon NPU accelerates the operations that dominate inference time in practical edge models:
- 1D and 2D convolutions
- Depthwise convolutions
- Fully connected (dense) layers
- Pooling layers
- Activation functions (the common ones used in quantised models)
Operations outside this set fall back to the Cortex-M33. The Axon Compiler reports exactly which layers can be hardware-accelerated and which will run on the CPU, including a per-layer inference-time estimate.
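The compiler's report is authoritative, but a rough pre-check of a model's op list can be done by hand. The accelerated-op set below is an illustrative approximation of the bullet list above, written as TensorFlow Lite builtin op names; the exact set Axon supports is defined by the Axon Compiler, not this sketch.

```python
# Rough pre-check: partition a model's ops into likely-NPU vs CPU fallback.
# AXON_ACCELERATED is an illustrative approximation of the op set listed
# above (TFLite builtin names); the Axon Compiler report is authoritative.

AXON_ACCELERATED = {
    "CONV_2D", "DEPTHWISE_CONV_2D", "FULLY_CONNECTED",
    "MAX_POOL_2D", "AVERAGE_POOL_2D",
    "RELU", "RELU6", "LOGISTIC", "TANH",   # common quantised activations
}

def split_ops(model_ops):
    """Partition an op list into (accelerated, cpu_fallback)."""
    accel = [op for op in model_ops if op in AXON_ACCELERATED]
    fallback = [op for op in model_ops if op not in AXON_ACCELERATED]
    return accel, fallback

# Hypothetical op list for a small keyword-spotting CNN:
ops = ["CONV_2D", "RELU", "DEPTHWISE_CONV_2D", "RELU6",
       "AVERAGE_POOL_2D", "FULLY_CONNECTED", "SOFTMAX"]
accel, fallback = split_ops(ops)
print("NPU:", accel)
print("CPU fallback:", fallback)
```

In this sketch `SOFTMAX` lands in the fallback list, which is exactly the kind of layer-level detail the compiler's per-layer report surfaces with a real time estimate attached.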
Quantisation
The NPU is designed for INT8 quantised models. This is the format TensorFlow Lite Micro / LiteRT produces by default with full integer quantisation, and it's what you'll get out of Edge Impulse and the Nordic Edge AI Lab. Float models can be compiled but will not see the full speedup or energy benefit.
The Axon Compiler reports quantisation loss as part of its metrics so you can decide whether the accuracy hit is acceptable for your application.
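The INT8 format in question is standard affine quantisation: a float x maps to q = round(x / scale) + zero_point, clamped to [-128, 127], and the round-trip error of that mapping is what shows up as quantisation loss. A minimal sketch (the scale and sample values are illustrative, not taken from any real model):

```python
# Minimal demonstration of TFLite-style INT8 affine quantisation:
#   q = clamp(round(x / scale) + zero_point, -128, 127)
# and the round-trip error reported as "quantisation loss".

def quantize(x, scale, zero_point):
    q = round(x / scale) + zero_point
    return max(-128, min(127, q))          # clamp to the int8 range

def dequantize(q, scale, zero_point):
    return (q - zero_point) * scale

# Assume an activation range of roughly [-1, 1): 256 int8 steps cover it,
# with zero_point 0 for a symmetric range.
scale, zero_point = 1.0 / 128, 0

for x in [-0.73, -0.1, 0.0, 0.42, 0.99]:
    q = quantize(x, scale, zero_point)
    x_hat = dequantize(q, scale, zero_point)
    print(f"x={x:+.3f}  q={q:+4d}  error={abs(x - x_hat):.5f}")
```

The worst-case round-trip error is half a step (scale / 2, here about 0.004 per value); whether the accumulated effect across a whole network is acceptable is what the compiler's quantisation-loss metric tells you.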
What it's good for
The Axon NPU was designed for the kind of always-on workloads that have historically been the bottleneck for battery-powered IoT:
- Keyword spotting (KWS) and wake-word detection — running a small audio model continuously without draining a coin cell
- Audio classification — door knocks, glass break, cough, alarm sounds
- Anomaly detection on sensor streams — vibration, current, pressure
- Gesture recognition — accelerometer / IMU based
- Health-signal classification — PPG, ECG (with appropriate clinical validation; see Medical guidance)
For simpler, very-low-rate sensor analytics, Nordic also supports Neuton custom models — ultra-efficient CPU-run models that don't need the NPU at all.
What it isn't
The Axon NPU is not a general-purpose GPU and will not run modern transformer / LLM workloads. The energy and area budget targets "always-on, sub-1 mW inference" — measured in microjoules per inference, not "frames per second of ResNet-50". Pick the right tool for the job:
| Workload | Right target |
|---|---|
| Wake-word, KWS, IMU classification, anomaly detection | Axon NPU on nRF54LM20B |
| Vibration / accelerometer analytics with tiny models | Neuton custom models on any nRF54L (CPU) |
| Vision, transformer inference, on-device LLMs | A different class of part — not Nordic's nRF Series |
Where to next
- Start building: Development tools
- Pick the host part: nRF54LM20B SoC
- Walk through a project: Getting started
nRF54LM20B SoC
The first Nordic SoC with an integrated Axon NPU — 128 MHz Arm Cortex-M33, 2 MB NVM, 512 KB RAM, BLE 6.0 / Matter / Thread / Zigbee, and high-speed USB.
Edge AI Development Tools
Edge AI Add-on v2.0 for nRF Connect SDK, the Axon Compiler, Nordic Edge AI Lab, Edge Impulse integration, and Neuton custom models.