Edge AI on Nordic

On-device machine learning with Nordic Semiconductor's Axon NPU, the nRF54LM20B SoC, and the Edge AI Add-on for nRF Connect SDK.

Nordic Semiconductor brought hardware-accelerated machine learning to the ultra-low-power BLE world in early 2026 with the nRF54LM20B — the first Nordic SoC to integrate a dedicated Axon Neural Processing Unit (NPU). This section is your guide to building production firmware that runs on-device inference (keyword spotting, audio classification, sensor analytics, anomaly detection) without round-tripping to the cloud.

The Axon NPU originated from Nordic's 2024 acquisition of Atlazo (San Diego), a startup specialising in always-on AI processors and energy management for tiny edge devices. The architecture has since been integrated into Nordic's own silicon and is being rolled out across the wireless portfolio (next stop: the nRF92 cellular-IoT modules).

Why on-device inference matters for nRF designs

  • Latency — Wake-word detection, gesture recognition, and keyword spotting need millisecond-class response times that a round trip to a cloud model cannot meet.
  • Power — Keeping the radio off most of the time and waking on a local inference event is dramatically cheaper than streaming sensor data.
  • Privacy — Audio and biometric data stay on the device.
  • Connectivity independence — The product still works when Wi-Fi or cellular is unavailable.

The Axon NPU lets you do all of this inside the same coin-cell-class power budget that defines the nRF54L Series, something that CPU-only inference on a Cortex-M core typically cannot sustain.

How FirmwareMaestro fits in

FirmwareMaestro generates Zephyr-native scaffolds that already wire up the nRF Connect SDK toolchain. For Edge AI projects, that means:

  • A prj.conf that pulls in the Edge AI Add-on (nrf_edgeai_lib) and the Axon NPU drivers (this and the next two items are sketched below)
  • A Devicetree overlay for the on-board microphone / IMU / sensor source
  • A main.c skeleton that initialises the NPU, loads a compiled model header, and runs an inference loop in a dedicated Zephyr work queue
  • Generated PRD, architecture, and HAL documents that explicitly call out the inference budget, model footprint, and DSP pre-processing path
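
To make the first item concrete, here is a minimal sketch of what such a prj.conf can look like. The CONFIG_NRF_EDGEAI* symbols are placeholders for whatever Kconfig options the Edge AI Add-on actually defines; CONFIG_AUDIO and CONFIG_AUDIO_DMIC are standard Zephyr symbols for a PDM microphone front end.

```conf
# Edge AI Add-on + Axon NPU driver (placeholder symbol names --
# check the add-on's Kconfig for the real ones)
CONFIG_NRF_EDGEAI=y
CONFIG_NRF_EDGEAI_AXON=y

# PDM microphone front end (standard Zephyr audio symbols)
CONFIG_AUDIO=y
CONFIG_AUDIO_DMIC=y

# General plumbing
CONFIG_LOG=y
CONFIG_HEAP_MEM_POOL_SIZE=16384
```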
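
The Devicetree overlay is board- and sensor-specific. For a PDM microphone on an nRF54L-class part it could look roughly like the sketch below; the pdm20 node label and the pin assignments are assumptions that have to match the actual board schematic.

```dts
/* Hypothetical PDM microphone overlay; node label and pins are
 * board-specific assumptions. */
&pinctrl {
	pdm_default: pdm_default {
		group1 {
			psels = <NRF_PSEL(PDM_CLK, 1, 8)>,
				<NRF_PSEL(PDM_DIN, 1, 9)>;
		};
	};
};

&pdm20 {
	status = "okay";
	pinctrl-0 = <&pdm_default>;
	pinctrl-names = "default";
};
```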
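
Finally, an abbreviated view of the generated main.c. The work-queue plumbing uses standard Zephyr kernel APIs; the nrf_edgeai_* calls shown in comments stand in for the Edge AI Add-on's real model-loading and inference API, whose exact names and signatures may differ.

```c
#include <zephyr/kernel.h>
#include <zephyr/logging/log.h>

/* The generator emits a compiled model header; the name is illustrative. */
/* #include "model_axon.h" */

LOG_MODULE_REGISTER(edge_ai_app, LOG_LEVEL_INF);

#define FEATURE_COUNT    640  /* e.g. 40 MFCCs x 16 frames (assumption) */
#define INFER_STACK_SIZE 2048
#define INFER_PRIORITY   5

K_THREAD_STACK_DEFINE(infer_stack, INFER_STACK_SIZE);
static struct k_work_q infer_work_q;
static struct k_work infer_work;

/* Feature window filled by the DSP pre-processing path. */
static float features[FEATURE_COUNT];

/* Runs on the dedicated queue, never the system work queue, so a slow
 * model pass cannot stall BLE or other system work items. */
static void infer_handler(struct k_work *work)
{
	ARG_UNUSED(work);

	float score = 0.0f;

	/* Hypothetical add-on call: run the model on the Axon NPU and
	 * block until the result is ready:
	 *
	 *     score = nrf_edgeai_run(features, FEATURE_COUNT);
	 */
	(void)features;

	if (score > 0.8f) {
		LOG_INF("keyword detected (score %d%%)", (int)(score * 100.0f));
		/* Wake the radio / notify the application from here. */
	}
}

/* Called from the audio/DSP path whenever a feature window is ready. */
void on_feature_window_ready(void)
{
	k_work_submit_to_queue(&infer_work_q, &infer_work);
}

int main(void)
{
	/* Hypothetical add-on call: power the NPU and load the model:
	 *
	 *     nrf_edgeai_init();
	 */
	k_work_queue_start(&infer_work_q, infer_stack,
			   K_THREAD_STACK_SIZEOF(infer_stack),
			   INFER_PRIORITY, NULL);
	k_work_init(&infer_work, infer_handler);

	return 0;
}
```

Keeping inference on its own work queue is the design point worth noting: a long model pass on the system work queue could starve BLE events and timers, while a dedicated queue isolates it at a priority you control.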

Start with the Getting Started workflow.
