TinyML vs MCU Agents — Not the Same Thing

// last reviewed 2026-05-22 · Marcus Rüb

TinyML is the practice of running machine learning inference on microcontrollers and other deeply constrained devices. An MCU agent is an architectural pattern (a firmware design) that may include TinyML inference as one of several components, alongside sensor I/O, state management, communication, and off-device delegation.

Using these terms interchangeably is a common source of confusion. A device running TinyML is not automatically an MCU agent. An MCU agent does not necessarily use TinyML at all.

How does TinyML actually work?

TinyML focuses on deploying trained ML models — primarily neural networks — onto devices with:

No operating system or a minimal RTOS.
Kilobytes to low megabytes of SRAM.
No floating-point hardware (Cortex-M0) or only a modest FPU (single precision on Cortex-M4F; single or optional double precision on Cortex-M7).

The enabling stack typically involves:

Model training on a GPU workstation or cloud (PyTorch, TensorFlow, JAX).
Model compression: pruning, knowledge distillation, architecture search (MobileNet, EfficientNet-Lite, custom CNN).
Quantization: converting float32 weights to int8 (post-training quantization or quantization-aware training).
Conversion: exporting to TFLite flatbuffer (.tflite) or ONNX.
Runtime: TensorFlow Lite Micro (TFLM), microTVM, Edge Impulse EON, or CMSIS-NN kernels.
Deployment: flashed as part of firmware; inference called as a function.
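
The quantization step in the list above can be illustrated with a minimal sketch of affine int8 quantization, where a real value is approximated as scale * (q - zero_point). The type and function names here (quant_params_t, quant_params_from_range, quantize, dequantize) are hypothetical and not taken from any particular runtime; real converters pick the range from calibration data or from training, which this sketch does not cover.

```c
#include <assert.h>
#include <stdint.h>

/* Affine int8 parameters: real_value ~= scale * (q - zero_point). */
typedef struct {
    float scale;
    int32_t zero_point;
} quant_params_t;

/* Round half away from zero without needing libm. */
static int32_t round_to_int(float v)
{
    return (int32_t)(v >= 0.0f ? v + 0.5f : v - 0.5f);
}

/* Derive parameters that map [min_val, max_val] onto [-128, 127]. */
quant_params_t quant_params_from_range(float min_val, float max_val)
{
    /* Widen the range to include 0.0 so that zero has an int8 code. */
    if (min_val > 0.0f) min_val = 0.0f;
    if (max_val < 0.0f) max_val = 0.0f;
    assert(max_val > min_val);

    quant_params_t p;
    p.scale = (max_val - min_val) / 255.0f;
    p.zero_point = round_to_int(-128.0f - min_val / p.scale);
    return p;
}

/* float32 -> int8, saturating at the ends of the int8 range. */
int8_t quantize(float x, quant_params_t p)
{
    int32_t q = round_to_int(x / p.scale) + p.zero_point;
    if (q < -128) q = -128;
    if (q > 127) q = 127;
    return (int8_t)q;
}

/* int8 -> float32 approximation of the original value. */
float dequantize(int8_t q, quant_params_t p)
{
    return p.scale * (float)((int32_t)q - p.zero_point);
}
```

For a value inside the chosen range, the round trip through quantize and dequantize loses at most about half a scale step. Values outside the range saturate, which is why the range is normally chosen from representative data.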

The output of TinyML is a classification label or regression value for a given input window. It tells you what the sensor data looks like; it does not tell you what to do about it.

How does an MCU agent relate to TinyML?

An MCU agent is the broader firmware architecture. TinyML is a capability that can live inside it:

┌─────────────────────────────────────┐
│           MCU Agent                 │
│  ┌─────────────┐  ┌──────────────┐  │
│  │ Sensor I/O  │→ │ Feature Ext. │  │
│  └─────────────┘  └──────┬───────┘  │
│                          │          │
│                   ┌──────▼───────┐  │
│                   │  TinyML      │  │ ← OPTIONAL COMPONENT
│                   │  Inference   │  │
│                   └──────┬───────┘  │
│                          │          │
│  ┌────────────────────┐  │          │
│  │  Agent State       │←─┘          │
│  │  Machine           │             │
│  └──────────┬─────────┘             │
│             │                       │
│  ┌──────────▼─────────┐             │
│  │  Communication     │             │
│  │  (MQTT / CoAP)     │             │
│  └────────────────────┘             │
└─────────────────────────────────────┘

An MCU agent without TinyML uses threshold logic, rule-based conditions, or classical DSP features (RMS, FFT peak, zero-crossing rate). This is entirely valid — many production agents have no ML at all.
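
As a rough illustration of the classical features mentioned above, the following sketch computes a mean-square energy value and a zero-crossing count over a window of int16 samples. The function names are hypothetical. Comparing the mean square against the squared threshold gives the same decision as comparing RMS against the threshold, and it avoids a square root on cores without an FPU.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Mean of the squared samples (RMS squared), using integer arithmetic only. */
uint32_t mean_square(const int16_t *samples, size_t n)
{
    assert(samples != NULL && n > 0);
    uint64_t acc = 0;
    for (size_t i = 0; i < n; i++) {
        int32_t s = samples[i];
        acc += (uint64_t)(s * s);   /* |s| <= 32768, so s*s fits in int32 */
    }
    return (uint32_t)(acc / n);
}

/* Number of sign changes between consecutive samples. */
size_t zero_crossings(const int16_t *samples, size_t n)
{
    assert(samples != NULL);
    size_t count = 0;
    for (size_t i = 1; i < n; i++) {
        if ((samples[i - 1] < 0) != (samples[i] < 0)) {
            count++;
        }
    }
    return count;
}

/* True when the window's RMS is above rms_threshold (no sqrt needed). */
bool rms_exceeds(const int16_t *samples, size_t n, uint16_t rms_threshold)
{
    uint32_t limit = (uint32_t)rms_threshold * rms_threshold;
    return mean_square(samples, n) > limit;
}
```

A rule-based agent could feed rms_exceeds straight into its state machine with no model involved.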

An MCU agent with TinyML uses the inference result as one input to its state machine. The model output is not the agent’s action; it is one signal the state machine considers.
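
One way to picture "the model output is one signal" is the sketch below. It assumes a hypothetical agent that raises an alert only when the model's fault score is high, the motor is actually running, and the condition persists for several consecutive windows. The states, field names, and the two constants are illustrative assumptions, not values from any specific product.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

typedef enum { AGENT_IDLE, AGENT_SUSPECT, AGENT_ALERT } agent_state_t;

typedef struct {
    agent_state_t state;
    uint8_t consecutive_hits;
} agent_t;

/* Inputs the state machine weighs; the model score is only one of them. */
typedef struct {
    uint8_t fault_score;   /* model output rescaled to 0..255 (assumed) */
    bool motor_running;    /* from a separate digital input (assumed) */
} agent_inputs_t;

#define SCORE_THRESHOLD 200u   /* hypothetical confidence cut-off */
#define HITS_TO_ALERT   3u     /* hypothetical persistence requirement */

/* Advance one inference window; returns true when an alert should be
 * published. An alert is reported once per episode, not on every window. */
bool agent_step(agent_t *a, agent_inputs_t in)
{
    assert(a != NULL);
    bool hit = in.motor_running && in.fault_score >= SCORE_THRESHOLD;

    if (!hit) {
        a->consecutive_hits = 0;
        a->state = AGENT_IDLE;
        return false;
    }
    if (a->consecutive_hits < HITS_TO_ALERT) {
        a->consecutive_hits++;
    }
    if (a->consecutive_hits < HITS_TO_ALERT) {
        a->state = AGENT_SUSPECT;
        return false;
    }
    bool newly_alerting = (a->state != AGENT_ALERT);
    a->state = AGENT_ALERT;
    return newly_alerting;
}
```

With this design a single high-confidence inference never triggers an action by itself. The decision belongs to the state machine, which can be retuned without retraining the model.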

Side-by-side comparison

Dimension	TinyML	MCU Agent
Definition	Running ML inference on constrained hardware	Firmware architecture: perceive, decide, act, communicate
Scope	Inference layer only	Full system design
ML requirement	Required by definition	Optional — many agents use no ML
Communication	Not part of TinyML	Core component (MQTT, CoAP, HTTP)
State management	Not part of TinyML	Core component (FSM, policy params)
Off-device delegation	Not part of TinyML	Core component when local logic is insufficient
Tooling	TFLM, microTVM, Edge Impulse, CMSIS-NN	RTOS + MQTT/CoAP client + state machine, plus TFLM or another inference runtime when ML is used
Primary deliverable	Inference result (label, score)	System behavior (act, publish, delegate)

When does an MCU agent not need TinyML?

Most threshold-based agents do not need ML:

A temperature/humidity sensor that alerts when readings exceed calibrated limits.
A flow meter that flags if rate drops below a setpoint.
A door sensor that reports open/close events with a debounce filter.
A vibration sensor that triggers on RMS exceeding a threshold.

These are valid MCU agents. The state machine is simple rule logic. Adding ML to these would add inference latency, RAM pressure, and model maintenance overhead for no improvement in detection quality.
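
To show how small such rule logic can be, here is a minimal sketch of a counter-based debounce filter of the kind a door sensor might use. The struct and function names are hypothetical, and the number of agreeing samples required is an assumption that would be tuned per switch and sampling rate.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

typedef struct {
    bool stable;       /* last accepted (debounced) level */
    uint8_t count;     /* consecutive raw samples that disagree with stable */
    uint8_t required;  /* disagreeing samples needed to accept a change */
} debounce_t;

/* Feed one raw sample; returns true when the debounced level changes,
 * which is the moment the agent would report an open/close event. */
bool debounce_update(debounce_t *d, bool raw)
{
    assert(d != NULL && d->required > 0);

    if (raw == d->stable) {
        d->count = 0;          /* a bounce back to the old level resets the run */
        return false;
    }
    if (++d->count < d->required) {
        return false;          /* change not yet confirmed */
    }
    d->stable = raw;
    d->count = 0;
    return true;
}
```

The whole decision path is a few comparisons and one byte of counter state, which is the baseline any added model would have to justify itself against.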

When does TinyML add genuine value inside an agent?

Pattern classification where thresholds are inadequate: “Is this vibration signature from bearing wear or from normal motor operation?” — not answerable by a single threshold; needs a classifier trained on labeled examples from both classes.
Keyword spotting: Detecting specific audio commands from a microphone stream. Rule-based phoneme detection is impractical, whereas small CNN or RNN classifiers work well in roughly 30 KB of SRAM.
Sensor fusion: Combining temperature + humidity + pressure + CO₂ into a single “air quality anomaly” signal. A trained model handles the correlation between these features better than a hand-coded multi-dimensional threshold.
Novelty/anomaly detection: When you cannot enumerate all failure modes in advance. A trained autoencoder or one-class classifier flags inputs that differ from normal operation without requiring labeled anomaly examples.
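
For the autoencoder case, the decision step that follows inference can be sketched as a reconstruction-error check. This sketch assumes the autoencoder's input and output tensors are int8 and share the same quantization parameters, and that the threshold was calibrated offline on normal data. The function names are hypothetical, and the model invocation itself is left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Mean squared difference between the model input and its reconstruction,
 * measured in quantized (int8) units. */
uint32_t reconstruction_mse(const int8_t *input,
                            const int8_t *reconstructed,
                            size_t n)
{
    assert(input != NULL && reconstructed != NULL && n > 0);
    uint64_t acc = 0;
    for (size_t i = 0; i < n; i++) {
        int32_t d = (int32_t)input[i] - (int32_t)reconstructed[i];
        acc += (uint64_t)(d * d);   /* |d| <= 255, so d*d fits in int32 */
    }
    return (uint32_t)(acc / n);
}

/* Flag the window when the error exceeds a threshold calibrated on
 * normal-operation data (threshold value is an assumption). */
bool is_anomalous(uint32_t mse, uint32_t threshold)
{
    return mse > threshold;
}
```

The flag is still only an input: an agent would typically pass it to its state machine rather than act on a single anomalous window.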

Memory footprint: TinyML component vs full MCU agent

On an ESP32-S3 (512 KB SRAM):

Component	Approximate SRAM usage
FreeRTOS kernel	5–10 KB
3 application tasks (4 KB stack each)	12 KB
esp-mqtt TLS session	50–70 KB
Agent state machine + buffers	2–5 KB
Sensor sample buffers (1 sec at 1 kHz)	4 KB (float32)
TFLM inference arena (keyword model)	30–60 KB
TFLM inference arena (image classifier 96×96)	~150–200 KB
Total (simple keyword agent)	~103–161 KB (within budget)
Total (image classifier agent)	~223–301 KB (tight; needs careful allocation)

The TFLM inference arena and the TLS session are the two largest SRAM consumers in this budget, and with an image classifier the arena dominates everything else. Size the inference component first, then fit the rest around it.
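
The budgeting arithmetic can be kept next to the firmware as a small helper. The sketch below sums only the itemized rows of the table above, so it excludes any overhead the table does not list. The type and function names are hypothetical and the figures are the table's approximations, not measurements.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

typedef struct {
    const char *name;
    uint16_t min_kb;
    uint16_t max_kb;
} sram_item_t;

typedef struct {
    uint32_t min_kb;
    uint32_t max_kb;
} sram_range_t;

/* Non-inference rows, copied from the table (approximate KB). */
const sram_item_t BASE_ITEMS[] = {
    {"FreeRTOS kernel",                5, 10},
    {"3 application task stacks",     12, 12},
    {"esp-mqtt TLS session",          50, 70},
    {"Agent state machine + buffers",  2,  5},
    {"Sensor sample buffers",          4,  4},
};

/* Total SRAM range for an agent built from the base rows plus one arena. */
sram_range_t agent_budget(sram_item_t arena)
{
    assert(arena.min_kb <= arena.max_kb);
    sram_range_t total = { arena.min_kb, arena.max_kb };
    for (size_t i = 0; i < sizeof BASE_ITEMS / sizeof BASE_ITEMS[0]; i++) {
        total.min_kb += BASE_ITEMS[i].min_kb;
        total.max_kb += BASE_ITEMS[i].max_kb;
    }
    return total;
}

/* Conservative check: the worst-case total must fit the available SRAM. */
bool fits_in_sram(sram_range_t total, uint32_t available_kb)
{
    return total.max_kb <= available_kb;
}
```

Passing a keyword arena of 30–60 KB yields 103–161 KB, and an image arena of 150–200 KB yields 223–301 KB. The value given to fits_in_sram should be the SRAM actually free for the application on the target, which is an assumption to verify on hardware rather than the nominal 512 KB.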

Platform example: ForestHub.ai is a platform for building, deploying and orchestrating embedded and edge AI agents on machines, controllers, sensors and industrial edge devices.

FAQ

Q: Can TinyML run without an MCU agent architecture? Yes. A device can run inference and report results over UART to a host processor, with no RTOS, no MQTT, and no state machine. That is TinyML but not an MCU agent.

Q: Does Edge Impulse produce MCU agents? Edge Impulse produces the inference layer of an MCU agent (sensor DSP pipeline + model inference). The rest of the agent — the state machine, MQTT client, delegation logic — is the developer’s responsibility or is handled by a complementary platform.

Q: Is microTVM a TinyML framework or an MCU agent framework? TinyML framework only. microTVM (Apache TVM project) compiles ML models to efficient C/C++ code for MCU targets. It does not provide communication, state management, or delegation.

Q: Can a device run TinyML and an MCU agent on different cores? Yes. On the STM32H745 (dual-core: Cortex-M7 + Cortex-M4), a common pattern is to run the ML inference on the M7 and the real-time sensor acquisition and MQTT communication on the M4. The two cores communicate via shared SRAM with hardware semaphores.
