Active Inference

A comprehensive introduction to Active Inference and its application to AI agents.

Overview

Active Inference is a unified theory from neuroscience that explains how biological agents perceive, learn, and act. It provides a mathematical framework for building agents that:

Minimize surprise about their sensory observations
Balance exploration (learning) vs exploitation (goal achievement)
Adapt automatically when predictions fail

This document explains the theory and how LRS-Agents implements it.

The Core Principle

Free Energy Minimization

Active Inference is based on the Free Energy Principle, which states:

Biological agents act to minimize their free energy - a measure of surprise about sensory observations.

In the simplest case, where the agent's beliefs match the exact posterior, free energy equals surprise:

\[F = -\ln P(o | m)\]

where:

\(F\) = Free energy (surprise)
\(o\) = Observations
\(m\) = Internal model of the world
\(P(o | m)\) = Probability of observations given the model

Lower free energy = Better predictions = Less surprise
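As a quick numerical illustration of the formula above, the following sketch computes surprise for one observation under two models. The probabilities 0.9 and 0.1 are made-up values chosen only for illustration, and `surprise` is a hypothetical helper, not part of LRS-Agents.

```python
import math


def surprise(p_obs: float) -> float:
    """Return -ln P(o | m), the surprise of an observation under a model."""
    if not 0.0 < p_obs <= 1.0:
        raise ValueError("probability must be in (0, 1]")
    return -math.log(p_obs)


# Hypothetical probabilities that two models assign to the same observation.
good_model = surprise(0.9)  # the model expected this observation
poor_model = surprise(0.1)  # the model did not expect this observation

print(f"good model: {good_model:.3f} nats")
print(f"poor model: {poor_model:.3f} nats")
```

The model that assigned the observation a higher probability ends up with the lower surprise, which is the sense in which better predictions mean lower free energy.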

Two Ways to Minimize Free Energy

Agents can minimize free energy in two complementary ways:

Perception (Update beliefs)

Change the internal model \(m\) to better explain observations \(o\).

“The world surprised me, so I’ll update my beliefs.”

Action (Change world)

Act to make observations \(o\) more consistent with expectations.

“The world surprised me, so I’ll act to make it match my predictions.”

This is the fundamental loop of Active Inference:

┌─────────────────────────────────────┐
│                                     │
│  ┌──────────┐      ┌──────────┐   │
│  │  BELIEF  │─────→│  ACTION  │   │
│  │  UPDATE  │      │ SELECTION│   │
│  └────▲─────┘      └────┬─────┘   │
│       │                 │          │
│       │                 ▼          │
│  ┌────┴─────┐      ┌──────────┐   │
│  │PREDICTION│◄─────│   WORLD  │   │
│  │  ERROR   │      │   STATE  │   │
│  └──────────┘      └──────────┘   │
│                                     │
└─────────────────────────────────────┘

Active Inference for AI Agents

Traditional RL vs Active Inference

Reinforcement Learning:

Maximize expected reward
Separate exploration and exploitation mechanisms
No explicit uncertainty tracking
Requires manual exploration strategies (ε-greedy, etc.)

Active Inference:

Minimize expected free energy
Exploration and exploitation emerge naturally
Explicit uncertainty (precision) tracking
Automatic adaptation when uncertain

Key Insight

In Active Inference, exploration IS uncertainty reduction.

An agent explores not randomly, but to gain information that reduces uncertainty about the world. This makes exploration principled and efficient.

Mathematical Framework

Generative Model

The agent maintains a generative model \(P(o, s)\) that describes:

\(P(o | s)\) - How observations arise from hidden states
\(P(s)\) - Prior beliefs about states

The agent’s goal is to infer hidden states \(s\) from observations \(o\).

Variational Free Energy

Since exact inference is intractable, we use variational inference:

\[F = D_{KL}[Q(s) || P(s|o)] - \ln P(o)\]

where:

\(Q(s)\) = Approximate posterior (agent’s beliefs)
\(P(s|o)\) = True posterior (unknowable)
\(D_{KL}\) = Kullback-Leibler divergence

Minimizing \(F\) means:

Make \(Q(s)\) close to \(P(s|o)\) (accurate beliefs)
Maximize \(\ln P(o)\) (make observations likely)
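The decomposition above can be checked numerically on a toy discrete model. This is a minimal sketch: the two-state prior and likelihood values are invented for illustration, and the helper names are hypothetical.

```python
import math

# Hypothetical two-state generative model.
prior = [0.5, 0.5]       # P(s)
likelihood = [0.8, 0.2]  # P(o | s) for the observation actually received

# Model evidence P(o) and exact posterior P(s | o) via Bayes' rule.
evidence = sum(l * p for l, p in zip(likelihood, prior))
posterior = [l * p / evidence for l, p in zip(likelihood, prior)]


def kl(q, p):
    """Kullback-Leibler divergence D_KL[q || p] for discrete distributions."""
    return sum(qi * math.log(qi / pi) for qi, pi in zip(q, p) if qi > 0)


def free_energy(q):
    """F = D_KL[Q(s) || P(s|o)] - ln P(o)."""
    return kl(q, posterior) - math.log(evidence)


surprise = -math.log(evidence)
f_vague = free_energy([0.5, 0.5])  # beliefs not yet updated by the observation
f_exact = free_energy(posterior)   # beliefs equal to the exact posterior

print(f"surprise = {surprise:.3f}")
print(f"F (vague beliefs) = {f_vague:.3f}")
print(f"F (exact beliefs) = {f_exact:.3f}")
```

Because the KL term is never negative, \(F\) is an upper bound on surprise, and the bound is tight only when \(Q(s)\) equals the exact posterior.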

Expected Free Energy

For action selection, agents minimize Expected Free Energy \(G\):

\[G(\pi) = \mathbb{E}_{Q(o_\tau | \pi)}[F(o_\tau)] + D_{KL}[Q(s_\tau | \pi) || P(s_\tau | C)]\]

This decomposes into:

\[G(\pi) = \underbrace{\mathbb{E}[H[P(o|s)]]}_{\text{Epistemic value}} - \underbrace{\mathbb{E}[Q(s) \ln P(o|s,C)]}_{\text{Pragmatic value}}\]

where:

Epistemic value: Information gain (exploration)
Pragmatic value: Expected reward (exploitation)

The agent selects policies \(\pi\) that minimize \(G\).

Precision-Weighted Beliefs

Not all beliefs are equally certain. Precision \(\gamma\) weights prediction errors:

\[F = \gamma \cdot \text{Prediction Error}\]

High precision \(\gamma\) → Trust predictions more (exploit)
Low precision \(\gamma\) → Trust predictions less (explore)

Precision dynamics are central to adaptation in LRS-Agents.

How LRS-Agents Implements Active Inference

1. Generative Model

The agent’s generative model consists of:

Tools as actions that change world state
Belief state \(s\) as a dictionary of key-value pairs
Observations \(o\) from tool executions

# Belief state (internal model)
state = {
    'goal': 'fetch_data',
    'api_available': True,
    'cache_available': True,
    'data': None
}

# Tool execution (action) produces observation
observation = tool.get(state)
# observation.value, observation.error, observation.prediction_error

2. Precision Tracking

Precision \(\gamma \in [0, 1]\) represents confidence:

\[\gamma = \frac{\alpha}{\alpha + \beta}\]

This is the mean of a Beta distribution, whose parameters \(\alpha\) and \(\beta\) are updated after each observation:

\[\begin{split}\alpha &\leftarrow \alpha + \eta_{gain} \cdot (1 - \delta) \\ \beta &\leftarrow \beta + \eta_{loss} \cdot \delta\end{split}\]

where \(\delta\) is the prediction error.

Asymmetric learning rates:

\(\eta_{gain} = 0.1\) (slow increase)
\(\eta_{loss} = 0.2\) (fast decrease)

This creates a conservative bias: confidence builds slowly and is lost quickly after a surprising outcome.

from lrs.core.precision import PrecisionParameters

precision = PrecisionParameters()

# Success: slow increase
precision.update(prediction_error=0.1)  # γ: 0.5 → 0.52

# Failure: fast decrease
precision.update(prediction_error=0.9)  # γ: 0.52 → 0.42
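To make the update rule concrete, here is a standalone sketch of the Beta-style update written from the equations above. It is not the `lrs.core.precision` implementation: the class name `BetaPrecision`, the starting pseudo-counts \(\alpha = \beta = 1\), and the clamping of the error to [0, 1] are all assumptions, so the resulting numbers need not match the library's defaults.

```python
from dataclasses import dataclass


@dataclass
class BetaPrecision:
    """Illustrative Beta-style precision tracker (hypothetical, not the lrs class)."""

    alpha: float = 1.0     # assumed starting pseudo-count
    beta: float = 1.0      # assumed starting pseudo-count
    eta_gain: float = 0.1  # slow increase after accurate predictions
    eta_loss: float = 0.2  # fast decrease after surprising outcomes

    @property
    def value(self) -> float:
        """Precision gamma = alpha / (alpha + beta)."""
        return self.alpha / (self.alpha + self.beta)

    def update(self, prediction_error: float) -> float:
        """Apply one update and return the new precision."""
        delta = min(max(prediction_error, 0.0), 1.0)  # assumed clamp to [0, 1]
        self.alpha += self.eta_gain * (1.0 - delta)
        self.beta += self.eta_loss * delta
        return self.value


tracker = BetaPrecision()
after_success = tracker.update(0.1)  # small error: precision rises a little
after_failure = tracker.update(0.9)  # large error: precision drops further
print(round(after_success, 3), round(after_failure, 3))
```

Under these assumed starting values, the drop after the single large error is larger than the gain after the single small error, which is the asymmetry the two learning rates are meant to produce.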

3. Expected Free Energy Calculation

For each policy (tool sequence), calculate:

Epistemic Value (Information gain):

\[\text{Epistemic} = \sum_{t} H[P(o_t | s_t)]\]

Higher for novel/uncertain tools.

Pragmatic Value (Expected reward):

\[\text{Pragmatic} = \sum_{t} \gamma^t [P(\text{success}) \cdot R_{\text{success}} + P(\text{fail}) \cdot R_{\text{fail}}]\]

Higher for reliable tools with good outcomes.

Total G:

\[G = \text{Epistemic} - \text{Pragmatic}\]

Policies with lower \(G\) are preferred.

from lrs.core.free_energy import calculate_expected_free_energy

G = calculate_expected_free_energy(
    policy=[tool_a, tool_b],
    state=current_state,
    preferences={'success': 5.0, 'error': -3.0},
    historical_stats=registry.statistics
)
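The formulas in this section can also be worked by hand. The sketch below is not `calculate_expected_free_energy`; it is a simplified reading that makes several assumptions: each tool call has a binary success/failure outcome, \(\gamma^t\) in the pragmatic term is treated as a temporal discount factor (0.95 here), and the rewards reuse the preference values 5.0 and -3.0 from the call above.

```python
import math


def bernoulli_entropy(p: float) -> float:
    """Entropy in nats of a success/failure outcome with success probability p."""
    if p in (0.0, 1.0):
        return 0.0
    return -(p * math.log(p) + (1.0 - p) * math.log(1.0 - p))


def expected_free_energy(success_probs, r_success=5.0, r_fail=-3.0, discount=0.95):
    """G = Epistemic - Pragmatic for a policy given per-step success probabilities."""
    # Epistemic term: summed outcome entropy over the steps of the policy.
    epistemic = sum(bernoulli_entropy(p) for p in success_probs)
    # Pragmatic term: discounted expected reward over the steps of the policy.
    pragmatic = sum(
        discount ** t * (p * r_success + (1.0 - p) * r_fail)
        for t, p in enumerate(success_probs)
    )
    return epistemic - pragmatic


# Two hypothetical two-step policies with different tool reliabilities.
g_reliable = expected_free_energy([0.9, 0.9])
g_unreliable = expected_free_energy([0.5, 0.5])
print(f"reliable: {g_reliable:.2f}, unreliable: {g_unreliable:.2f}")
```

With these made-up reliabilities, the policy built from reliable tools gets the lower \(G\) and would be preferred.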

4. Policy Selection

Policies are selected via precision-weighted softmax:

\[P(\pi_i) = \frac{\exp(-\beta \cdot G_i)}{\sum_j \exp(-\beta \cdot G_j)}\]

where inverse temperature:

\[\beta = \frac{1}{T \cdot (1 - \gamma + \epsilon)}\]

High \(\gamma\) → Low temperature → Deterministic (exploit best policy)
Low \(\gamma\) → High temperature → Stochastic (explore alternatives)

from lrs.core.free_energy import precision_weighted_selection

selected_idx = precision_weighted_selection(
    evaluations=[eval_1, eval_2, eval_3],
    precision=0.3  # Low precision → more exploration
)
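The effect of precision on the softmax can be seen with a small standalone calculation. This sketch follows the two equations above but is not `precision_weighted_selection`; the function name, the base temperature \(T = 1\), and \(\epsilon = 0.01\) are assumptions.

```python
import math


def policy_probabilities(G, precision, temperature=1.0, eps=1e-2):
    """Softmax over -beta * G with beta = 1 / (T * (1 - precision + eps))."""
    beta = 1.0 / (temperature * (1.0 - precision + eps))
    g_min = min(G)  # shift by the minimum for numerical stability
    weights = [math.exp(-beta * (g - g_min)) for g in G]
    total = sum(weights)
    return [w / total for w in weights]


G = [1.0, 1.5, 2.0]  # hypothetical expected free energies of three policies
confident = policy_probabilities(G, precision=0.9)
uncertain = policy_probabilities(G, precision=0.1)

print("high precision:", [round(p, 3) for p in confident])
print("low precision: ", [round(p, 3) for p in uncertain])
```

At high precision nearly all probability mass sits on the policy with the lowest \(G\); at low precision the mass is spread across the alternatives, so sampling from the distribution explores more.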

5. Hierarchical Inference

LRS-Agents implements hierarchical Active Inference with three levels:

Abstract: Long-term goals and strategies
Planning: Action sequences and policies
Execution: Individual tool executions

Each level has its own precision, updated based on prediction errors:

\[\begin{split}\delta_{\text{planning}} &= f(\delta_{\text{execution}}) \\ \delta_{\text{abstract}} &= f(\delta_{\text{planning}})\end{split}\]

where \(f\) applies threshold and attenuation:

\[\begin{split}f(\delta) = \begin{cases} 0 & \text{if } \delta < \theta \\ \alpha \cdot \delta & \text{if } \delta \geq \theta \end{cases}\end{split}\]

This prevents over-reaction to individual tool failures while allowing persistent errors to propagate upward.

from lrs.core.precision import HierarchicalPrecision

hp = HierarchicalPrecision(
    propagation_threshold=0.7,
    attenuation_factor=0.5
)

# High execution error
hp.update('execution', prediction_error=0.95)

# If above threshold, propagates to planning (attenuated)
# planning_error = 0.95 * 0.5 = 0.475
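The threshold-and-attenuation function \(f\) is simple enough to write out directly. The following is a sketch of the formula only, not of `HierarchicalPrecision`; the function name `propagate` is hypothetical, and the defaults reuse the threshold 0.7 and attenuation 0.5 from the example above.

```python
def propagate(delta: float, threshold: float = 0.7, attenuation: float = 0.5) -> float:
    """Return the error passed to the next level up: 0 below threshold, attenuated above."""
    return attenuation * delta if delta >= threshold else 0.0


execution_error = 0.95
planning_error = propagate(execution_error)  # above threshold, so attenuated
abstract_error = propagate(planning_error)   # below threshold, so blocked

print(planning_error, abstract_error)
```

A single large execution error reaches the planning level in attenuated form but falls below the threshold before reaching the abstract level, which is how isolated tool failures are kept from disturbing long-term goals.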

Active Inference vs Other Approaches

Comparison Table

Property	Reinforcement Learning	POMDP	Active Inference
Objective	Maximize reward	Maximize expected value	Minimize free energy
Exploration	Manual (ε-greedy, etc.)	Information value	Epistemic value
Uncertainty	Implicit	Belief state	Precision (explicit)
Adaptation	Fixed learning rate	Bayesian update	Precision-weighted update
Hierarchy	Options framework	Hierarchical POMDP	Hierarchical inference

Advantages of Active Inference

Principled Exploration: Exploration emerges from uncertainty reduction, not random sampling.
Unified Framework: Perception, learning, and action all minimize free energy.
Explicit Uncertainty: Precision tracking enables adaptive behavior.
Hierarchical Compositionality: Natural handling of multi-level planning.
Biological Plausibility: Matches neural mechanisms (predictive coding, precision-weighting).

Theoretical Foundations

Predictive Processing

Active Inference builds on Predictive Processing:

The brain constantly generates predictions
Predictions are compared to sensory input
Prediction errors update beliefs
Actions minimize future prediction errors

Top-down Predictions
       ↓
┌─────────────────┐
│  Sensory Input  │
└────────┬────────┘
         ↓
┌─────────────────┐
│ Prediction Error│ → Update Beliefs
└─────────────────┘

Precision-Weighting

Not all prediction errors are equally important. Precision \(\gamma\) acts as a gain control:

\[\Delta \text{Belief} \propto \gamma \cdot \text{Prediction Error}\]

High \(\gamma\) → Large belief updates (trust observations)
Low \(\gamma\) → Small belief updates (trust prior beliefs)

This is analogous to the role of attention in neuroscience:

High precision = Attend to observations
Low precision = Ignore observations (trust model)
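The gain-control relation can be illustrated with a scalar belief. This is a minimal sketch under assumptions: the belief and observation are single numbers, the proportionality constant is set to 1, and `update_belief` is a hypothetical helper rather than an LRS-Agents function.

```python
def update_belief(belief: float, observation: float, precision: float,
                  learning_rate: float = 1.0) -> float:
    """Move the belief toward the observation by precision * prediction error."""
    prediction_error = observation - belief
    return belief + learning_rate * precision * prediction_error


prior_belief = 0.2
observation = 1.0

attended = update_belief(prior_belief, observation, precision=0.9)  # large update
ignored = update_belief(prior_belief, observation, precision=0.1)   # small update

print(round(attended, 2), round(ignored, 2))
```

The same prediction error moves the belief most of the way to the observation when precision is high and barely moves it when precision is low.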

Bayesian Brain Hypothesis

Active Inference aligns with the Bayesian Brain Hypothesis:

The brain performs approximate Bayesian inference to maintain beliefs about the world.

LRS-Agents implements this through:

Prior beliefs (initial precision)
Likelihood (tool reliability statistics)
Posterior (updated precision after observations)

Markov Blanket

Active Inference respects the Markov Blanket - the boundary between agent and environment:

┌─────────────────────────────┐
│         AGENT               │
│  ┌──────────────────────┐   │
│  │   Internal States    │   │
│  │   (Belief State)     │   │
│  └──────────┬───────────┘   │
│             │               │
│  ┌──────────▼───────────┐   │
│  │   Sensory States     │   │ ← Observations
│  └──────────────────────┘   │
│  ┌──────────────────────┐   │
│  │   Active States      │   │ → Actions
│  └──────────────────────┘   │
└─────────────────────────────┘

The Markov Blanket ensures:

Internal states don’t directly access the world
All interactions mediated by sensory/active states

Real-World Applications

Robotics

Active Inference applied to robot control:

Precision-weighted motor control: More precise movements when confident
Exploration of environment: Robots actively seek information
Adaptive grasping: Adjust grip based on prediction errors

Example: A robot arm learning to grasp objects adjusts its grip force based on tactile prediction errors, with precision determining how much to update based on each touch.

Autonomous Vehicles

Self-driving behavior can be described in Active Inference terms:

Perception: Update beliefs about road conditions
Action: Steer/brake to minimize surprise
Precision: Higher in clear conditions, lower in fog

Clinical Applications

Understanding mental health through Active Inference:

Anxiety: Persistently low precision (over-sensitivity to prediction errors)
Psychosis: Precision imbalance (hallucinations as over-confident false beliefs)
Autism: Atypical precision-weighting in social contexts

AI Safety

Active Inference provides safety benefits:

Interpretability: Explicit beliefs and uncertainty
Graceful degradation: Adapts when uncertain
Conservative by default: Won’t act confidently without evidence

Limitations and Open Questions

Computational Complexity

Full Active Inference requires:

Sampling over all possible futures
Evaluating free energy for each scenario
Maintaining probability distributions

LRS-Agents addresses this through:

LLM-based proposal mechanisms (variational sampling)
Hierarchical decomposition
Caching and approximations

Model Misspecification

What if the generative model is wrong?

Agent’s beliefs may never converge
Persistent high prediction errors
Solution: Model selection, structural learning

In LRS-Agents:

Tool registry provides alternatives
Adaptation explores different models
Human-in-the-loop for model updates

Precision Learning

How to learn optimal precision parameters?

Current: Hand-tuned asymmetric learning rates
Future: Meta-learning from task distributions
Challenge: Balancing stability vs plasticity

Mathematical Details

Variational Message Passing

LRS-Agents implements approximate inference via variational message passing:

Forward pass: Propagate predictions down hierarchy
Backward pass: Propagate prediction errors up hierarchy
Update: Adjust beliefs to minimize free energy

\[Q(s) \leftarrow \arg\min_Q F[Q(s)]\]

Dynamic Causal Modeling

Tool execution can be seen as Dynamic Causal Modeling:

\[\begin{split}s_{t+1} &= f(s_t, a_t, \theta) + \omega_t \\ o_t &= g(s_t, \phi) + \nu_t\end{split}\]

where:

\(f\) = State transition (tool execution)
\(g\) = Observation function (tool output)
\(\theta, \phi\) = Parameters
\(\omega_t, \nu_t\) = Noise

Prediction errors:

\[\begin{split}\epsilon_s &= s_{t+1} - f(s_t, a_t, \theta) \\ \epsilon_o &= o_t - g(s_t, \phi)\end{split}\]
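A small simulation shows how these prediction errors behave when the model is right and when it is misspecified. Everything concrete here is an assumption made for illustration: the linear forms of \(f\) and \(g\), the parameter values, the Gaussian noise levels, and the simplification that the agent predicts from the true state rather than from an inferred one.

```python
import random


def transition(s, a, theta):
    """f(s_t, a_t, theta): hypothetical linear state transition."""
    return theta * s + a


def observe(s, phi):
    """g(s_t, phi): hypothetical linear observation function."""
    return phi * s


def run(model_theta, true_theta=0.9, phi=2.0, steps=20, seed=0):
    """Simulate the system and return (state errors, observation errors)."""
    rng = random.Random(seed)
    s, action = 1.0, 0.1
    state_errors, obs_errors = [], []
    for _ in range(steps):
        # The world generates o_t and s_{t+1} with noise nu_t and omega_t.
        o = observe(s, phi) + rng.gauss(0.0, 0.1)
        s_next = transition(s, action, true_theta) + rng.gauss(0.0, 0.05)
        # Prediction errors under the agent's model parameters.
        obs_errors.append(o - observe(s, phi))
        state_errors.append(s_next - transition(s, action, model_theta))
        s = s_next
    return state_errors, obs_errors


def mean_abs(values):
    """Mean absolute value of a list of errors."""
    return sum(abs(v) for v in values) / len(values)


correct_errors, _ = run(model_theta=0.9)  # model matches the world
wrong_errors, _ = run(model_theta=0.5)    # misspecified transition model

print(f"correct model: {mean_abs(correct_errors):.3f}")
print(f"wrong model:   {mean_abs(wrong_errors):.3f}")
```

With the correct parameters the state prediction error is just the noise term; with the wrong \(\theta\) a systematic error appears on every step.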

Generalized Free Energy

For continuous time and generalized coordinates:

\[F = \frac{1}{2} \epsilon^\top \Pi \epsilon\]

where:

\(\epsilon\) = Prediction error vector
\(\Pi = \text{diag}(\gamma)\) = Precision matrix
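With a diagonal precision matrix the quadratic form reduces to a precision-weighted sum of squared errors. The sketch below evaluates it for a two-component error vector; the numbers and the function name are made up for illustration.

```python
def generalized_free_energy(errors, precisions):
    """F = 1/2 * sum_i gamma_i * eps_i^2, the quadratic form for Pi = diag(gamma)."""
    if len(errors) != len(precisions):
        raise ValueError("errors and precisions must have the same length")
    return 0.5 * sum(g * e * e for e, g in zip(errors, precisions))


# Hypothetical prediction errors and their precisions.
F = generalized_free_energy(errors=[1.0, 2.0], precisions=[0.5, 0.25])
print(F)  # 0.5 * (0.5 * 1.0 + 0.25 * 4.0) = 0.75
```

The larger error contributes only as much as its lower precision allows, which is the same weighting idea used throughout this document.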

Next Steps

Read about Free Energy for detailed G calculation
Understand Precision Dynamics for adaptation mechanisms
See Core Concepts for practical implementation
Explore ../tutorials/02_understanding_precision for hands-on examples
