Inference API
The inference module provides components for LLM-based policy generation and evaluation.
Meta-Cognitive Prompting
Meta-cognitive prompting for LRS-Agents.
Generates precision-adaptive prompts that guide LLMs to produce diverse policy proposals appropriate to the agent’s epistemic state.
- class lrs.inference.prompts.StrategyMode(value)[source]
Bases: Enum
Strategic mode based on precision level.
- EXPLOITATION = 'exploit'
- EXPLORATION = 'explore'
- BALANCED = 'balanced'
- class lrs.inference.prompts.PromptContext(precision: float, recent_errors: List[float], available_tools: List[str], goal: str, state: Dict[str, Any], tool_history: List[Dict[str, Any]])[source]
Bases: object
Context for generating meta-cognitive prompts.
- class lrs.inference.prompts.MetaCognitivePrompter(high_precision_threshold: float = 0.7, low_precision_threshold: float = 0.4, high_error_threshold: float = 0.7)[source]
Bases: object
Generates precision-adaptive prompts for LLM policy generation.
The prompts adapt based on:
1. Precision level (confidence in world model)
2. Recent prediction errors (surprise events)
3. Available tools
4. Current goal
Examples
>>> from lrs.inference.prompts import MetaCognitivePrompter, PromptContext
>>>
>>> prompter = MetaCognitivePrompter()
>>>
>>> context = PromptContext(
...     precision=0.3,  # Low precision
...     recent_errors=[0.9, 0.85, 0.7],
...     available_tools=["api_fetch", "cache_fetch"],
...     goal="Fetch user data",
...     state={},
...     tool_history=[]
... )
>>>
>>> prompt = prompter.generate_prompt(context)
>>> print("EXPLORATION MODE" in prompt)
True
- __init__(high_precision_threshold: float = 0.7, low_precision_threshold: float = 0.4, high_error_threshold: float = 0.7)[source]
Initialize prompter.
- Parameters:
high_precision_threshold – Threshold for exploitation mode
low_precision_threshold – Threshold for exploration mode
high_error_threshold – Threshold for “high surprise”
- generate_prompt(context: PromptContext) → str[source]
Generate precision-adaptive prompt.
- Parameters:
context – Prompt context with precision, errors, tools, etc.
- Returns:
Complete prompt string for LLM
Examples
>>> prompt = prompter.generate_prompt(context)
>>> # Prompt includes precision value, strategy guidance, tool list
- lrs.inference.prompts.build_simple_prompt(goal: str, tools: List[str], precision: float, num_proposals: int = 5) → str[source]
Build a simple prompt without full context.
Convenience function for quick prompting.
- Parameters:
goal – Task goal
tools – Available tool names
precision – Current precision value
num_proposals – Number of proposals to generate
- Returns:
Prompt string
Examples
>>> from lrs.inference.prompts import build_simple_prompt
>>>
>>> prompt = build_simple_prompt(
...     goal="Fetch data",
...     tools=["api", "cache"],
...     precision=0.5
... )
Classes
- class lrs.inference.prompts.PromptContext(precision: float, recent_errors: List[float], available_tools: List[str], goal: str, state: Dict[str, Any], tool_history: List[Dict[str, Any]])[source]
Bases: object
Context for generating meta-cognitive prompts.
Attributes:
precision (float): Current precision value
recent_errors (List[float]): Recent prediction errors
available_tools (List[str]): Tools the agent can use
goal (str): Current task goal
state (dict): Current belief state
tool_history (List[dict]): Execution history
- class lrs.inference.prompts.StrategyMode(value)[source]
Bases: Enum
Strategic mode based on precision level.
Policy generation strategy based on precision.
EXPLOITATION: High precision → Prioritize reward
EXPLORATION: Low precision → Prioritize information gain
BALANCED: Medium precision → Balance both
- EXPLOITATION = 'exploit'
- EXPLORATION = 'explore'
- BALANCED = 'balanced'
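The mapping from precision to strategy mode described above can be sketched as a pair of threshold comparisons. This is only an illustration, not the library's implementation: `select_mode` is a hypothetical helper, the enum is redefined locally so the block runs on its own, the default thresholds are copied from the documented `MetaCognitivePrompter` signature, and whether each boundary is inclusive is an assumption.

```python
from enum import Enum


class StrategyMode(Enum):
    """Local stand-in mirroring the documented enum values."""
    EXPLOITATION = "exploit"
    EXPLORATION = "explore"
    BALANCED = "balanced"


def select_mode(
    precision: float,
    high_precision_threshold: float = 0.7,
    low_precision_threshold: float = 0.4,
) -> StrategyMode:
    """Hypothetical helper: map a precision value to a strategy mode.

    Assumes precision at or above the high threshold means exploitation
    and precision strictly below the low threshold means exploration.
    """
    if precision >= high_precision_threshold:
        return StrategyMode.EXPLOITATION
    if precision < low_precision_threshold:
        return StrategyMode.EXPLORATION
    return StrategyMode.BALANCED


if __name__ == "__main__":
    # Show the selected mode for a low, medium, and high precision value.
    for gamma in (0.3, 0.5, 0.9):
        print(gamma, select_mode(gamma).value)
```

Keeping the thresholds as parameters mirrors the documented constructor, so a caller could widen or narrow the balanced band without touching the selection logic.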
- class lrs.inference.prompts.MetaCognitivePrompter(high_precision_threshold: float = 0.7, low_precision_threshold: float = 0.4, high_error_threshold: float = 0.7)[source]
Bases: object
Generates precision-adaptive prompts for LLM policy generation.
Methods:
- generate_prompt(context: PromptContext) → str[source]
Generate precision-adaptive prompt.
- Parameters:
context – Prompt context with precision, errors, tools, etc.
- Returns:
Complete prompt string for LLM
Examples
>>> prompt = prompter.generate_prompt(context)
>>> # Prompt includes precision value, strategy guidance, tool list
Example:
    from lrs.inference.prompts import MetaCognitivePrompter, PromptContext

    prompter = MetaCognitivePrompter()
    context = PromptContext(
        precision=0.3,  # Low precision
        recent_errors=[0.8, 0.9],
        available_tools=['api', 'cache', 'db'],
        goal='Fetch user data',
        state={},
        tool_history=[]
    )
    prompt = prompter.generate_prompt(context)
    # Generates exploration-focused prompt
- __init__(high_precision_threshold: float = 0.7, low_precision_threshold: float = 0.4, high_error_threshold: float = 0.7)[source]
Initialize prompter.
- Parameters:
high_precision_threshold – Threshold for exploitation mode
low_precision_threshold – Threshold for exploration mode
high_error_threshold – Threshold for “high surprise”
LLM Policy Generator
LLM-based policy generation for Active Inference.
- class lrs.inference.llm_policy_generator.PolicyProposal(*, tool_sequence: List[str], reasoning: str, estimated_success_prob: Annotated[float, Ge(ge=0.0), Le(le=1.0)], estimated_info_gain: Annotated[float, Ge(ge=0.0), Le(le=1.0)], strategy: str, failure_modes: List[str] = <factory>)[source]
Bases: BaseModel
A single policy proposal with metadata.
- model_config: ClassVar[ConfigDict] = {}
Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].
- class lrs.inference.llm_policy_generator.PolicyProposalSet(*, proposals: List[PolicyProposal], current_uncertainty: Annotated[float, Ge(ge=0.0), Le(le=1.0)], known_unknowns: List[str] = <factory>)[source]
Bases: BaseModel
Complete set of policy proposals with metadata.
- proposals: List[PolicyProposal]
- model_config: ClassVar[ConfigDict] = {}
Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].
- class lrs.inference.llm_policy_generator.LLMPolicyGenerator(llm: BaseChatModel, registry: ToolRegistry, prompter: MetaCognitivePrompter | None = None)[source]
Bases: object
Generates policy proposals using an LLM with Active Inference principles.
The generator uses meta-cognitive prompting to produce diverse policies that balance exploration and exploitation based on precision parameters.
- __init__(llm: BaseChatModel, registry: ToolRegistry, prompter: MetaCognitivePrompter | None = None)[source]
Initialize the policy generator.
- Parameters:
llm – Language model for generating proposals
registry – Tool registry for available actions
prompter – Optional custom prompter (creates default if None)
- generate_proposals(state: Dict[str, Any] | None = None, precision: PrecisionParameters | None = None, num_proposals: int = 3) → List[Dict[str, Any]][source]
Generate policy proposals based on current context and precision.
- Parameters:
state – Current state, goal, and history (deprecated, use context instead)
context – Current state, goal, and history
precision – Precision parameters guiding exploration/exploitation
num_proposals – Number of proposals to generate
- Returns:
List of policy dictionaries with tools and metadata
- lrs.inference.llm_policy_generator.create_mock_generator(num_proposals: int = 3) → LLMPolicyGenerator[source]
Create a mock policy generator for testing.
- Parameters:
num_proposals – Number of proposals the mock should generate
- Returns:
Generator that produces simple test proposals.
Classes
- class lrs.inference.llm_policy_generator.PolicyProposal(*, tool_sequence: List[str], reasoning: str, estimated_success_prob: Annotated[float, Ge(ge=0.0), Le(le=1.0)], estimated_info_gain: Annotated[float, Ge(ge=0.0), Le(le=1.0)], strategy: str, failure_modes: List[str] = <factory>)[source]
Bases: BaseModel
A single policy proposal with metadata.
Single policy proposal from LLM.
Attributes:
tool_sequence (List[str]): Tool names in sequence
reasoning (str): Explanation of policy
estimated_success_prob (float): LLM’s self-assessed success probability, in [0, 1]
estimated_info_gain (float): Expected epistemic value, in [0, 1]
strategy (str): “exploit”, “explore”, or “balanced”
failure_modes (List[str]): Potential failure scenarios
- class lrs.inference.llm_policy_generator.PolicyProposalSet(*, proposals: List[PolicyProposal], current_uncertainty: Annotated[float, Ge(ge=0.0), Le(le=1.0)], known_unknowns: List[str] = <factory>)[source]
Bases: BaseModel
Complete set of policy proposals with metadata.
Set of 3-7 policy proposals from LLM.
Attributes:
proposals (List[PolicyProposal]): Individual proposals
current_uncertainty (float): LLM’s uncertainty estimate, in [0, 1]
known_unknowns (List[str]): What the LLM doesn’t know
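To make the field constraints concrete, here is a minimal sketch of parsing an LLM's JSON reply into a validated proposal set. It uses only the standard library so it runs without pydantic; the `ProposalSketch` and `ProposalSetSketch` dataclasses and the `parse_proposal_set` helper are hypothetical stand-ins that mirror the field names and [0, 1] bounds in the documented signatures, and the JSON payload is invented for illustration.

```python
import json
from dataclasses import dataclass, field
from typing import List


def _check_unit_interval(name: str, value: float) -> None:
    """Raise ValueError unless value lies in the closed interval [0, 1]."""
    if not 0.0 <= value <= 1.0:
        raise ValueError(f"{name} must be in [0, 1], got {value}")


@dataclass
class ProposalSketch:
    """Stand-in for PolicyProposal (same field names, manual validation)."""
    tool_sequence: List[str]
    reasoning: str
    estimated_success_prob: float
    estimated_info_gain: float
    strategy: str
    failure_modes: List[str] = field(default_factory=list)

    def __post_init__(self) -> None:
        _check_unit_interval("estimated_success_prob", self.estimated_success_prob)
        _check_unit_interval("estimated_info_gain", self.estimated_info_gain)


@dataclass
class ProposalSetSketch:
    """Stand-in for PolicyProposalSet."""
    proposals: List[ProposalSketch]
    current_uncertainty: float
    known_unknowns: List[str] = field(default_factory=list)

    def __post_init__(self) -> None:
        _check_unit_interval("current_uncertainty", self.current_uncertainty)


def parse_proposal_set(raw_json: str) -> ProposalSetSketch:
    """Parse a JSON string into a validated proposal set."""
    data = json.loads(raw_json)
    proposals = [ProposalSketch(**item) for item in data["proposals"]]
    return ProposalSetSketch(
        proposals=proposals,
        current_uncertainty=data["current_uncertainty"],
        known_unknowns=data.get("known_unknowns", []),
    )


# Invented payload shaped like the documented fields.
EXAMPLE_JSON = """
{
  "proposals": [
    {"tool_sequence": ["cache_fetch"],
     "reasoning": "The cache is cheap to try first.",
     "estimated_success_prob": 0.6,
     "estimated_info_gain": 0.3,
     "strategy": "exploit"},
    {"tool_sequence": ["api_fetch", "cache_fetch"],
     "reasoning": "Calling the API reveals whether the cache is stale.",
     "estimated_success_prob": 0.4,
     "estimated_info_gain": 0.8,
     "strategy": "explore",
     "failure_modes": ["API timeout"]}
  ],
  "current_uncertainty": 0.7,
  "known_unknowns": ["Whether the cache is warm"]
}
"""

if __name__ == "__main__":
    parsed = parse_proposal_set(EXAMPLE_JSON)
    for proposal in parsed.proposals:
        print(proposal.strategy, proposal.tool_sequence)
```

In the real models the range checks are expressed declaratively through the `Ge` and `Le` annotations rather than in `__post_init__`.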
- class lrs.inference.llm_policy_generator.LLMPolicyGenerator(llm: BaseChatModel, registry: ToolRegistry, prompter: MetaCognitivePrompter | None = None)[source]
Bases: object
Generates policy proposals using an LLM with Active Inference principles.
The generator uses meta-cognitive prompting to produce diverse policies that balance exploration and exploitation based on precision parameters.
LLM-based variational proposal mechanism.
Methods:
- generate_proposals(state: Dict[str, Any] | None = None, precision: PrecisionParameters | None = None, num_proposals: int = 3) → List[Dict[str, Any]][source]
Generate policy proposals based on current context and precision.
- Parameters:
state – Current state, goal, and history (deprecated, use context instead)
context – Current state, goal, and history
precision – Precision parameters guiding exploration/exploitation
num_proposals – Number of proposals to generate
- Returns:
List of policy dictionaries with tools and metadata
Temperature Adaptation:
Temperature scales inversely with precision \(\gamma\):
\[T = T_{base} \times \frac{1}{\gamma + 0.1}\]
Low precision → High temperature → Diverse proposals
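As a worked example of the temperature formula, the sketch below evaluates it for a few precision values. `adapted_temperature` is a hypothetical helper and the base temperature of 0.7 is an assumed value, since the source does not state what the library uses.

```python
def adapted_temperature(precision: float, base_temperature: float = 0.7) -> float:
    """Compute T = T_base / (precision + 0.1).

    base_temperature=0.7 is an assumption for illustration only.
    """
    if not 0.0 <= precision <= 1.0:
        raise ValueError(f"precision must be in [0, 1], got {precision}")
    return base_temperature / (precision + 0.1)


if __name__ == "__main__":
    # Lower precision yields a higher sampling temperature.
    for gamma in (0.1, 0.5, 0.9):
        print(f"precision={gamma:.1f} -> T={adapted_temperature(gamma):.2f}")
```

With these assumptions the temperature rises from about 0.7 at a precision of 0.9 to about 3.5 at a precision of 0.1. Many chat model APIs cap the temperature they accept, so a caller may need to clamp the result before passing it on.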
Example:
    from lrs.inference.llm_policy_generator import LLMPolicyGenerator
    from langchain_anthropic import ChatAnthropic

    llm = ChatAnthropic(model="claude-sonnet-4-20250514")
    # `registry` is assumed to be an existing ToolRegistry instance
    generator = LLMPolicyGenerator(llm, registry)

    proposals = generator.generate_proposals(
        state={'goal': 'Fetch data'},
        precision=0.5,
        num_proposals=5
    )

    for proposal in proposals:
        print(f"{proposal['strategy']}: {proposal['tool_names']}")
- __init__(llm: BaseChatModel, registry: ToolRegistry, prompter: MetaCognitivePrompter | None = None)[source]
Initialize the policy generator.
- Parameters:
llm – Language model for generating proposals
registry – Tool registry for available actions
prompter – Optional custom prompter (creates default if None)
Functions
- lrs.inference.llm_policy_generator.create_mock_generator(num_proposals: int = 3) → LLMPolicyGenerator[source]
Create a mock policy generator for testing.
- Parameters:
num_proposals – Number of proposals the mock should generate
- Returns:
Generator that produces simple test proposals.
Create mock generator for testing (doesn’t require API key).
Hybrid Evaluator
Hybrid Expected Free Energy evaluation.
- class lrs.inference.evaluator.HybridGEvaluator(lambda_fn: callable | None = None, epistemic_weight: float = 1.0)[source]
Bases: object
Evaluate policies using both LLM priors and mathematical statistics.
G_hybrid = (1 - λ) * G_math + λ * G_llm
Where:
- G_math: Calculated from historical execution statistics
- G_llm: Derived from LLM’s self-assessed success prob and info gain
- λ: Interpolation factor (adaptive based on precision)
Intuition:
- Low precision → trust LLM more (world model unreliable, use semantics)
- High precision → trust math more (world model accurate, use statistics)
Examples
>>> evaluator = HybridGEvaluator()
>>>
>>> # LLM proposal with self-assessment
>>> proposal = {
...     'policy': [tool_a, tool_b],
...     'llm_success_prob': 0.7,
...     'llm_info_gain': 0.4
... }
>>>
>>> # Evaluate with hybrid approach
>>> G = evaluator.evaluate_hybrid(
...     proposal, state, preferences, precision=0.5
... )
- __init__(lambda_fn: callable | None = None, epistemic_weight: float = 1.0)[source]
Initialize hybrid evaluator.
- Parameters:
lambda_fn – Function mapping precision → interpolation weight. Default: λ = 1 - precision (trust LLM when uncertain)
epistemic_weight – Weight for epistemic value in G calculation
- evaluate_hybrid(proposal: Dict[str, Any], state: Dict[str, Any], preferences: Dict[str, float], precision: float, historical_stats: Dict[str, Dict] | None = None) → float[source]
Evaluate policy using hybrid approach.
- Parameters:
proposal – Policy proposal with ‘policy’, ‘llm_success_prob’, ‘llm_info_gain’
state – Current agent state
preferences – Reward function
precision – Current precision value
historical_stats – Optional execution history
- Returns:
Hybrid G value
Examples
>>> G = evaluator.evaluate_hybrid(proposal, state, preferences, precision=0.3)
>>> # Low precision → G weighted toward LLM's assessment
- evaluate_all(proposals: List[Dict[str, Any]], state: Dict[str, Any], preferences: Dict[str, float], precision: float, historical_stats: Dict[str, Dict] | None = None) → List[PolicyEvaluation][source]
Evaluate multiple proposals.
- Parameters:
proposals – List of policy proposals
state – Current state
preferences – Reward function
precision – Current precision
historical_stats – Execution history
- Returns:
List of PolicyEvaluation objects
- lrs.inference.evaluator.compare_math_vs_llm(proposal: Dict[str, Any], state: Dict[str, Any], preferences: Dict[str, float], historical_stats: Dict[str, Dict] | None = None) → Dict[str, float][source]
Compare mathematical vs LLM-based G calculation.
Useful for debugging and understanding how the hybrid evaluator works.
- Parameters:
proposal – Policy proposal with LLM assessments
state – Current state
preferences – Reward function
historical_stats – Execution history
- Returns:
Dict with ‘G_math’, ‘G_llm’, and ‘difference’
Examples
>>> comparison = compare_math_vs_llm(proposal, state, preferences)
>>> print(f"Math G: {comparison['G_math']:.2f}")
>>> print(f"LLM G: {comparison['G_llm']:.2f}")
>>> print(f"Difference: {comparison['difference']:.2f}")
Classes
- class lrs.inference.evaluator.HybridGEvaluator(lambda_fn: callable | None = None, epistemic_weight: float = 1.0)[source]
Bases: object
Evaluate policies using both LLM priors and mathematical statistics.
Hybrid evaluator combining mathematical G with LLM self-assessment.
Formula:
\[G_{hybrid} = (1 - \lambda) \cdot G_{math} + \lambda \cdot G_{llm}\]
where \(\lambda = 1 - \gamma\) and \(\gamma\) is the precision (low precision → trust LLM more)
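The interpolation can be checked with a few lines of arithmetic. The sketch below is not the evaluator's code: `hybrid_g` and `default_lambda` are hypothetical names, and the two G inputs are made-up numbers chosen so the result is easy to verify by hand.

```python
from typing import Callable


def default_lambda(precision: float) -> float:
    """Documented default: lambda = 1 - precision."""
    return 1.0 - precision


def hybrid_g(
    g_math: float,
    g_llm: float,
    precision: float,
    lambda_fn: Callable[[float], float] = default_lambda,
) -> float:
    """Compute G_hybrid = (1 - lambda) * G_math + lambda * G_llm."""
    lam = lambda_fn(precision)
    return (1.0 - lam) * g_math + lam * g_llm


if __name__ == "__main__":
    # Made-up inputs: statistics say -2.0, the LLM self-assessment says -1.0.
    for gamma in (0.0, 0.25, 1.0):
        print(f"precision={gamma} -> G_hybrid={hybrid_g(-2.0, -1.0, gamma)}")
```

At a precision of 0.25 the weight on the LLM term is 0.75, giving 0.25 × (−2.0) + 0.75 × (−1.0) = −1.25. At a precision of 1.0 the result collapses to the statistical value, and at 0.0 to the LLM value.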
Methods:
- evaluate_hybrid(proposal: Dict[str, Any], state: Dict[str, Any], preferences: Dict[str, float], precision: float, historical_stats: Dict[str, Dict] | None = None) → float[source]
Evaluate policy using hybrid approach.
- Parameters:
proposal – Policy proposal with ‘policy’, ‘llm_success_prob’, ‘llm_info_gain’
state – Current agent state
preferences – Reward function
precision – Current precision value
historical_stats – Optional execution history
- Returns:
Hybrid G value
Examples
>>> G = evaluator.evaluate_hybrid(proposal, state, preferences, precision=0.3)
>>> # Low precision → G weighted toward LLM's assessment
- evaluate_all(proposals: List[Dict[str, Any]], state: Dict[str, Any], preferences: Dict[str, float], precision: float, historical_stats: Dict[str, Dict] | None = None) → List[PolicyEvaluation][source]
Evaluate multiple proposals.
- Parameters:
proposals – List of policy proposals
state – Current state
preferences – Reward function
precision – Current precision
historical_stats – Execution history
- Returns:
List of PolicyEvaluation objects
Example:
    from lrs.inference.evaluator import HybridGEvaluator

    evaluator = HybridGEvaluator()

    # Evaluate single proposal
    # (`proposal_dict` and `registry` are assumed to exist already)
    eval_result = evaluator.evaluate_hybrid(
        proposal=proposal_dict,
        state={},
        preferences={'success': 5.0},
        precision=0.5,
        historical_stats=registry.statistics
    )

    # The signature above documents a float return value; the attribute
    # access below assumes a result object exposing total_G and components.
    print(f"G_hybrid: {eval_result.total_G}")
    print(f"G_math: {eval_result.components['G_math']}")
    print(f"G_llm: {eval_result.components['G_llm']}")
    print(f"λ: {eval_result.components['lambda']}")
- __init__(lambda_fn: callable | None = None, epistemic_weight: float = 1.0)[source]
Initialize hybrid evaluator.
- Parameters:
lambda_fn – Function mapping precision → interpolation weight. Default: λ = 1 - precision (trust LLM when uncertain)
epistemic_weight – Weight for epistemic value in G calculation
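Because `lambda_fn` is just a callable from precision to an interpolation weight, a caller can swap in a different schedule. The sketch below shows one possibility, a quadratic schedule that trusts the LLM less than the linear default at every intermediate precision. `quadratic_lambda` is a hypothetical function, and the clipping to [0, 1] is a defensive assumption rather than something the source requires.

```python
def quadratic_lambda(precision: float) -> float:
    """Hypothetical schedule: lambda = (1 - precision) ** 2.

    Precision is clipped to [0, 1] first so the weight stays in [0, 1].
    """
    clipped = min(max(precision, 0.0), 1.0)
    return (1.0 - clipped) ** 2


if __name__ == "__main__":
    # Compare against the documented linear default, lambda = 1 - precision.
    for gamma in (0.0, 0.5, 1.0):
        print(f"precision={gamma}: linear={1.0 - gamma}, quadratic={quadratic_lambda(gamma)}")

    # Per the documented constructor, the function would be passed as:
    # evaluator = HybridGEvaluator(lambda_fn=quadratic_lambda)
```

At a precision of 0.5 the linear default gives a weight of 0.5 while this schedule gives 0.25, so the statistical estimate dominates sooner as precision rises.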
Functions
- lrs.inference.evaluator.compare_math_vs_llm(proposal: Dict[str, Any], state: Dict[str, Any], preferences: Dict[str, float], historical_stats: Dict[str, Dict] | None = None) → Dict[str, float][source]
Compare mathematical vs LLM-based G calculation.
Useful for debugging and understanding how the hybrid evaluator works.
- Parameters:
proposal – Policy proposal with LLM assessments
state – Current state
preferences – Reward function
historical_stats – Execution history
- Returns:
Dict with ‘G_math’, ‘G_llm’, and ‘difference’
Examples
>>> comparison = compare_math_vs_llm(proposal, state, preferences)
>>> print(f"Math G: {comparison['G_math']:.2f}")
>>> print(f"LLM G: {comparison['G_llm']:.2f}")
>>> print(f"Difference: {comparison['difference']:.2f}")
Debug utility to compare mathematical vs LLM G values.