Source Code

The runnable TypeScript source for this lesson is in lessons/25-production-agent/

Lesson 25: Production Agent (Capstone)¶

The capstone lesson - a real, usable coding agent that integrates ALL previous lessons into one powerful tool.

Table of Contents¶

Overview
Quick Start
Architecture
Configuration
Features by Lesson
Providers
Tools
REPL Commands
MCP Integration
Programmatic Usage
Events
Agent Modes
Built-in Roles
Tricks Integration
Testing
Extending
CLI Options
Dependencies

1. Overview¶

What This Agent Does¶

A production-ready coding assistant that combines 24 lessons of AI agent patterns into a cohesive system. It can:

Read, write, and modify code files
Execute shell commands with sandboxing
Search the web and fetch URLs
Remember previous interactions
Plan complex tasks automatically
Use explicit reasoning (ReAct pattern)
Coordinate multiple specialized agents
Manage conversation threads and checkpoints

Key Differentiators¶

Feature	Simple Agent	Production Agent
Memory	None	Episodic + Semantic + Working
Planning	None	Auto-plan for complex tasks
Safety	Basic	Sandboxing + Human-in-loop + Policies
Context	Fixed	Compaction + Sliding window
Tools	Static	Lazy-loading + MCP integration
Debugging	None	Observability + Thread management

Feature Flags Summary¶

All features are configurable. By default:

Feature	Default	Purpose
`hooks`	Enabled	Lifecycle events
`plugins`	Enabled	Extensibility
`rules`	Enabled	Dynamic instructions
`memory`	Enabled	Conversation memory
`planning`	Enabled	Task decomposition
`reflection`	Enabled	Self-critique
`observability`	Enabled	Tracing/metrics
`routing`	Disabled	Multi-model routing
`sandbox`	Enabled	Command safety
`humanInLoop`	Enabled	Approval workflows
`multiAgent`	Disabled	Team coordination
`react`	Disabled	Explicit reasoning
`executionPolicy`	Enabled	Tool access control
`threads`	Enabled	Checkpoints/rollback
`cancellation`	Enabled	Graceful interruption
`resources`	Enabled	Memory/CPU limits
`lsp`	Disabled	Code intelligence
`semanticCache`	Disabled	Response caching
`skills`	Enabled	Skill discovery

2. Quick Start¶

Prerequisites¶

Node.js 18+ or Bun
An LLM API key (OpenRouter, Anthropic, or OpenAI)

Installation¶

# Clone and install
git clone <repo>
cd first-principles-agent
npm install

# Or with bun
bun install

Environment Setup¶

# Option 1: OpenRouter (recommended - supports multiple models)
export OPENROUTER_API_KEY=your-key-here

# Option 2: Anthropic
export ANTHROPIC_API_KEY=your-key-here

# Option 3: OpenAI
export OPENAI_API_KEY=your-key-here

Running the Agent¶

# Interactive mode
npm run lesson:25

# With specific model
npx tsx 25-production-agent/main.ts -m anthropic/claude-sonnet-4

# Single task
npx tsx 25-production-agent/main.ts "List all TypeScript files in src/"

# With strict permissions
npx tsx 25-production-agent/main.ts -p strict

3. Architecture¶

Component Diagram¶

+-----------------------------------------------------------------------+
|                         ProductionAgent                                  |
+-----------------------------------------------------------------------+
|  +-------------+  +-------------+  +-------------+  +-------------+    |
|  |   Memory    |  |  Planning   |  |  Reflection |  |    Rules    |    |
|  |  (L14)      |  |   (L15)     |  |    (L16)    |  |    (L12)    |    |
|  +-------------+  +-------------+  +-------------+  +-------------+    |
|  +-------------+  +-------------+  +-------------+  +-------------+    |
|  | Multi-Agent |  |    ReAct    |  |  Policies   |  |   Threads   |    |
|  |   (L17)     |  |    (L18)    |  |    (L23)    |  |    (L24)    |    |
|  +-------------+  +-------------+  +-------------+  +-------------+    |
+-----------------------------------------------------------------------+
|  +---------------------------------------------------------------+    |
|  |                    Integration Layer                            |    |
|  |  +---------+ +---------+ +---------+ +---------+ +---------+  |    |
|  |  |Economics| | Session | | Skills  | |  Ignore | |   LSP   |  |    |
|  |  +---------+ +---------+ +---------+ +---------+ +---------+  |    |
|  |  +---------+ +---------+ +---------+ +---------+ +---------+  |    |
|  |  | MCP     | |Compaction| | Modes  | | Agents  | |PTY Shell|  |    |
|  |  +---------+ +---------+ +---------+ +---------+ +---------+  |    |
|  +---------------------------------------------------------------+    |
+-----------------------------------------------------------------------+
|  +-------------+  +-------------+  +-------------+  +-------------+    |
|  |  Sandbox    |  | Human Loop  |  |Observability|  |  Routing    |    |
|  |   (L20)     |  |    (L21)    |  |    (L19)    |  |    (L22)    |    |
|  +-------------+  +-------------+  +-------------+  +-------------+    |
+-----------------------------------------------------------------------+
|  +---------------------------------------------------------------+    |
|  |                    Provider Layer                               |    |
|  |  +----------+ +----------+ +----------+ +----------+           |    |
|  |  |OpenRouter| |Anthropic | |  OpenAI  | |   Mock   |           |    |
|  |  +----------+ +----------+ +----------+ +----------+           |    |
|  +---------------------------------------------------------------+    |
+-----------------------------------------------------------------------+

File Structure¶

25-production-agent/
├── main.ts              # Interactive REPL + CLI entry point
├── agent.ts             # ProductionAgent class with builder pattern
├── types.ts             # Type definitions for all features
├── defaults.ts          # Default configurations for all features
├── modes.ts             # Agent modes (build/plan/review/debug)
├── providers.ts         # LLM provider implementations
├── tools.ts             # Built-in tool definitions
└── integrations/
    ├── hooks.ts         # Hook system (L10)
    ├── plugins.ts       # Plugin system (L11)
    ├── rules.ts         # Rules/instructions (L12)
    ├── memory.ts        # Episodic + semantic + working (L14)
    ├── planning.ts      # Task decomposition (L15)
    ├── reflection.ts    # Self-critique (L16)
    ├── multi-agent.ts   # Team coordination (L17)
    ├── react.ts         # ReAct pattern (L18)
    ├── observability.ts # Tracing + metrics (L19)
    ├── execution-policy.ts  # Tool access control (L23)
    ├── thread-manager.ts    # Checkpoints + rollback (L24)
    ├── economics.ts     # Token budgets + progress detection
    ├── session-store.ts # JSONL persistence
    ├── skills.ts        # Skill discovery
    ├── ignore.ts        # .agentignore support
    ├── mcp-tool-search.ts   # Lazy MCP tool loading
    ├── agent-registry.ts    # Subagent spawning
    ├── compaction.ts    # Context summarization
    ├── hierarchical-config.ts # Config layering
    ├── lsp.ts           # Language server integration
    ├── semantic-cache.ts    # Response caching
    ├── cancellation.ts  # Graceful interruption
    ├── resources.ts     # Resource monitoring
    ├── pty-shell.ts     # Persistent shell
    └── sandbox/
        ├── index.ts     # Sandbox factory
        ├── seatbelt.ts  # macOS sandbox
        ├── landlock.ts  # Linux sandbox
        ├── docker.ts    # Container sandbox
        └── basic.ts     # Fallback sandbox

4. Configuration¶

Full Configuration Interface¶

interface ProductionAgentConfig {
  // Required
  provider: LLMProvider;
  tools: ToolDefinition[];

  // Optional - all have sensible defaults
  systemPrompt?: string;
  model?: string;

  // Feature configs (set to false to disable)
  hooks?: HooksConfig | false;
  plugins?: PluginsConfig | false;
  rules?: RulesConfig | false;
  memory?: MemoryConfig | false;
  planning?: PlanningConfig | false;
  reflection?: ReflectionConfig | false;
  observability?: ObservabilityConfig | false;
  routing?: RoutingConfig | false;
  sandbox?: SandboxConfig | false;
  humanInLoop?: HumanInLoopConfig | false;
  multiAgent?: MultiAgentConfig | false;
  react?: ReActPatternConfig | false;
  executionPolicy?: ExecutionPolicyConfig | false;
  threads?: ThreadsConfig | false;
  cancellation?: CancellationConfig | false;
  resources?: ResourceConfig | false;
  lsp?: LSPAgentConfig | false;
  semanticCache?: SemanticCacheAgentConfig | false;
  skills?: SkillsAgentConfig | false;

  // Limits
  maxIterations?: number;  // Default: 50
  timeout?: number;        // Default: 300000 (5 min)

  // MCP lazy loading
  toolResolver?: (toolName: string) => ToolDefinition | null;
  mcpToolSummaries?: Array<{ name: string; description: string }>;
}

5. Features by Lesson¶

Lesson	Feature	Integration File	Purpose
1-9	Core Agent	`agent.ts`	Basic agent loop
10	Hooks	`hooks.ts`	Lifecycle events
11	Plugins	`plugins.ts`	Extensibility
12	Rules	`rules.ts`	Dynamic instructions
13	Client/Server	`session-store.ts`	JSONL persistence
14	Memory	`memory.ts`	Episodic + semantic
15	Planning	`planning.ts`	Task decomposition
16	Reflection	`reflection.ts`	Self-critique
17	Multi-Agent	`multi-agent.ts`, `agent-registry.ts`	Team coordination
18	ReAct	`react.ts`	Explicit reasoning
19	Observability	`observability.ts`, `economics.ts`	Tracing + budgets
20	Sandboxing	`sandbox/*.ts`, `pty-shell.ts`	Secure execution
21	Human-in-Loop	`safety.ts`	Approval workflows
22	Routing	`routing.ts`	Multi-model
23	Policies	`execution-policy.ts`	Access control
24	Threads	`thread-manager.ts`	Checkpoints
Tricks	Various	See section 14	Utilities

6. Providers¶

OpenRouter (Primary)¶

import { createOpenRouterProvider } from './providers.js';

const provider = createOpenRouterProvider({
  apiKey: process.env.OPENROUTER_API_KEY,
  defaultModel: 'anthropic/claude-sonnet-4',
});

Anthropic¶

import { createAnthropicProvider } from './providers.js';

const provider = createAnthropicProvider({
  apiKey: process.env.ANTHROPIC_API_KEY,
});

OpenAI¶

import { createOpenAIProvider } from './providers.js';

const provider = createOpenAIProvider({
  apiKey: process.env.OPENAI_API_KEY,
});

Mock (for Testing)¶

import { createMockProvider } from './providers.js';

const provider = createMockProvider({
  responses: [
    { content: 'Hello!', toolCalls: [] },
  ],
});

7. Tools¶

Standard Tools¶

Tool	Description	Danger Level
`read_file`	Read file contents	safe
`write_file`	Write/create files	dangerous
`edit_file`	Edit existing files	dangerous
`list_directory`	List directory contents	safe
`search_files`	Search files by pattern	safe
`bash`	Execute shell commands	dangerous
`web_search`	Search the web	moderate
`fetch_url`	Fetch URL contents	moderate

MCP Meta-Tools¶

When MCP is enabled with lazy loading:

Tool	Description
`mcp_tool_search`	Search and load MCP tools
`mcp_tool_list`	List available MCP tools
`mcp_context_stats`	Show token usage stats

8. REPL Commands¶

Session Management¶

Command	Description
`/new`	Start new session
`/save [name]`	Save current session
`/load <id>`	Load saved session
`/sessions`	List all sessions
`/history`	Show message history
`/clear`	Clear current context

Agent Control¶

Command	Description
`/mode [mode]`	Switch mode (build/plan/review/debug)
`/cancel`	Cancel current operation
`/checkpoint [label]`	Create checkpoint
`/rollback [steps]`	Rollback messages
`/fork [name]`	Fork conversation

Information¶

Command	Description
`/status`	Show session stats
`/memory`	Show memory contents
`/tools`	List available tools
`/config`	Show current config
`/help`	Show help

Skills¶

Command	Description
`/skills`	List available skills
`/skill <name>`	Activate a skill

Planning¶

Command	Description
`/plan <goal>`	Create explicit plan
`/plan.status`	Show plan progress

Debugging¶

Command	Description
`/trace`	Show trace for last operation
`/debug`	Toggle debug mode
`/verbose`	Toggle verbose output

9. MCP Integration¶

Lazy Loading¶

MCP tools are loaded on-demand to save context:

Without lazy loading: ~15,000 tokens for 50 tools
With lazy loading:    ~2,500 tokens (summaries only)
Savings:              ~83%

The agent uses meta-tools to discover what it needs: 1. mcp_tool_list - see available tools 2. mcp_tool_search - find and load specific tools 3. Full schemas loaded only when needed

10. Programmatic Usage¶

Basic Usage¶

import { buildAgent } from './agent.js';
import { createOpenRouterProvider } from './providers.js';
import { createStandardTools } from './tools.js';

const agent = buildAgent()
  .provider(createOpenRouterProvider())
  .tools(createStandardTools())
  .build();

const result = await agent.run('List all TypeScript files');
console.log(result.response);

Builder Pattern¶

const agent = buildAgent()
  .provider(myProvider)
  .tools(myTools)
  // Core features
  .memory({ enabled: true })
  .planning({ enabled: true, autoplan: true })
  .observability({ enabled: true })
  // Multi-agent (Lesson 17)
  .multiAgent({ enabled: true, consensusStrategy: 'voting' })
  .addRole(CODER_ROLE)
  .addRole(REVIEWER_ROLE)
  // ReAct (Lesson 18)
  .react({ enabled: true, maxSteps: 15 })
  // Execution Policies (Lesson 23)
  .executionPolicy({
    enabled: true,
    defaultPolicy: 'prompt',
    intentAware: true,
  })
  // Thread Management (Lesson 24)
  .threads({
    enabled: true,
    autoCheckpoint: true,
  })
  .build();

11. Events¶

agent.subscribe(event => {
  switch (event.type) {
    // Core events
    case 'start':          // Task started
    case 'llm.start':      // LLM call starting
    case 'llm.chunk':      // Streaming chunk
    case 'llm.complete':   // LLM call done
    case 'tool.start':     // Tool execution starting
    case 'tool.complete':  // Tool execution done
    case 'complete':       // Task finished

    // ReAct events (Lesson 18)
    case 'react.thought':       // Reasoning step
    case 'react.action':        // Action being taken
    case 'react.observation':   // Result observed

    // Multi-agent events (Lesson 17)
    case 'multiagent.spawn':    // Agent spawned
    case 'multiagent.complete': // Agent finished
    case 'consensus.reached':   // Team decision

    // Policy events (Lesson 23)
    case 'policy.evaluated':    // Access decision
    case 'intent.classified':   // Intent detected
    case 'grant.created':       // Permission granted

    // Thread events (Lesson 24)
    case 'thread.forked':       // Branch created
    case 'checkpoint.created':  // State saved
    case 'checkpoint.restored': // State restored
    case 'rollback':            // Messages removed

    // Mode events
    case 'mode.changed':        // Mode switched
  }
});

12. Agent Modes¶

Switch between operational modes for safety and focus:

Mode	Tools	Purpose
`build`	All	Full access to modify files
`plan`	Read-only	Exploration and planning
`review`	Read-only	Code review focus
`debug`	Read + Test	Debugging with diagnostics

13. Built-in Roles¶

For multi-agent coordination (Lesson 17):

Role	Capabilities	Authority	Model
`researcher`	explore, search, find	5	fast
`coder`	write, implement, fix	8	balanced
`reviewer`	review, check, audit	7	quality
`architect`	design, plan, structure	9	quality
`debugger`	debug, trace, diagnose	6	balanced
`documenter`	document, explain	4	fast

14. Tricks Integration¶

The following tricks are integrated into the production agent:

Trick	Production Module	Status
A: Structured Output	Inline JSON extraction	Enhanced
B: Token Counter	`economics.ts`	Enhanced
C: Prompt Templates	`rules.ts`	Simplified
D: Tool Batching	`agent.ts`	Enhanced
E: Context Sliding	`compaction.ts`	Enhanced
F: Semantic Cache	`semantic-cache.ts`	Enhanced
G: Rate Limiter	Error handling	Embedded
H: Branching	`thread-manager.ts`	Enhanced
I: File Watcher	Not used	Extension point
J: LSP Client	`lsp.ts`	Enhanced
K: Cancellation	`cancellation.ts`	Enhanced
L: Sortable IDs	ID generation	Inline
M: Thread Manager	`thread-manager.ts`	Enhanced
N: Resource Monitor	`resources.ts`	Enhanced
O: JSON Utils	Tool parsing	Inline

15. Testing¶

# All tests
npm test

# Production agent tests only
npm test -- --grep "Lesson 25"

# Watch mode
npm test -- --watch

16. Extending¶

Custom Tools¶

const myTool: ToolDefinition = {
  name: 'my_tool',
  description: 'Does something useful',
  parameters: {
    type: 'object',
    properties: {
      input: { type: 'string', description: 'Input value' },
    },
    required: ['input'],
  },
  dangerLevel: 'safe',
  execute: async ({ input }) => {
    return `Processed: ${input}`;
  },
};

agent.registerTool(myTool);

Custom Agents¶

Create agents in .agents/ directory:

# .agents/security-reviewer.yaml
name: security-reviewer
description: Reviews code for security vulnerabilities
systemPrompt: |
  You are a security expert. Focus on:
  - Injection vulnerabilities
  - Authentication issues
  - Data exposure
tools: [read_file, grep, glob]
model: quality
capabilities: [security, vulnerability, audit]

Custom Skills¶

Create skills in .skills/ directory:

---
name: code-review
description: Detailed code review workflow
triggers: ["review this code", "check for issues"]
tags: [review, quality]
---

# Code Review Skill

When reviewing code, follow this approach:

1. Security Analysis
2. Code Quality
3. Performance

17. CLI Options¶

Option	Description	Default
`-m, --model <model>`	Model to use	`anthropic/claude-sonnet-4`
`-p, --permission <mode>`	Permission mode	`interactive`
`-i, --max-iterations <n>`	Max iterations	`50`
`-t, --task <task>`	Single task mode	-
`--mode <mode>`	Initial mode	`build`
`--no-memory`	Disable memory	-
`--no-sandbox`	Disable sandbox	-
`--debug`	Enable debug output	-
`-h, --help`	Show help	-

Permission Modes¶

Mode	Description
`strict`	Prompt for everything
`interactive`	Prompt for dangerous actions
`auto-safe`	Auto-allow safe actions
`yolo`	Allow everything (dangerous!)

18. Dependencies¶

Package	Purpose
`@anthropic-ai/sdk`	Anthropic API client
`openai`	OpenAI API client
`tiktoken`	Token counting
`fast-glob`	File pattern matching
`ignore`	.gitignore/.agentignore parsing
`yaml`	YAML parsing for configs
`chalk`	Terminal colors
`ora`	Terminal spinners
`commander`	CLI parsing
`readline`	Interactive input

What's Next¶

This is the culmination of the educational journey. You now have:

Lessons 1-9: Core agent fundamentals
Lessons 10-22: Individual advanced features
Lesson 23: Execution Policies & Intent Classification
Lesson 24: Thread Management & Advanced Patterns
Lesson 25: Everything integrated into one powerful tool

Use this agent for actual coding tasks, or study the code to see how all the pieces fit together!