Code Reference

Source code for all examples and projects is available on GitHub.

Shared (`src/shared/`)

Utilities used across all chapters and projects.

| Module | Description |
| --- | --- |
| `model_client.py` | Thin wrapper over the LLM API: handles retries, timeouts, and response parsing |
| `types.py` | Shared data types: `Message`, `ToolCall`, `ToolResult`, `AgentState` |
| `config.py` | Environment configuration: model names, API keys, default parameters |
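
To make the shared data types concrete, here is a minimal sketch of how the four types named for `types.py` could be modeled as dataclasses. The field names (`role`, `content`, `arguments`, `output`, `error`, `step`) are assumptions for illustration and may not match the fields in the repository.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List, Optional


@dataclass(frozen=True)
class Message:
    role: str      # assumed values: "user", "assistant", "tool"
    content: str


@dataclass(frozen=True)
class ToolCall:
    name: str                  # name of the tool the model wants to run
    arguments: Dict[str, Any]  # keyword arguments for that tool


@dataclass(frozen=True)
class ToolResult:
    call: ToolCall
    output: Any = None
    error: Optional[str] = None  # set when the tool failed

    @property
    def ok(self) -> bool:
        return self.error is None


@dataclass
class AgentState:
    # Mutable on purpose: the loop appends to it on every step.
    messages: List[Message] = field(default_factory=list)
    tool_results: List[ToolResult] = field(default_factory=list)
    step: int = 0
```

Keeping the message and tool types frozen while leaving `AgentState` mutable is one possible design choice: individual records stay immutable once created, and only the container that accumulates them changes.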

Building blocks: tool registry, context engineering, the agent loop.

| Module | Description |
| --- | --- |
| `tool_registry.py` | Tool registration, schema generation, dispatch, and permission checking |
| `tools/` | Individual tools: `document_loader`, `chunker`, `retriever`, `extractor` |
| `context.py` | Context window management: budget tracking, trimming, priority ordering |
| `agent.py` | Observe-think-act loop implementation |
| `run.py` | CLI entry point for Chapter 2 examples |
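
The registration, dispatch, and permission-checking responsibilities listed for `tool_registry.py` can be sketched as follows. This is an illustrative sketch, not the repository's implementation: the `ToolRegistry` class, its `register` and `dispatch` methods, and the string-based permission model are all assumptions, and schema generation is left out for brevity.

```python
from typing import Any, Callable, Dict, List, Set


class ToolRegistry:
    """Hypothetical registry: maps tool names to callables and permissions."""

    def __init__(self) -> None:
        self._tools: Dict[str, Callable[..., Any]] = {}
        self._required: Dict[str, str] = {}

    def register(self, name: str, permission: str) -> Callable:
        """Decorator that records a tool and the permission it requires."""
        def decorator(fn: Callable[..., Any]) -> Callable[..., Any]:
            self._tools[name] = fn
            self._required[name] = permission
            return fn
        return decorator

    def dispatch(self, name: str, arguments: Dict[str, Any], granted: Set[str]) -> Any:
        """Check the tool exists and is permitted, then call it."""
        if name not in self._tools:
            raise KeyError(f"unknown tool: {name}")
        needed = self._required[name]
        if needed not in granted:
            raise PermissionError(f"tool {name!r} requires {needed!r}")
        return self._tools[name](**arguments)


registry = ToolRegistry()


@registry.register("chunker", permission="read")
def chunker(text: str, size: int) -> List[str]:
    # Split text into fixed-size pieces; the last piece may be shorter.
    return [text[i:i + size] for i in range(0, len(text), size)]
```

With this sketch, `registry.dispatch("chunker", {"text": "abcdefg", "size": 3}, granted={"read"})` returns `["abc", "def", "g"]`, while the same call with an empty `granted` set raises `PermissionError` before the tool runs.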

Workflow-first architecture: fixed pipeline vs bounded agent.

| Module | Description |
| --- | --- |
| `workflow.py` | Deterministic pipeline: retrieve, build context, answer in sequence |
| `agent.py` | Bounded agent: can refine search and escalate, with step limits |
| `state.py` | Explicit state management: tracks what the agent has tried and seen |
| `compare.py` | Side-by-side comparison: runs both implementations on the same queries |
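
The idea of a bounded agent with explicit state can be sketched in a few lines. The function below is an assumption-laden illustration: `bounded_search`, `RunState`, and the injected `search` and `refine` callables are hypothetical names, not the repository's API. It refines the query until it gets hits or runs out of steps, and escalates in the latter case.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Set, Tuple


@dataclass
class RunState:
    """Hypothetical explicit state: what was tried and what was seen."""
    tried_queries: List[str] = field(default_factory=list)
    seen_docs: Set[str] = field(default_factory=set)


def bounded_search(
    query: str,
    search: Callable[[str], List[str]],
    refine: Callable[[str], str],
    max_steps: int = 3,
) -> Tuple[str, RunState]:
    """Try up to max_steps queries; escalate if none returns documents."""
    state = RunState()
    current = query
    for _ in range(max_steps):
        state.tried_queries.append(current)
        hits = search(current)
        if hits:
            state.seen_docs.update(hits)
            return "answered", state
        current = refine(current)  # rewrite the query and try again
    return "escalated", state
```

The step limit is what separates this from an open-ended loop: the worst case is `max_steps` search calls followed by an escalation, and the returned state records every query that was attempted.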

Multi-agent coordination: typed contracts, specialized agents, verification loop.

| Module | Description |
| --- | --- |
| `contracts.py` | Typed message contracts for inter-agent communication |
| `agents.py` | `RetrieverAgent`, `ReasoningAgent`, `VerifierAgent` |
| `orchestrator.py` | Multi-agent coordinator with verification loop |
| `compare.py` | Single-agent vs multi-agent comparison runner |
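
A verification loop over typed contracts might look roughly like the sketch below. The contract types (`Evidence`, `Draft`, `Verdict`), the `orchestrate` function, and the plain callables standing in for the three agents are hypothetical; the repository's classes may be shaped differently.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Tuple


@dataclass(frozen=True)
class Evidence:
    passages: Tuple[str, ...]


@dataclass(frozen=True)
class Draft:
    answer: str
    cited: Tuple[str, ...]  # passages the answer claims to rely on


@dataclass(frozen=True)
class Verdict:
    accepted: bool
    reason: str = ""


def orchestrate(
    question: str,
    retrieve: Callable[[str], Evidence],
    reason: Callable[[str, Evidence, str], Draft],
    verify: Callable[[Draft, Evidence], Verdict],
    max_rounds: int = 2,
) -> Optional[Draft]:
    """Retrieve, draft, and verify; feed rejections back until accepted."""
    feedback = ""
    for _ in range(max_rounds):
        evidence = retrieve(question)
        draft = reason(question, evidence, feedback)
        verdict = verify(draft, evidence)
        if verdict.accepted:
            return draft
        feedback = verdict.reason  # the next draft sees why this one failed
    return None  # no draft passed verification within the round limit
```

Because each hand-off is a frozen dataclass, an agent cannot silently mutate what another agent produced, and the coordinator only needs to know the three contract types, not how each agent works internally.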

Human-in-the-loop: approval gates, escalation policy, structured audit logging.

| Module | Description |
| --- | --- |
| `approval.py` | Approval gate middleware with auto-approve and human review |
| `escalation.py` | Escalation policy engine with per-risk-tier rules |
| `audit.py` | Append-only structured audit logging |
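
The interaction between an approval gate, risk tiers, and an append-only audit log can be illustrated with a small sketch. Everything here is an assumption made for the example: the tier names, the `AUTO_APPROVE_TIERS` policy, the `AuditLog` class, and the `approval_gate` signature are not taken from the repository.

```python
import json
from typing import Callable, Dict, List

# Hypothetical policy: only low-risk actions skip human review.
AUTO_APPROVE_TIERS = {"low"}


class AuditLog:
    """In-memory append-only log of JSON records (no update or delete)."""

    def __init__(self) -> None:
        self._lines: List[str] = []

    def append(self, record: Dict) -> None:
        self._lines.append(json.dumps(record, sort_keys=True))

    def records(self) -> List[Dict]:
        return [json.loads(line) for line in self._lines]


def approval_gate(
    action: str,
    risk_tier: str,
    ask_human: Callable[[str], bool],
    log: AuditLog,
) -> bool:
    """Auto-approve low-risk actions, ask a human otherwise, log both."""
    if risk_tier in AUTO_APPROVE_TIERS:
        approved, decided_by = True, "auto"
    else:
        approved, decided_by = bool(ask_human(action)), "human"
    log.append({
        "action": action,
        "risk_tier": risk_tier,
        "approved": approved,
        "decided_by": decided_by,
    })
    return approved
```

Every decision is logged, including auto-approvals and rejections, so the log can later answer who approved what and under which tier. A file-backed version would open the log in append mode instead of keeping a list in memory.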

Evaluation, observability, reliability, cost, and security.

| Module | Description |
| --- | --- |
| `eval_harness.py` | Runs gold dataset through the agent, scores answers, produces failure buckets |
| `tracer.py` | Structured trace logging: captures every tool call, model call, and decision |
| `reliability.py` | Retry logic, timeout handling, fallback chains, circuit breakers |
| `cost_profiler.py` | Token counting, cost estimation, per-call and per-session tracking |
| `security.py` | Input sanitization, output validation, tool permission enforcement |
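
Two of the patterns listed for `reliability.py`, retry logic and fallback chains, can be sketched as below. The function names, the exponential backoff schedule, and the injectable `sleep` parameter are assumptions chosen for the example; timeouts and circuit breakers are not covered.

```python
import time
from typing import Callable, Optional, Sequence, TypeVar

T = TypeVar("T")


def call_with_retry(
    fn: Callable[[], T],
    attempts: int = 3,
    base_delay: float = 0.5,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call fn, retrying on any exception with exponential backoff."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(attempts):
        try:
            return fn()
        except Exception:
            if attempt == attempts - 1:
                raise  # out of attempts: surface the last error
            sleep(base_delay * 2 ** attempt)  # 0.5s, 1s, 2s, ...
    raise AssertionError("unreachable")


def call_with_fallbacks(fns: Sequence[Callable[[], T]], **retry_kwargs) -> T:
    """Try each callable in order (with retries); return the first success."""
    last_error: Optional[Exception] = None
    for fn in fns:
        try:
            return call_with_retry(fn, **retry_kwargs)
        except Exception as exc:
            last_error = exc
    if last_error is None:
        raise ValueError("no callables provided")
    raise last_error
```

Passing `sleep` as a parameter keeps the backoff testable without real delays. In practice the retry would usually catch only errors known to be transient rather than every `Exception`.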