The Missing Agent Stack: Identity + Durable Execution with LDP and JamJet
Three agents. One question. The router has to pick. Every multi-agent framework solves this the same way: match skill labels. The agent says “I do research,” the router says “good enough.”
Here’s what happens when you add identity — metadata about who each agent is, not just what it claims to do:
What is the capital of France?
    0.936  quick-lookup       [fast, cost=low]
    0.618  domain-specialist  [domain-expert, cost=medium]
    0.442  deep-analyst       [analytical, cost=high]

Analyze the impact of QE on emerging market debt sustainability
    1.000  domain-specialist  [domain-expert, cost=medium]
    0.933  deep-analyst       [analytical, cost=high]
    0.606  quick-lookup       [fast, cost=low]

What are the tax implications of Roth IRA conversions?
    0.800  domain-specialist  [domain-expert, cost=medium]
    0.668  quick-lookup       [fast, cost=low]
    0.668  deep-analyst       [analytical, cost=high]
Easy question → fast, cheap agent wins. Hard question → analytical agent wins. Domain question → domain expertise breaks the tie. All decided in ~2ms, no LLM call for routing. Without identity cards, the router sees three identical “research” agents and picks randomly.
This is actual output from a runnable example using LDP for identity and JamJet for execution. Let’s walk through how it works.
The Stack
LangGraph gives you execution. MCP gives you tool communication. A2A gives you agent-to-agent messaging. None of them tell you who the agent is, whether it should handle a given task, or what happened after it did. LDP fills identity and governance. JamJet fills execution with durability — crash recovery, checkpointing, cost caps enforced at the Rust layer. Together they cover the full stack.
Identity Cards
In A2A, an agent exposes a name and a skill list. In LDP, an agent carries an identity card — structured metadata about the model behind it. Here’s the deep-analyst agent:
AgentCandidate(
    uri="jamjet://research/deep-analyst",
    agent_card={
        "name": "Deep Analyst",
        "labels": {
            "ldp.delegate_id": "ldp:delegate:deep-analyst",
            "ldp.model_family": "Claude",
            "ldp.reasoning_profile": "analytical",
            "ldp.cost_profile": "high",
            "ldp.latency_profile": "p50:8000ms",
            "ldp.quality_score": "0.92",
        },
    },
    skills=["deep-analysis", "reasoning", "multi-step"],
    trust_domain="research",
)
A reasoning_profile of “analytical” means the model handles multi-step reasoning but is slow. A quality_score of 0.92 means it’s excellent but expensive. These fields are protocol-level — declared once, available to any router, no LLM call to access them. The quick-lookup and domain-specialist agents follow the same pattern with different profiles. Full definitions →
Routing Strategy
JamJet’s Coordinator supports pluggable strategies. We subclass DefaultCoordinatorStrategy to score agents using their LDP identity metadata. The core idea: classify task difficulty with heuristics (~2ms), then weight five dimensions differently based on difficulty.
class LdpCoordinatorStrategy(DefaultCoordinatorStrategy):
    async def score(self, task, candidates, weights, context):
        difficulty = classify_difficulty(task)  # keyword heuristics, ~2ms
        for c in candidates:
            labels = c.agent_card.get("labels", {})
            reasoning = labels.get("ldp.reasoning_profile", "")
            quality = float(labels.get("ldp.quality_score", "0.5"))
            # Hard tasks want analytical profiles + high quality;
            # easy tasks want fast profiles + low cost.
            reasoning_fit = 1.0 if (
                (difficulty == "hard" and reasoning == "analytical") or
                (difficulty == "easy" and reasoning == "fast")
            ) else 0.5
            # Dynamic weights shift by difficulty
            ldp_weights = {
                "capability_fit": 2.0,
                "cost_fit": 1.5 if difficulty == "easy" else 0.5,
                "historical_performance": 2.5 if difficulty == "hard" else 1.0,
            }
            # ... compute composite score
Easy tasks weight cost at 1.5 and historical quality at only 1.0. Hard tasks flip it: quality gets 2.5 and cost drops to 0.5. Domain matches add a bonus. The result: the fast, cheap agent scores 0.94 on “capital of France” while the expensive analyst scores 0.44, exactly the ordering you want.
Full strategy implementation →
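The strategy above leans on classify_difficulty(), which the example implements with keyword heuristics. A sketch of what such a classifier can look like; the marker lists below are illustrative assumptions, not the example's actual word lists:

```python
# Illustrative keyword-heuristic difficulty classifier. The real example's
# marker lists may differ; the shape (substring checks, ~microseconds per
# call, no LLM involved) is the point.

HARD_MARKERS = ("analyze", "impact", "evaluate", "synthesize", "compare")
EASY_MARKERS = ("what is", "what are", "who is", "capital of")

def classify_difficulty(task: str) -> str:
    text = task.lower()
    if any(marker in text for marker in HARD_MARKERS):
        return "hard"
    if any(marker in text for marker in EASY_MARKERS):
        return "easy"
    return "medium"

print(classify_difficulty("What is the capital of France?"))  # easy
print(classify_difficulty("Analyze the impact of QE on EM debt"))  # hard
```

A classifier this crude will misfile some tasks, but the dynamic weights only shift emphasis, so a misclassification degrades the ranking rather than breaking it.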
Provenance-Weighted Synthesis
Routing picks the best agent, but sometimes you want all perspectives. The example fans out to all three agents in parallel, then weights their answers using LDP provenance — metadata attached to every result about who produced it:
provenance = {
    "produced_by": "ldp:delegate:deep-analyst",
    "reasoning_profile": "analytical",
    "quality_score": 0.92,
    "confidence": 0.88,  # self-reported for this specific answer
}
Three signals determine how much each answer influences the synthesis:
- Routing rank — who the Coordinator scored highest (1st = weight 1.0, 2nd = 0.5, 3rd = 0.33)
- Quality score — declared capability from the identity card
- Confidence — how certain the agent is about this specific answer
For the QE analysis question, the domain-specialist gets weight 0.86 (rank #1, domain match). The deep-analyst gets 0.74 (rank #2 but quality 0.92, confidence 0.88). The quick-lookup gets 0.41 (rank #3, confidence 0.45 — even the agent knows it’s out of its depth).
rank_weight = 1.0 / (1 + rank)  # positional decay; rank is 0-indexed
combined = (rank_weight * 0.4
            + quality_score * 0.35
            + confidence * 0.25)
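Plugging the deep-analyst's numbers into this formula reproduces its 0.74 weight (the function wrapper is mine; the constants are the ones above):

```python
def synthesis_weight(rank: int, quality_score: float, confidence: float) -> float:
    """Combine routing rank, declared quality, and self-reported confidence.

    rank is 0-indexed: rank 0 -> weight 1.0, rank 1 -> 0.5, rank 2 -> 0.33.
    """
    rank_weight = 1.0 / (1 + rank)  # positional decay
    return rank_weight * 0.4 + quality_score * 0.35 + confidence * 0.25

# deep-analyst: ranked 2nd (rank=1), quality 0.92, confidence 0.88
print(round(synthesis_weight(1, 0.92, 0.88), 2))  # 0.74
```

The three coefficients sum to 1.0, so the combined weight stays in [0, 1] as long as quality and confidence do.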
This implements the LDP paper’s RQ3 finding: structured provenance improves synthesis when signals are accurate. When provenance is noisy — inflated confidence, unverified claims — it actually degrades quality below the no-provenance baseline. Honest metadata beats detailed metadata.
Durable Execution
Identity-aware routing is a protocol concern. But once you’ve made a smart routing decision and are making real API calls, you need the execution to be reliable.
JamJet checkpoints every step automatically. If the deep-analyst crashes mid-analysis, the runtime resumes from the last checkpoint — it doesn’t re-route, re-execute the other agents, or lose their provenance. The synthesis step has everything it needs after recovery.
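JamJet's checkpointing lives in the Rust runtime; the following is only a toy Python sketch of the resume semantics just described. The store, step names, and run_step helper are all hypothetical:

```python
# Toy sketch of resume-from-checkpoint semantics. JamJet's real
# implementation is in Rust; everything here is illustrative.

checkpoints: dict[str, dict] = {}  # step name -> saved result

def run_step(name: str, fn):
    """Run a step at most once; after a crash, replay the saved result."""
    if name in checkpoints:
        return checkpoints[name]["result"]  # resume: skip re-execution
    result = fn()
    checkpoints[name] = {"result": result}  # persist before moving on
    return result

# First run: both agent steps execute and are checkpointed.
run_step("quick-lookup", lambda: "Paris")
run_step("deep-analyst", lambda: "detailed analysis")

# After a simulated crash, re-running replays checkpoints instead of
# re-calling the agents.
calls = []
run_step("quick-lookup", lambda: calls.append("re-executed"))
print(calls)  # [] -- the checkpoint was replayed, not re-executed
```

The property that matters for synthesis: completed steps (and their provenance) survive the crash, so only the interrupted step resumes.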
Cost caps are the other critical piece. LDP identity cards declare cost profiles, but declaration is not enforcement. JamJet enforces budget limits at the Rust layer — not a prompt that says “please stay under $2,” but an actual runtime constraint that terminates execution when the budget is hit.
Agent(
    "deep-analyst",
    model="claude-sonnet-4-6",
    max_cost_usd=2.0,  # hard stop, enforced at Rust layer
    max_iterations=15,
)
The identity card says what the agent should cost. The runtime ensures it actually costs that.
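The semantics of a hard cap can be sketched in a few lines. This is not JamJet's API; the CostMeter class and BudgetExceeded exception are hypothetical names for the behavior described above, assuming a per-call cost estimate is available:

```python
# Sketch of hard budget enforcement. JamJet enforces this in Rust; the
# point of the sketch is that the cap terminates execution rather than
# merely warning. All names here are hypothetical.

class BudgetExceeded(RuntimeError):
    pass

class CostMeter:
    def __init__(self, max_cost_usd: float):
        self.max_cost_usd = max_cost_usd
        self.spent = 0.0

    def charge(self, cost_usd: float) -> None:
        """Record spend; stop the run the moment the cap is crossed."""
        self.spent += cost_usd
        if self.spent > self.max_cost_usd:
            raise BudgetExceeded(
                f"spent ${self.spent:.2f}, cap ${self.max_cost_usd:.2f}"
            )

meter = CostMeter(max_cost_usd=2.0)
meter.charge(0.9)  # fine
meter.charge(0.9)  # fine, $1.80 total
try:
    meter.charge(0.9)  # $2.70 > $2.00 -> hard stop
except BudgetExceeded as exc:
    print("terminated:", exc)
```

Raising from inside the charge path, rather than checking after the fact, is what makes the cap a constraint instead of a report.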
Try It
git clone https://github.com/jamjet-labs/jamjet
python examples/ldp-identity-routing/main.py
Runs with mock responses, no API key needed. The output shows routing scores, each agent’s response with provenance, and the weighted synthesis.
Using real LLMs instead of mocks →
Set ANTHROPIC_API_KEY, uncomment the REAL_AGENTS block in main.py, and replace the mock execute_agent() with agent.run() calls. Same routing logic, same provenance, real model responses. See the README for details.
Where This Fits
LDP and JamJet aren’t competitors to A2A or MCP — they’re complementary layers. A2A handles agent-to-agent messaging. MCP handles tool access. LDP tells you who the agent is and whether it should handle a task. JamJet makes sure it executes reliably and within budget.
The example is deliberately simple: three agents, three questions. But the architecture scales. More agents make identity-aware routing more valuable, because the cost of misrouting grows. Real API calls make durable execution essential, because crashes happen. And cross-organizational agents make trust domains the security boundary that keeps untrusted models away from sensitive tasks.
The protocol layer is where these properties should live. Not in application code. Not in prompts. In the infrastructure.
Related
- Why Multi-Agent AI Systems Need Identity-Aware Routing — the LDP protocol deep-dive
- From Debate to Deliberation — the DCI framework for structured multi-agent reasoning
LDP paper: arXiv:2603.08852. JamJet: jamjet-labs/jamjet. LDP reference implementation: ldp-protocol.
Want the visual version? See the interactive LDP paper with animated diagrams and experiment charts.