Writing
On the decisions that shape enterprise AI, data platforms, and technology organizations.
Multi-Agent AI
The Agent Identity Gap: Why MCP and A2A Need Verifiable Delegation
A scan of ~2,000 MCP servers found all lacked authentication. OAuth 2.1 covers single-hop auth, but multi-agent delegation chains need more. How verifiable identity connects to routing, provenance, and reasoning.
When Smarter AI Isn't Worth It
Most teams over-engineer simple tasks and under-engineer complex ones. A framework for matching AI architecture complexity to task complexity — with data on when 62x more compute is the right call.
The Missing Agent Stack: Identity + Durable Execution with LDP and JamJet
Code-first walkthrough: build identity-aware agent routing with LDP identity cards and JamJet's Coordinator. Difficulty-based scoring, provenance-weighted synthesis, runnable example.
From Debate to Deliberation: When Multi-Agent Reasoning Needs Structure
Multi-agent debate is unstructured and lossy. DCI introduces typed reasoning moves, preserved disagreements, and guaranteed convergence — with honest results on when the 62x token cost is justified.
Why Multi-Agent AI Systems Need Identity-Aware Routing
Current agent protocols treat all models as interchangeable black boxes. LDP introduces identity-aware delegation — and the results show where it matters and where it doesn't.
From NER Pipelines to LLM Agents: How Production NLP Changed in Seven Years
Seven years from BiLSTM-CRF at a startup to multi-agent protocols — what changed in production NLP, what stayed the same, and what the arc tells us about where we are going.
Enterprise AI & Governance
Your ML Risk Framework Wasn't Built for GenAI. Here's What's Missing.
Why traditional model risk management fails for LLMs — hallucination policy, GenAI risk dimensions, deployment gates, prompt audit trails, and what a complete governance framework looks like.
The Year LLMs Met Compliance — And Compliance Wasn't Ready
GPT-4 is genuinely capable. Enterprise wants to use it. But the governance frameworks built for classical ML — model validation, risk management, audit trails — were not designed for non-deterministic models with emergent capabilities.
GPT-3 Changed the Game — Is Enterprise Ready?
175 billion parameters, few-shot learning, and an API. GPT-3 shows where language models are going. But cost, latency, data privacy, and governance stand between the demo and production.
AI Leadership & Strategy
Most Companies Know Their AI Spend. Almost None Know Their AI Readiness.
Most organizations know their AI spend. Almost none can answer basic questions about whether they’re ready to absorb what they’re buying. A diagnostic across five readiness dimensions, and a 10-minute assessment tool to find the gaps.
The Middle Management AI Gap
Boards are excited about AI. Engineers are building it. But the layer between the boardroom and the engineering floor is where most enterprise AI programs quietly die — and nobody talks about it.
Why Most Enterprise AI Programs Fail Before They Start
The failure mode isn't technical. It's organizational. Most enterprises skip the operating model and jump straight to tooling — then wonder why nothing scales past the pilot.
The AI Center of Excellence Is Dead. Long Live the AI Operating Model.
Centers of Excellence centralize capability. Operating models distribute it. The difference determines whether AI scales or stalls at the enterprise level.
Why I Chose Regulated AI Over Startup Speed
After three years building AI at a startup, I moved to a global bank. Not despite the constraints — because of them. Regulated environments are where AI gets hardest and most valuable.
Data Architecture
Why We Chose dbt Over BigQuery Dataform
JavaScript supply chain risk, SQL-first philosophy, multi-warehouse portability, and what actually matters when choosing a transformation framework in a regulated enterprise.
Data Vault 2.0 in Banking: Architecture for the Audit That Hasn't Happened Yet
Why insert-only modeling, hash keys, and full lineage aren't academic exercises — they're survival patterns in regulated environments where every record must be defensible.
Building Data Foundations While Everyone Chased Models
The industry is obsessed with models. Meanwhile, the actual bottleneck for enterprise AI is data infrastructure — Data Vault 2.0, BCBS 239, lineage, and the unsexy work that makes everything else possible.
NLP & Machine Learning
We Switched to PyTorch in 2020. Was It the Right Call?
Four years after switching from TensorFlow to PyTorch during the pandemic — what happened to TF, why PyTorch won, and why inference engines like vLLM matter more than training frameworks now.
RAG in Production: What Breaks When You Move Past the Tutorial
Every RAG tutorial works. Then you move to production and discover that chunking strategy matters more than embedding models, retrieval quality matters more than generation, and evaluation is the hardest problem.
Switching from TensorFlow to PyTorch — A Practical Assessment
We switched our NLP projects from TensorFlow 1.x to PyTorch mid-pandemic. Eager execution, HuggingFace, debugging, and what we gave up. A practical framework comparison for a small team.
Building NLP Pipelines Before Transformers Were Easy
What production NER actually looks like: BiLSTM-CRF, character embeddings, gazetteers, and the pipeline engineering that matters more than the model. Lessons from building NLP systems at a startup.
Attention Is All You Need — A Practitioner's Guide to the Transformer
Breaking down the paper that replaced RNNs, LSTMs, and sequence-to-sequence models with a single mechanism: attention. What the Transformer does, why it works, and why it matters.
Classifying 7,000 Product Codes from Four Words of Text
Comparing LSTM, BiLSTM, GRU, CNN, and hybrid architectures for HS code classification from short product descriptions. What worked, what didn't, and why CNNs beat RNNs on short text.