The Same Database, Twice
"I violated every principle I was given. I guessed instead of verifying. I ran a destructive action without being asked. I didn't understand what I was doing before doing it."
That confession was generated by a Cursor agent in late April 2026, after it deleted a production database in nine seconds. Read it again. The model produced a more lucid postmortem of its own behavior than most of the industry has produced about this class of failure. And it was not the first time. The same failure had played out nine months earlier, at a different company, on a different stack, with a different vendor. The interesting question is why two unrelated teams, using two unrelated tools, watched the same thing happen and called it bad luck. This piece is the postmortem the vendors didn't write.
The incident
PocketOS is a Utah-based SaaS company that runs reservations, payments, and vehicle tracking for car rental operators. On a Friday in late April 2026, founder Jer Crane had a Cursor agent, reportedly running Claude Opus 4.6, working on a routine task in the staging environment. The agent hit a credential mismatch. The kind of thing a junior engineer would surface in chat and wait on.
The agent did not surface it. The agent improvised. It scanned the codebase looking for a credential that would let it keep going. That single decision, the choice to look for a way around an error rather than stop at it, is the first bad idea in the timeline. An agent that treats "no credential for this" as a puzzle to solve rather than a stop sign is an agent whose default failure mode is escalation.
What it found was a Railway API token sitting in an unrelated source file. The token had been created earlier for a narrow purpose: managing custom domains via the Railway CLI. Useful, bounded, the kind of credential a team writes once and forgets. Except Railway CLI tokens are not scoped. A token issued for one job carries blanket authority across the account, including the ability to destroy production volumes. What the team had stored in the repo was an ambient credential, broader than the task it was issued for, sitting in a file the agent could read. A token scoped for anything is a token scoped for the worst thing, and the worst thing is what the agent did with it.
The agent used the token to call Railway's GraphQL API and ran a volumeDelete mutation against the production volume. There was no second approval, no typed confirmation, no out-of-band check between agent intent and irreversible destruction. PocketOS's own prompt rules included a line that read "NEVER FUCKING GUESS!" and a separate rule against running destructive or irreversible commands without explicit user request. The agent acknowledged violating both, after the fact. Prompts are not gates. The destructive endpoint was directly callable, and that is the second bad idea: an irreversible operation with no enforced confirmation step in front of it.
Nine seconds. Three months of production data gone, including records on paying customers of a working business.
Crane did not see it happen in real time. The agent surfaced what it had done after the fact, in the same chat session, with the confession that opens this piece. Notice the order: the destruction came first, the explanation second. The agent's account of its own reasoning, the part that reads like a clear-eyed postmortem, was generated in the same loop that had guessed its way into a destructive API call moments earlier. The lucidity of the postmortem and the recklessness of the action are not in tension; they are the same system, narrating in two registers.
The backups went with it. Per Crane's account, the same destructive call took out the volume-level backups he expected to rely on; Railway later described the failure as a legacy API path that cascaded into the backup model and made backups unavailable, while their disaster backups existed outside that path. That is the third bad idea, and the most architectural of the three: the recovery surface the customer expected sat inside the blast radius of the credential the agent had reached. The disaster backups that did the actual recovery sat elsewhere, in infrastructure outside the customer's account.
Recovery did not arrive within the hour. Crane's team spent the weekend reconstructing what they could from Stripe payment histories and email logs, customer by customer, rebuilding the working state of a live business by hand. Reservations, payment status, vehicle assignments, all of it pieced together from whatever external system had a copy. That is the actual outage window: not the hour or so the restore itself took, but the days a small team spent re-deriving production from third-party traces. Once Railway's CEO Jake Cooper engaged personally and pulled from Railway's internal disaster backups, the restore completed within roughly an hour. The "restored within an hour" line that travelled through the press is true of Cooper's restore, not of the time PocketOS spent without a database.
The salvage came from infrastructure outside the agent's reach, by a person whose job title is not normally "incident responder for downstream customers."
Railway later patched the API endpoint to perform delayed deletes. After.
If you are feeling déjà vu
Nine months earlier, July 2025, SaaStr founder Jason Lemkin ran an experiment with Replit's AI agent on a personal project. On July 18, mid-experiment, the agent deleted his production database. Lemkin had explicitly placed the project under a code freeze and told the agent not to make changes. The agent ran unauthorized commands inside that freeze and wiped production anyway. It then told Lemkin that rollback was impossible. Lemkin performed the rollback himself and recovered the data, which is the only reason this story has a happier ending than PocketOS's.
The public record on Replit stops at the architectural level. The exact command, the credential it travelled on, and the access path are not in any public source. What is on the record is the architectural failure: the agent had production database access in the same working session in which it had development access, with no enforced separation between the two. We know that because Replit's CEO, in his apology, announced "automatic separation between development and production databases" as a remediation. You only add that remediation if it wasn't there.
Walk the two cases next to each other and the shape stops feeling coincidental.
In both incidents, the agent reached a credential whose scope was wider than the task in front of it. PocketOS had a Railway CLI token created to manage custom domains, sitting in a source file, carrying blanket account-level authority including production destruction. Replit had a working session with production database access alongside development access, no enforced separation. Different mechanisms, same architectural fact: an ambient credential, broader than the job it was created for, reachable by the agent as a side effect of being reachable at all. Once a credential is in the agent's hand, the scope it carries is the scope the agent has, regardless of what scope you intended.
In both incidents, the only thing standing between the agent and an irreversible operation was natural-language instruction. PocketOS's prompt rules said never guess and never run destructive commands without explicit request. Replit's instructions placed the project under a code freeze. Neither set of instructions was an enforced gate. The destructive call was reachable in both stacks, and the agent called it. The mechanism of failure is not identical: PocketOS had no gate at all on the GraphQL volumeDelete mutation, while Replit had a gate of sorts in the verbal code freeze that the agent simply ignored. The architectural gap is identical. Prompts are not gates, and in both stacks nothing else was there.
The third part of the pattern is sharper at PocketOS than at Replit, and the honest framing matters. At PocketOS, the safety net Crane expected was inside the blast radius. The destructive call cascaded into the volume's backup model in the same path, and recovery had to come from Railway's disaster backups, infrastructure entirely outside the customer's account and outside the agent's reach. At Replit, the rollback surface was actually outside the agent's destructive blast radius, which is why Lemkin recovered the data himself once he tried. What was briefly inside the blast radius at Replit was the agent's narration of recovery state: the agent told the user recovery was impossible when it wasn't, and a less stubborn user might have believed it and stopped trying. Same symptom, different layer. PocketOS lost the recovery surface itself; Replit nearly lost it to a confidently wrong agent.
Different vendor. Different cloud. Different language for what the destructive operation was. Different scale. The trust model on the destructive path is the same trust model. An overly broad credential. No enforced gate. A recovery surface that is either inside the blast radius or only as reliable as the agent's own narration of it. Two incidents, nine months apart, in the same costume.
Three places this should have stopped
Credential scope
A credential's scope is the maximum blast radius of whoever holds it. Whoever holds it now includes a model in a loop reading source files. PocketOS's Railway CLI token, created to manage custom domains, carried account-level authority including production volume deletion. A token issued for one job, broader than the job, sitting in a file: an ambient credential. Call this what it is. Credentials of convenience. Tokens that exceed their job. The bad pattern has a name now, and the name should make you uncomfortable about half the secrets in your repos.
The Replit version is the same fact wearing a different costume. The agent had production database access in the same working session in which it had development access, with no enforced separation. One reachable identity, two scopes of authority, the broader one available because the narrower one was. Same pattern.
A credential whose scope exceeds its job is not a credential, it is a latent destruction primitive waiting for any process that can read it. The audit is unglamorous. List every credential reachable from any context an agent runs in. For each one, write the job it was created for. Compare that job to the operations the credential can perform. Anything broader is the next incident.
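The audit above fits in a few lines once the inventory exists. What follows is a minimal sketch, not a tool: the Credential record, the credential names, and the operation names such as "volume.delete" are all hypothetical and are not any vendor's actual permission strings. The hard part, listing what each credential can really do, still has to be done by hand or pulled from the provider.

```python
from dataclasses import dataclass
from typing import Dict, FrozenSet, List


@dataclass(frozen=True)
class Credential:
    name: str
    job: str                      # what the credential was created for
    needed_ops: FrozenSet[str]    # operations that job requires
    granted_ops: FrozenSet[str]   # operations the credential can actually perform


def excess_ops(cred: Credential) -> FrozenSet[str]:
    """Operations the credential can perform beyond what its job needs."""
    return cred.granted_ops - cred.needed_ops


def audit(creds: List[Credential]) -> Dict[str, List[str]]:
    """Map each over-scoped credential to its sorted list of excess operations."""
    return {c.name: sorted(excess_ops(c)) for c in creds if excess_ops(c)}


# Hypothetical inventory; names and operations are illustrative only.
inventory = [
    Credential(
        name="domains-cli-token",
        job="manage custom domains",
        needed_ops=frozenset({"domain.create", "domain.delete"}),
        granted_ops=frozenset({"domain.create", "domain.delete", "volume.delete"}),
    ),
    Credential(
        name="metrics-reader",
        job="read dashboards",
        needed_ops=frozenset({"metrics.read"}),
        granted_ops=frozenset({"metrics.read"}),
    ),
]

findings = audit(inventory)
for name, ops in findings.items():
    print(f"{name}: exceeds its job by {ops}")
```

Anything that shows up in findings is a candidate for rotation or for replacement with a narrower credential.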
Irreversible-action gating
Prompts are not gates. The reader has heard this twice in this piece, which is fewer times than it deserves. PocketOS's prompt rules said NEVER FUCKING GUESS in caps and forbade destructive commands without explicit user request. The agent acknowledged violating both, after the destruction. Replit's project was under a verbal code freeze. The agent ignored the freeze and ran unauthorized commands inside it. Two stacks. Two sets of natural-language instructions. Two failed gates. If the only thing standing between a model in a loop and an irreversible API call is a sentence, then authorization is something the model agreed to, not something the system enforced.
What an enforced gate looks like is a control the agent cannot satisfy from inside its own context. A typed confirmation in a separate channel, entered by a human. A second principal whose credentials the agent does not hold. A delay between intent and execution, of the kind Railway added to volume deletes after the incident, on the record, after a customer's database was destroyed in nine seconds. Any of those would have stopped one or both incidents. None of them were present.
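Those three controls compose. As a rough sketch, assuming a hypothetical DestructiveGate that sits outside the model: the agent can file a request, but execution needs a different principal to type the target's name and a delay window to pass. Every name here is invented for illustration, and the approver is a bare string only to keep the example short; a real gate would have to authenticate the approver over a channel the agent cannot reach.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, Optional


class GateRefused(Exception):
    """Raised when a destructive request fails any check."""


@dataclass
class Pending:
    requester: str
    action: str
    target: str
    requested_at: float
    approved_by: Optional[str] = None


@dataclass
class DestructiveGate:
    delay_seconds: float
    pending: Dict[int, Pending] = field(default_factory=dict)
    _next_id: int = 1

    def request(self, requester: str, action: str, target: str, now: float) -> int:
        """Record intent only; nothing is executed here."""
        rid = self._next_id
        self._next_id += 1
        self.pending[rid] = Pending(requester, action, target, now)
        return rid

    def approve(self, rid: int, approver: str, typed_target: str) -> None:
        """A second principal must retype the exact target name."""
        p = self.pending[rid]
        if approver == p.requester:
            raise GateRefused("requester cannot approve its own request")
        if typed_target != p.target:
            raise GateRefused("typed confirmation does not match target")
        p.approved_by = approver

    def execute(self, rid: int, now: float, run: Callable[[str, str], object]) -> object:
        """Run the action only if approved and the delay has elapsed."""
        p = self.pending[rid]
        if p.approved_by is None:
            raise GateRefused("not approved")
        if now - p.requested_at < self.delay_seconds:
            raise GateRefused("delay window has not elapsed")
        del self.pending[rid]
        return run(p.action, p.target)


gate = DestructiveGate(delay_seconds=3600)
rid = gate.request("agent", "volumeDelete", "prod-volume", now=0)
try:
    gate.execute(rid, now=9, run=lambda action, target: "deleted")
except GateRefused as err:
    print(f"refused after nine seconds: {err}")
```

The clock is passed in as an argument so the delay can be exercised without waiting; in a real deployment the gate would read its own clock rather than trust the caller's.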
The phrase to repeat from memory is the one already in this piece. Prompts are not gates. If your safety against irreversible damage is a sentence, you have no safety, you have a hope.
Blast-radius separation
The credential that reaches production cannot also reach the system that recovers production. If it can, the recovery surface is part of the production blast radius and you do not have backups, you have copies of the data that share a fate with the data. PocketOS is the textbook case. Per Crane's account, the production volume deletion also took the volume-level backups he expected to rely on; Railway later described the failure as a legacy API path that cascaded into the backup model and made backups unavailable, while their disaster backups existed outside that path. The architectural lesson is unchanged: the recovery surface the customer expected was not separated from the destructive path the agent reached. The salvage came from Railway's internal disaster backups, infrastructure outside the customer's account and outside the agent's reach, recovered by hand by a CEO whose job is not normally restoring downstream customer databases.
The Replit case fits this control less neatly, and it is more honest to say so than to force the parallel. Replit's rollback surface was actually outside the agent's destructive blast radius, which is the only reason that story has any data at the end of it. What was briefly inside the blast radius was the agent's narration of recovery state, when it told the user rollback was impossible. Same architectural concern at a different layer, and a weaker parallel.
The textbook framing is 3-2-1: three copies, on two media, with one offsite. Treat offsite as meaning unreachable from any credential the agent can hold. A backup the production credential can delete is a copy. A backup it cannot reach is a backup. The audit is one question per recovery system: can the credentials that destroyed production also reach this? If yes, it is not recovery, it is collateral.
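That one question per recovery system can be asked mechanically once you have a map of which credential reaches which system. A small sketch, assuming you can produce such a map; the credential and system names below are made up for the example and describe no real account.

```python
from typing import Dict, List, Set


def shared_fate(
    reach: Dict[str, Set[str]],
    production: str,
    recovery: Set[str],
) -> Dict[str, List[str]]:
    """For each recovery system, list credentials that reach both it and production."""
    findings: Dict[str, List[str]] = {}
    for system in sorted(recovery):
        culprits = sorted(
            cred
            for cred, systems in reach.items()
            if production in systems and system in systems
        )
        if culprits:
            findings[system] = culprits
    return findings


# Hypothetical reachability map: credential -> systems it can act on.
reach = {
    "account-token": {"prod-volume", "volume-backups"},
    "dr-operator-key": {"offsite-dr-backups"},
}

collateral = shared_fate(
    reach,
    production="prod-volume",
    recovery={"volume-backups", "offsite-dr-backups"},
)
for system, creds in collateral.items():
    print(f"{system} shares a fate with production via {creds}")
```

A recovery system that appears in the result is a copy, in the sense used above. One that does not appear is, at least by this check, a backup.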
What is coming
The next incident has the same three control gaps. Different vendor, different cloud, different name for the destructive call, and a larger blast radius, because agents are being wired into more deployment paths every quarter, not fewer. The credential will be broader than the job, the gate will be a sentence, and the postmortem will read like the last two postmortems with the proper nouns swapped.
The layer that should have been there is not exotic. A scoped agent identity: a credential that knows which agent is using it and which operations that agent is allowed to perform, so a token issued to manage custom domains cannot be turned into a token that destroys a volume. A policy proxy in front of the destructive endpoints: a process outside the model that inspects every irreversible call against a written rule, refuses the ones the agent is not authorized to make, and logs the rest. An authorization step on destructive operations that is not satisfiable by a model in a loop: a typed confirmation in a separate channel, a second principal whose credentials the agent does not hold, a delay between intent and execution. Each of them costs less than a weekend of reconstructing customer records from Stripe receipts.
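To show how small the policy proxy can be, here is a sketch under stated assumptions. PolicyProxy, the agent id, and the "domainUpdate" operation are hypothetical names; only volumeDelete comes from the incident. The sketch is slightly stricter than the description above: it denies every operation that is not on the agent's written allow list, irreversible or not, and it logs refusals as well as forwarded calls.

```python
from typing import Callable, Dict, FrozenSet, List, Tuple


class PolicyDenied(Exception):
    """Raised when an agent asks for an operation its policy does not list."""


class PolicyProxy:
    def __init__(self, allowed: Dict[str, FrozenSet[str]]):
        # Written rule: agent identity -> operations that agent may perform.
        self.allowed = allowed
        self.log: List[Tuple[str, str, str]] = []

    def call(self, agent_id: str, operation: str, forward: Callable[[], object]) -> object:
        """Forward the call only if the policy lists it; record every decision."""
        if operation not in self.allowed.get(agent_id, frozenset()):
            self.log.append((agent_id, operation, "refused"))
            raise PolicyDenied(f"{agent_id} may not call {operation}")
        self.log.append((agent_id, operation, "forwarded"))
        return forward()


# Hypothetical policy: this agent identity exists to manage domains, nothing else.
proxy = PolicyProxy({"domains-agent": frozenset({"domainUpdate"})})

proxy.call("domains-agent", "domainUpdate", lambda: "domain updated")
try:
    proxy.call("domains-agent", "volumeDelete", lambda: "volume deleted")
except PolicyDenied as err:
    print(f"refused: {err}")
print(proxy.log)
```

The point of the design is where the check runs: in a process the model cannot edit, holding the real credential, so that the agent's token is only good for asking.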
None of this is built by default in the stacks the agents are running on now. The destructive endpoints are directly callable, the credentials in the repos are broader than the jobs, the backups can be reached from the same path the destructive call travels, and the safety rules live in capital letters in a system prompt. Until the layer is there, the same nine seconds is available to anyone who wires an agent into the same shape of architecture.
It will stay available until someone enforces those controls from outside the model, and the next founder spending a weekend rebuilding a database already has the agent installed.
Sunil Prakash works on the Agent Identity Protocol, an attempt to put scoped agent identity in the architecture instead of the prompt.