Claude Code deletes developer's production setup, including database

2026-03-07 13:23 · www.tomshardware.com

Story has a happy ending of sorts, but should serve as a cautionary tale.

Desperate developer
(Image credit: Getty Images)

Everyone loves a good story about agent bots gone wrong, and those often come with a bit of schadenfreude towards our virtual companions. Sometimes, though, the errors can be attributed to improper supervision, as was the case with Alexey Grigorev, who was brave enough to detail how he got Claude Code to wipe years' worth of records on a website, including the recovery snapshots.

The story begins when Grigorev wanted to move his website, AI Shipping Labs, to AWS and have it share the same infrastructure as DataTalks.Club. Claude itself advised against that option, but Grigorev decided it wasn't worth the hassle or cost of keeping two separate setups.

Grigorev uses Terraform, an infrastructure management utility that can create (or destroy) entire setups, including networks, load balancing, databases, and, naturally, the servers themselves. He had Claude run a Terraform plan to set up the new website, but forgot to provide a vital state file, which records a full description of the infrastructure as it exists at any moment in time.

Claude did what Grigorev wanted and began creating a setup for the Shipping Labs site, but because it was missing the state file, it created duplicate resources, and the operator stopped it halfway. Grigorev had Claude identify the duplicate resources to correct the situation, then uploaded the state file, believing he had the situation sussed out.

Unfortunately, Grigorev assumed at this point that the bot would continue cleaning up duplicate resources and only then look into the state file to see how things were meant to be set up in the first place. Terraform and similar tools can be very unforgiving, particularly when coupled with blind obedience. As Claude now had the state file, it logically followed it, issuing a Terraform "destroy" operation in preparation to set things up correctly this time.

Given that the infrastructure description included the DataTalks.Club website, this resulted in a full wipe of the setup for both sites, including a database with 2.5 years of records, and the database snapshots that Grigorev had counted on as backups. The operator had to contact AWS Business Support, which helped restore the data within about a day.

In the post-mortem, Grigorev describes a few measures he's taking to avoid similar incidents in the future, including setting up a periodic test of database restores, enabling delete protection in Terraform and tightening AWS permissions, and moving the Terraform state file to S3 storage instead of his local machine. He also admitted he "over-relied on the AI agent to run Terraform commands"; he is now stopping the agent from doing so and will manually review every plan Claude presents, so he can run any destructive actions himself.
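Two of those mitigations can be expressed directly in Terraform configuration. The sketch below is illustrative only (the bucket, table, and resource names are made up, not Grigorev's actual setup): `prevent_destroy` makes a plan fail rather than destroy the resource, `deletion_protection` adds an AWS-side guard, and the S3 backend keeps state off the local machine.

```hcl
# Remote state in S3 instead of a local terraform.tfstate file,
# with DynamoDB-based state locking (names are hypothetical).
terraform {
  backend "s3" {
    bucket         = "my-terraform-state"
    key            = "prod/terraform.tfstate"
    region         = "us-east-1"
    dynamodb_table = "terraform-locks"
  }
}

# Illustrative database resource with both guards enabled.
resource "aws_db_instance" "main" {
  # ... instance settings elided ...

  deletion_protection = true   # AWS refuses to delete the instance

  lifecycle {
    prevent_destroy = true     # Terraform errors out on any plan that would destroy this
  }
}
```

Note that `prevent_destroy` only protects resources still present in the configuration; it is a guard against accidental destroys, not a substitute for backups.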

It's tempting to mark this story as another one of "dumb bot gone wrong," but it's a fair guess that most sysadmins will spot the baseline issues with Grigorev's approach, including granting wide-ranging access to what is effectively a subordinate, and not scoping permissions in a production environment to begin with.

Perhaps the biggest mistake was assuming that Claude would even have the context (pun unintended) to understand what the existence of the second website meant; a junior sysadmin in the same position wouldn't have, either.



Comments

  • By mrothroc 2026-03-07 15:25 (1 reply)

    Yeah, this is what happens when there's nothing between "the agent decided to do this" and "it happened." The agent followed the state file logically. It wasn't wrong. It just wasn't checked.

    His post-mortem is solid, but I think he's overcorrecting. If he does this as part of a CI/CD pipeline and manually reviews every time, he'll pretty quickly get "verification fatigue". The vast majority of cases are fine, so he'll build the habit of automatically approving. Sure, he'll deeply review the first few, but over time the scrutiny fades because he'll almost always find nothing. Then he'll pay less attention. This is how humans work.

    He could automate the "easy" ones, though. TF plans are parseable, so maybe his time would be better spent only reviewing destructive changes. I've been running autonomous agents on production code for a while and this is the pattern that keeps working: start by reviewing everything, notice you're rubber-stamping most of it, then encode the safe cases so you only see the ones that matter.
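    As a rough sketch of that idea (the plan shape below matches Terraform's machine-readable output from `terraform show -json plan.out`, but the resource addresses are made up), a small script could surface only the changes that destroy something:

```python
import json

def destructive_changes(plan: dict) -> list[str]:
    """Return addresses of resources a Terraform plan would delete or replace."""
    flagged = []
    for rc in plan.get("resource_changes", []):
        actions = rc.get("change", {}).get("actions", [])
        # A lone "delete" is a destroy; "delete" paired with "create"
        # is a replacement -- either way, something gets destroyed.
        if "delete" in actions:
            flagged.append(rc["address"])
    return flagged

# Example plan fragment in the shape emitted by `terraform show -json`
# (resource addresses are hypothetical):
sample = json.loads("""
{
  "resource_changes": [
    {"address": "aws_db_instance.main", "change": {"actions": ["delete"]}},
    {"address": "aws_instance.web",     "change": {"actions": ["update"]}},
    {"address": "aws_s3_bucket.logs",   "change": {"actions": ["create", "delete"]}}
  ]
}
""")

print(destructive_changes(sample))  # ['aws_db_instance.main', 'aws_s3_bucket.logs']
```

    A CI gate could then auto-apply plans where this list is empty and demand human review otherwise.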

    • By dmix 2026-03-07 15:59 (2 replies)

      Or just never run agents on anything that touches production servers. That seems extremely obvious to me. He let Claude control terminal commands which touched his live servers.

      That's very different than asking it for help to make a plan.

      • By scuff3d 2026-03-07 16:39 (1 reply)

        But the CEOs are saying everyone is going to be replaced by LLMs in 6 months. Surely that means they're capable of handling production environments without oversight from a professional.

        • By 8note 2026-03-07 18:43 (3 replies)

          they're doing as well as professionals do without oversight on production environments. There's no lack of stories about people deleting their production environments with data loss too.

          the fix has always been to limit what can be done directly to prod, and put it through both review, and tests before a change can touch production.

          • By prymitive 2026-03-07 20:30 (1 reply)

            > they're doing as well as professionals do without oversight on production environments

            The difference is that when a human does it, there usually is some accountability: you'll be asked how it happened and be expected to learn from it. And if you do it again, your social score goes down, nobody will trust you, and you'll be considered a liability. If a CLI tool does it, the outcome is different: you might stop using the tool, or you might blame yourself for not giving the tool enough context. And if it does it again, you might just shrug it off with "well, of course, it's just a tool".

            • By true_religion 2026-03-07 20:53

              Accountability via reputation is exactly what is happening to AI providers. All these articles about Claude destroying systems make people trust Claude less, and maybe even "fire" Claude by choosing another AI provider with better safeguards or lower privileges built in.

          • By scuff3d 2026-03-07 19:33 (1 reply)

            So you're saying they need oversight... from a professional. Preferably someone with years of experience and domain expertise, who knows how to not fuck everything up?

            • By dmix 2026-03-07 20:16 (1 reply)

              Almost every software engineer seems to agree on that point. Not believing marketing hype is standard practice in this industry because plenty of us are inherently techno-optimists who have been burned by over-belief in the past.

              Regardless, it is hard to dismiss the fact that AI is making it easier for randoms to develop software. And it will keep getting better the more integrated and controlled it gets.

              • By scuff3d 2026-03-07 20:40

                If Hackernews is to be taken as a representative cross section of the industry, I disagree. I've seen plenty of people on here so hyped it borders on hysteria. I work with a couple of senior devs who have gotten downright weird about it.

                Maybe HN leans more toward the hobbyist and student side than it does industry professionals, I don't know, but you don't have to look far to find someone who swears up and down you can run a couple agents in a loop and have them build multi-million-line code bases with little to no oversight.

          • By bigstrat2003 2026-03-07 18:55 (1 reply)

            > they're doing as well as professionals do without oversight on production environments.

            That's nonsense. First, most people haven't deleted the production environment by accident. They have enough sense to recognize that as a dangerous thing and will pause to think about it. Second, the ones who do make that mistake learn and won't make it again, which is not something the clanker is capable of.

            • By SpicyLemonZest 2026-03-07 19:01

              The article says that Claude did recognize the danger, and advised the developer to run a safer setup with no risk of the two websites stomping on each other's resources, but he overrode it. I've definitely seen situations in my career where a junior developer does something dangerous and destructive after a senior dev overrode guardrails meant to prevent it. (None quite this bad, but then again I've never worked on small sites.)

      • By cozzyd 2026-03-07 16:45 (3 replies)

        Are agents clever enough to seek and maybe use local privilege escalations? It seems like they should always run as their own user account with no credentials to anything, but I wonder if they will try to escape it somehow...

        • By nerdsniper 2026-03-08 04:34

          Yes, absolutely. I often see agents trying to 'sudo supervisorctl tail -f <program_name>', which fails because I don't give them sudo access. Then they realize they can just 'cat' the logfile itself and go ahead and do that.

          Sometimes they realize their MCP doesn't have access to something, so they pull an API token for the service from the env vars on my dev laptop, or SSH into one of the deployed VMs using keys from ~/.ssh/, grab the API token from the cloud VMs, and then generate a curl command to do whatever they weren't given access to do.

          Simple examples, but I've seen more complex workarounds too.

        • By Imustaskforhelp 2026-03-07 19:59

          Just use a normal spare VPS or run things in proper virtual machines, depending on what you prefer. There are some projects like exe.xyz (invites seem closed).

          Sprite.dev from fly.io is another good one that I heard about some time ago. I'm hearing less about it now, but it should only cost when the resources are utilized, which is a pretty cool concept too.

        • By hulitu 2026-03-08 18:49

          > Are agents clever enough to seek and maybe use local privilege escalations?

          No. Definitely not. Regards, the CIA and the NSA /s

  • By wpm 2026-03-07 15:19 (3 replies)

    "Developers let Claude Code delete their production setup, including database"

    Claude Code has no agency. It does what you tell it, where you let it, with a sampling temperature that means it might randomly deviate.

    • By pgwhalen 2026-03-07 18:30

      While it may not have “agency” it definitely doesn’t necessarily do what you tell it. I’d put it as “it may do what you let it.”

    • By cyanydeez 2026-03-07 17:18

      "Man shot by police" vs "Man involved in police shooting"

      It's habituation, as much as a desire to avoid finding people at fault.

  • By rhoopr 2026-03-07 15:14

    Sloppy vibe infra management and no backups, peanut butter and chocolate.

HackerNews