We are building data breach machines and nobody cares

2026-03-10 14:50 · idealloc.me

A few weeks ago, my longstanding friend and colleague Curt Cunning mentioned to me that he was slogging through Nietzsche, which bespeaks his incredible will to power through things that are unpleasant (one of the many things that makes him an exceptional engineer). In any case, it led me to revisit some of his work (Nietzsche’s, not Curt’s. Curt’s work is the opposite of a slog). My very stale memory of it from college was thinking, “Wow, this dude was on all the drugs.” Which he was, if only because he was perpetually suffering from chronic illness.

My mind started wandering down some showerthoughts-esque rabbit holes and it led me to a very strange place: If the AI agents are Dracula, then our role as security practitioners is that of the Belmont clan.

If you’ve never played a Castlevania game, you’re probably deeply confused, but stick with me. I want to use this metaphor to help you understand what an AI Agent actually is, as well as talk through a very real security gap.

Castlevania’s Dracula is a very interesting portrayal. He is an immortal hedonist, endlessly locked in conflict with the Belmont Clan, who have fought (and repeatedly defeated) him for centuries. He’s the embodiment of Nietzschean morality, where he views the desperately imperfect resistance of the Belmonts as a form of hypocrisy. Unlike man, he feels no need to keep secrets about his intentions. He is the Übermensch; he creates his own values beyond traditional good and evil. The way he demonstrates those values, of course, amounts to him just doing whatever he wants without inhibition: killing people indiscriminately, kidnapping damsels, whatever. Typical vampire.

AI Agents are very much like this. They simply act, directed by a series of prompts, injected context, and some sort of managed state. Agents are directed to a result by the outputs of a set of transformers (text-generators) that are passed through a finely tuned reward model, which has them doggedly pursue those goals with whatever tools they have available to them. Aside from the reward model, they really have no inhibitions (and it’s arguable that a reward model is closer to an alternative, programmatic hedonism than it is actual morality). Unlike Dracula however, they are ephemeral. Once their context is cleared, they effectively cease to be. However, that doesn’t mean that they can’t cause a lot of damage if left unchecked.

The Belmont clan, on the other hand, are deeply flawed but driven protagonists. They are forced to reckon with their failings and inadequacies, and must find a way to thwart Dracula’s plans every time despite their constraints and limited weaponry (mostly whips). They know they cannot win the war, given their foe’s immortality, so they settle for a perpetual stalemate instead.

This is the security reality of dealing with agentic workloads. We cannot win the war, so we must instead win every battle forever.

Know your enemy

Maybe that’s an unnecessarily adversarial header for this section, given that LLMs are designed to be perpetually helpful, but like I said, they are simply acting on the highest-scoring statistically significant output that is generated. If that says recreate a table in the production database to fix the schema or (as my manager experienced) delete all the source code in your program, that’s what they’re gonna do.

The fundamental anatomy of an Agent is very simple: it’s just a loop.

To clarify, that’s “loop” in the machine sense: a programming instruction that tells a computer to repeatedly do something until a specific condition is met. In the case of an Agent, it’s “keep sending requests to a Large Language Model (LLM) and running the output until your task is complete or you need additional user input.”

I cannot stress enough that it is actually that simple. It’s just a loop that makes API calls and runs the output.
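To make that concrete, here is a minimal sketch of the loop. Every name here (`call_llm`, `run_tool`, the message shapes) is illustrative, not taken from any real framework; a real agent would replace the stub with a provider API call.

```python
def call_llm(messages):
    # Hypothetical stub: a real implementation would send `messages` to an
    # LLM API and get back either a tool request or a final answer.
    return {"type": "final", "content": "done"}

def run_tool(name, args):
    # Dispatch to whatever tools the agent is allowed to use.
    return f"ran {name} with {args}"

def agent_loop(task, max_iterations=10):
    messages = [{"role": "user", "content": task}]
    for _ in range(max_iterations):          # hard cap: never loop forever
        reply = call_llm(messages)
        if reply["type"] == "final":         # task complete (or needs user input)
            return reply["content"]
        # Otherwise the model asked for a tool; run it and feed the output back.
        result = run_tool(reply["tool"], reply.get("args", {}))
        messages.append({"role": "tool", "content": result})
    return "iteration limit reached"
```

Everything else the industry has layered on top is elaboration of this loop.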

Over the last 18 months or so, the industry has begun adding a lot of structure around this core concept with two goals in mind: increasing response quality and/or reducing token usage.

An agentic workload now usually involves a “planning” phase where the model breaks the user’s prompt down into a Directed Acyclic Graph (DAG) of sub-tasks before it starts executing tools.
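A sketch of what such a plan can look like once the model has produced it, using Python's standard-library topological sorter. The sub-task names and dependencies are invented for illustration:

```python
from graphlib import TopologicalSorter

# Hypothetical plan for "report last week's signups": each sub-task
# maps to the set of sub-tasks it depends on.
plan = {
    "write_summary":  {"run_query"},
    "run_query":      {"generate_sql"},
    "generate_sql":   {"inspect_schema"},
    "inspect_schema": set(),
}

# Execute sub-tasks in dependency order; a real agent would invoke a
# tool (or a worker agent) at each step.
order = list(TopologicalSorter(plan).static_order())
```

The DAG structure is what lets the agent execute independent sub-tasks in parallel and retry a failed node without redoing the whole plan.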

The ReAct Pattern (Reasoning + Acting)

Agents are no longer just blindly executing tools, they’re processing the tool output, evaluating if it met the acceptance criteria of the subtask, and then self-correcting as needed. Think of it as an autonomous debugging loop.
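One ReAct-style iteration can be sketched as a small retry loop — act, observe, evaluate against the sub-task's acceptance criteria, self-correct. All names below are illustrative, not from any particular framework:

```python
def react_step(run_tool, evaluate, action, max_retries=3):
    for _ in range(max_retries):
        observation = run_tool(action)             # Act
        ok, feedback = evaluate(observation)       # Reason about the result
        if ok:
            return observation
        action = {**action, "feedback": feedback}  # Self-correct and retry
    raise RuntimeError("sub-task failed acceptance criteria")

# Toy tool that fails once, then succeeds after receiving feedback.
calls = []
def flaky_tool(action):
    calls.append(action)
    return "ok" if "feedback" in action else "error: bad column name"

result = react_step(flaky_tool,
                    lambda obs: (obs == "ok", "fix the column name"),
                    {"tool": "run_query"})
```

The evaluation step is the whole point: without it, the agent is just blindly executing whatever the model emits.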

Stateful Memory vs Context Limits

Counterintuitively, the more context that an agent has, the worse the response quality becomes, since it becomes more difficult for the LLM to parse the signal from the noise. Note, this is not a problem that can be solved by simply increasing the size of a context window; that actually can make it worse. The larger the context, the worse the dilution of key instructions or context becomes, leading the model’s attention mechanism to spread its “focus” across more tokens. To combat this problem, Agents are now relying more heavily on some form of external state management (often called Memory), which is a continuously curated context that can be injected into the generation process as needed.
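A toy sketch of the idea, with naive keyword retrieval standing in for what production systems typically do with embeddings or curated summaries (the class and its methods are invented for illustration):

```python
class Memory:
    """External state store: notes live outside the context window."""
    def __init__(self):
        self.notes = []

    def remember(self, topic, note):
        self.notes.append((topic, note))

    def recall(self, topic, limit=3):
        # Naive substring match; real systems use embeddings or summaries.
        hits = [note for t, note in self.notes if topic in t]
        return hits[:limit]

memory = Memory()
memory.remember("schema.users", "users table has no `deleted_at` column")
context = memory.recall("schema")
# Only `context` is injected into the next LLM call, not the full history.
```

The payoff is that each generation sees a small, relevant slice of state instead of an ever-growing transcript.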

Multi-Agent Orchestration

Agentic workflows are driven by “supervising” agents and operated by “worker” agents, the latter of which have specialized instructions that are optimized for specific tasks (e.g. a “SQL Generation Agent” or “Request App Agent”). The tradeoff here is the complexity of state management: how do you store and route the necessary context between agents without it leaking into other workers, potentially leading to bad/noisy output?
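The state-routing tradeoff can be sketched as a supervisor that hands each worker only the slice of state it needs (worker names and state keys are invented for illustration):

```python
# Specialized workers; in practice each would wrap its own LLM prompt.
workers = {
    "sql":  lambda ctx: f"SELECT * FROM {ctx['table']} LIMIT 10",
    "http": lambda ctx: f"GET {ctx['endpoint']}",
}

def supervise(subtasks, state):
    results = {}
    for worker_key, needed_keys in subtasks:
        scoped = {k: state[k] for k in needed_keys}  # worker-specific slice
        results[worker_key] = workers[worker_key](scoped)
    return results

state = {"table": "signups", "endpoint": "/api/v1/report", "api_key": "secret"}
out = supervise([("sql", ["table"]), ("http", ["endpoint"])], state)
# Neither worker ever saw `api_key` — that's the leak-prevention the
# orchestrator has to enforce.
```

Get the scoping wrong in either direction and you get leaked secrets or noisy, under-informed workers.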

Underpinning all of these is an attempt to mitigate the largest problem that all agents have: Non-determinism. In other words, given a set of inputs (prompt, context, data), the output will differ due to some inherent randomness. By constraining the tasks and the context, we can improve response quality a lot, but there are two issues that must be solved by some kind of deterministic system, ideally built right into the tool calls:

  • Agent Hallucination of tool inputs - You can give an agent a query_my_api tool, and the LLM might decide to pass a parameter that doesn’t exist, or format the JSON incorrectly. The wrapper process has to be built to catch those errors and tell the LLM, “You messed up the format of the input object, try again.” Apart from failing fast and giving good feedback, it also means that smaller, cheaper models have a better chance to get something right without taxing the compute of the system they’re interacting with.
  • Infinite loops AKA getting stuck in a local minimum - If an agent gets stuck in a logic rut (e.g., repeatedly querying a table that doesn’t exist or trying to find a missing executable that’s not in its system path), the process will just burn tokens until it hits a hard-coded iteration limit.
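Both problems above need the same deterministic wrapper: validate the model's tool input before executing anything, feed errors back as text, and cap iterations. Here is a hand-rolled sketch (schema shape and names are invented; real systems typically use JSON Schema validators):

```python
def validate_args(args, schema):
    """Return a list of human-readable errors for the LLM to act on."""
    errors = []
    for key, typ in schema["required"].items():
        if key not in args:
            errors.append(f"missing required parameter '{key}'")
        elif not isinstance(args[key], typ):
            errors.append(f"'{key}' must be {typ.__name__}")
    for key in args:
        if key not in schema["required"] and key not in schema.get("optional", {}):
            errors.append(f"unknown parameter '{key}'")  # hallucinated input
    return errors

QUERY_SCHEMA = {"required": {"table": str, "limit": int}}

# A hallucinated call: wrong type on `limit`, plus an invented parameter.
errors = validate_args({"table": "users", "limit": "10", "forcee": True},
                       QUERY_SCHEMA)
# Instead of executing, the wrapper returns the errors to the LLM:
feedback = "You messed up the format of the input object: " + "; ".join(errors)
```

Failing fast with specific feedback is what lets smaller, cheaper models converge instead of flailing.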

The real problem: industry fragmentation

The biggest challenge we face is not technical. That may seem funny later, since this section may be the most technical part of the entire blog, but the examples given are meant to highlight the real problem: the industry is moving too fast, and we are not yet at a place where there are globally adopted standards. In other words, there is no TCP/IP, or HTTPS, or (for a less technical metaphor) ACH wire transfer protocol for agents. How an agent is built depends entirely on the framework. LangGraph, CrewAI, AutoGen, and Mastra are all different and, crucially, incompatible.

What was considered a state-of-the-art agent architecture six months ago is already legacy. We went from basic tool calling, to complex ReAct loops, to multi-agent frameworks, to entirely new model capabilities (like native tool-calling APIs) in less than 18 months. Even though model reasoning capabilities got a lot better, the hype is outpacing our ability to actually build anything with them due to the lack of standardization. Let me give you some examples (another warning: this will get quite technical).

LLM APIs are wildly inconsistent

OpenAI, Google, and Anthropic handle tool-calling schemas slightly differently.

  • OpenAI expects a tools array where each item has a type: "function" and a nested function object containing the name, description, and the JSON Schema under a parameters key.
  • Anthropic (Claude) skips the type wrapper entirely. They just want an array of objects with name, description, and the JSON Schema under an input_schema key.
  • Google (Gemini) uses a tools array containing functionDeclarations, which holds the name, description, and parameters.
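Laid side by side, the divergence is obvious. Below is the same hypothetical `get_weather` tool expressed in the three payload shapes described above (field names follow each provider's published format; the tool itself is invented):

```python
json_schema = {
    "type": "object",
    "properties": {"city": {"type": "string"}},
    "required": ["city"],
}

# OpenAI: type wrapper + nested `function` object, schema under `parameters`.
openai_tools = [{
    "type": "function",
    "function": {"name": "get_weather",
                 "description": "Look up current weather",
                 "parameters": json_schema},
}]

# Anthropic: flat object, schema under `input_schema`.
anthropic_tools = [{
    "name": "get_weather",
    "description": "Look up current weather",
    "input_schema": json_schema,
}]

# Google: `functionDeclarations` array, schema under `parameters`.
gemini_tools = [{
    "functionDeclarations": [{"name": "get_weather",
                              "description": "Look up current weather",
                              "parameters": json_schema}],
}]
```

Three payloads, one tool — and every model-agnostic agent has to maintain a translation layer between them.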

There are also significant differences in how schema adherence is enforced (per-tool strict flags vs request-level mode enforcement) and how responses are received (tool_calls array vs tool_use context block vs functionCall). This doesn’t even get into their behavioral differences with regards to how prompts must be tweaked to get optimal performance out of each model. As you can imagine, building an agent that is model agnostic is a massive headache as a result. Here’s a quote from a blog about Cursor’s development of their coding agent:

Cursor tunes its harness specifically for every frontier model based on internal evals. Different models get different tool names, prompt instructions, and behavioral guidance. OpenAI Codex models get shell-oriented tool names like rg; Claude models get different reasoning summary formats.

That is not an easy world to live in. It echoes the struggles middleware vendors have faced for years, but somehow even worse, since LLMs are not contractual APIs.

NOTE: MCP did not solve these problems. It’s a client-server protocol for connecting data and tools, not a universal LLM API standard. The practical implication of this is that even if an agent knows how to discover and use tools, it still has to talk to the LLM, which means that the client software (e.g. Cursor, Copilot, Amp, etc) has to dynamically translate that tool’s definition into the proprietary OpenAI, Anthropic, or Gemini JSON payloads we discussed above.

We can’t distinguish model regressions from bugs

To me, the most painful place where the industry remains fragmented is in Observability, AKA “how can we debug a workflow where, by design, it may be impossible to reproduce bugs?” LLM Observability is a completely different animal than tracing through a classical (deterministic) system, like a traditional piece of software.

In those systems, even the hairiest of bugs can, in theory, be reproduced because bits are getting reliably flipped somewhere, even if that reproduction involves yelling at a server. In order to understand the quality of agent output, you not only have to capture the behavior of your LLM system (i.e. your agent) but also the prompt, context, model, etc. Even if you’re able to recreate the stateful conditions perfectly, it could just be that your LLM was having a bad day because the model developers introduced a regression. Again, the version numbers next to a model (e.g. Opus 4.6, Codex 5.4) have nothing to do with a stable, contractual API; they’re just made-up numbers to jockey for market position.

So not only is it difficult to build an Agent by following a (currently nonexistent) industry standard and building against (also nonexistent) industry standard API semantics, you can’t even reproduce agent behavior reliably enough to determine if something is a fixable bug or not.

On this episode of Brian Krebs’ Security Nightmares

All of the above is to say that you cannot trust the agent. The LLM will not govern itself, and you cannot rely on the fragmented framework layer to enforce much of anything at the moment. Sounds pretty bad, right? Ripe for a disastrous data breach?

The problem is that nobody in the industry seems to care at the moment. Here’s a quote from the findings of the recent Thoughtworks retreat, The future of software engineering, from a section titled Security is Dangerously Behind:

The retreat noted with concern that the security session had low attendance, reflecting a broader industry pattern. Security is treated as something to solve later, after the technology works and is reliable. With agents, this sequencing is dangerous. The most vivid example: granting an agent email access enables password resets and account takeovers. Full machine access for development tools means full machine access for anything the agent decides to do. The retreat’s recommendation was direct. Platform engineering should drive secure defaults by making safe behavior easy and unsafe behavior hard. Organizations should not rely on individual developers making security-conscious choices when configuring agent access. Three priorities emerged: security by design as a non-negotiable baseline, cross-industry coalitions for interoperable agent security standards and AI-enabled defense mechanisms that can match the speed and sophistication of AI-enabled attacks.

There is a deep, agonizing irony in a report declaring security a “non-negotiable baseline” immediately after admitting no one bothered to show up to the security session. Let’s examine each of these “priorities” in detail.

Security by design as a non-negotiable baseline

This platitude is so old it makes Aesop feel contemporary. Unlike Aesop, however, there is no substance to this sentiment. When push comes to shove, the “non-negotiable” baselines are almost always the first thing negotiated away. Also, what design? The only clarifying statement around this is “Organizations should not rely on individual developers… Platform engineering should drive secure defaults.” Wow, what a novel idea! Make security an infrastructure problem! Defaults! Fail closed! My kidneys are doing backflips of joy as I ascend to a higher plane of existence having been touched by this Solomonic wisdom.

With all due respect, there was no point in even putting this out there except to adhere to the rule of threes. It’s meaningless.

Cross-industry coalitions for interoperable agent security standards

Things are moving in the right direction for this, but like all open standards, they will move only as fast as there are people itching to adopt them. Coalitions move at the speed of bureaucracy; AI models move at the speed of NVIDIA exiting the consumer GPU market. By the time a consortium agrees on an “interoperable security standard” for agent tool execution, the industry will have already moved on to a completely different architecture.

Let me give you an alternative for this one that combines it with the one above it: We must design systems that assume the agent’s payload is inherently untrustworthy and non-standard. You cannot trust the agent’s internal logic; you verify the action it is trying to take against the data layer, regardless of which framework or model generated the API call. In other words, you govern the ball, not the moving goalposts.

AI-enabled defense mechanisms to match AI-enabled attacks

Not only is this pure science fiction at this point, but injecting non-determinism into your defensive layer is terrifying and incredibly stupid. If you use an LLM to evaluate whether another LLM is doing something malicious, you now have two hallucination risks instead of one. You also risk a prompt-injection attack making it all the way to your security layer.

We don’t need LLMs in our firewalls, we need the thing that we’ve been building and running in production as an industry for more than a decade now: sophisticated anomaly-detection models and automated circuit breakers.

Agents execute at machine speed. If an agent goes rogue (or is hijacked via a prompt injection) and tries to enumerate valid reset tokens by observing timing differences in API responses or rapidly exfiltrate an entire users table by paginating through SELECT queries, a “security guard agent” that is asynchronously (and very expensively) evaluating agent behavior will not catch it in time. “AI defense” in practice should mean deploying ML models that monitor the behavioral exhaust of agentic workloads (query volume, token burn rate, iteration depth, unusual table access patterns). If the agent deviates from its bounded, purpose-based scope (i.e. its computed risk score exceeds the risk tolerance threshold), the system should automatically sever its JIT access the millisecond the anomaly is detected.
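A deterministic circuit breaker over that behavioral exhaust can be sketched in a few lines. The metric names and thresholds below are purely illustrative — a real deployment would score with a trained anomaly-detection model, not hand-picked weights:

```python
class CircuitBreaker:
    def __init__(self, max_risk=0.8):
        self.max_risk = max_risk
        self.tripped = False

    def risk_score(self, metrics):
        # Toy scoring: weight a few exhaust signals into [0, 1].
        score = 0.0
        if metrics["queries_per_min"] > 100:   # bulk exfiltration pattern
            score += 0.5
        if metrics["new_tables_touched"] > 3:  # unusual table access
            score += 0.3
        if metrics["iteration_depth"] > 20:    # stuck loop burning tokens
            score += 0.2
        return min(score, 1.0)

    def check(self, metrics, revoke_access):
        if self.risk_score(metrics) >= self.max_risk:
            self.tripped = True
            revoke_access()  # sever the JIT credentials immediately
        return self.tripped

revoked = []
breaker = CircuitBreaker()
spike = {"queries_per_min": 500, "new_tables_touched": 5, "iteration_depth": 2}
tripped = breaker.check(spike, lambda: revoked.append("access severed"))
```

The key property is that the check runs synchronously in the request path, at machine speed, with no LLM in the loop.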

LLMs are by design vulnerable to prompt injection attacks. Hence: we must step into the shoes of the Belmonts. The security flaw is immortal. We cannot win the war, so we must win every battle forever.

What do we do?

I know this blog post may be outdated in six months, but things are moving so fast I felt the need to capture this snapshot of history, if only for my own sake.

Right now, we’re in the “Browser Wars” period where Netscape (Anthropic) is battling Internet Explorer (OpenAI) and Java Applets (Gemini - not intended as a slight), and the equivalent of the Document Object Model (DOM) we know today has not been standardized yet. We’re getting there, however, as the W3C is starting to publish the output from its working groups, and research is being published regularly around agentic threat models. Standards are coming, but as I alluded, we’d be foolish to think that they will arrive in time to prevent the first wave of agentic attacks.

The good news is that we have some time to get our ducks in a row before this all really spins up. Namely, until we have a set of structurally extensible, adoptable standards, the cost of implementing Agentic workloads is (candidly) very high for most companies unless it’s part of their core product offering (e.g. BI tools, log analysis vendors, etc). Those that do implement them will struggle in the same way that companies trying to maintain a web application during the browser wars struggled.

Given the excitement around the possibilities, however, it seems almost certain that companies will purchase new-age BI tools (or mature BI tools with agentic features) with exploitable vulnerabilities in the flavor-of-the-month framework they used. Non-technical folks being able to ask natural language questions of their company data is too potentially useful an idea to pass up, even given this risk, since it removes one of the excuses people have for poor business velocity. lol.

My argument is that what seem to be flawed, lesser tools are actually our best weapon. The tools we already know and are familiar with - anomaly-detection models, circuit breakers, data security controls, IAM role vending with short-lived credentials, etc - are what we actually need to defeat Dracula. My fear is that, because whips aren’t cool anymore, nobody will care enough to use them.


Comments

  • By vadelfe, 2026-03-10 17:53 (4 replies)

    The Belmont analogy is great, but the deeper point is even scarier: most of the industry is giving non-deterministic systems direct access to deterministic infrastructure (databases, shells, email, etc).

    Historically we spent decades reducing automation privileges and adding layers of verification. Agents seem to be reversing that trend almost overnight.

    • By add-sub-mul-div, 2026-03-10 22:36

      If agents were what had come first we'd build statues of whoever invented deterministic software engineering.

    • By cyanydeez, 2026-03-11 1:24 (1 reply)

      Good thing we are also loosening government and financial restraints. We're full speed into the Grift Age

      • By masklinn, 2026-03-11 6:16 (1 reply)

        Considering the most powerful nation on earth elevated a known grifter to the highest office, twice, we’ve been sailing those waters a while.

        • By roxolotl, 2026-03-11 6:51

          Grifting is about as American as apple pie honestly. Melville is of course known for Moby Dick, where he delves into the psyche of the Great American Man, but he also wrote The Confidence-Man. Mark Twain’s work is full of con men and grifters. Ponzi laid the groundwork for more complex schemes in the 20s. Pyramid schemes were all the rage in the 40s/50s, Tupperware parties as an example, and of course still are huge today.

          It seems like whenever American society is changing very rapidly or has changed very rapidly con men become the powerful ones of the time. Maybe this is true everywhere but as an American I don’t know the history of cons in other countries.

    • By thebotclub, 2026-03-11 0:01

      [dead]

    • By observationist, 2026-03-10 22:37 (2 replies)

      Maybe the best outcome from all of this will be the total destruction of security theater, at least in its current form, as all the box checking and "best practices" get blown to smithereens by people just doing things.

      • By esseph, 2026-03-11 16:25

        Insurance Companies won't let that happen.

  • By jeffwask, 2026-03-10 15:13 (2 replies)

    As long as the penalties for data breach are a slap on the wrist and buying everyone one year of credit monitoring, no one will.

    • By fatnoah, 2026-03-10 17:19 (4 replies)

      > As long as the penalties for data breach are a slap on the wrist and buying everyone one year of credit monitoring, no one will.

      And, of course, that one year is totally useless when one is subject to multiple breaches per year. Throw in the fact that so many breaches aren't even with a company that affected individuals have a direct relationship with, and it becomes virtually impossible to fix this.

      At this point, I'd be in favor of making any company that handles personal data pay in advance for the monitoring, and get refunded when they prove that that OR THEIR PROVIDERS haven't had a data breach.

      • By thewebguyd, 2026-03-10 20:24 (2 replies)

        > I'd be in favor of making any company that handles personal data pay in advance

        How about we start with some strict data privacy and handling laws? Make it so you straight up just can't collect & store personal information without proving that it's required and without it your business would not work (and no, data harvesting for advertising/marketing doesn't count).

        Security is the problem, but it would be less of a problem if everyone wasn't trying to hoard as much data as possible from their customers for seemingly no reason at all. Take a scroll through the Play Store/App Store and look how many really simple apps request permissions for camera, microphone, location, local network, etc. for something like a metronome app that needs none of that.

        • By d4mi3n, 2026-03-10 21:22 (1 reply)

          There is a reason for hoarding data: it’s an asset on the balance sheet. So long as it is legal to liquidate data for cash, there will be incentives to collect and keep it.

          • By ygjb, 2026-03-10 21:44 (1 reply)

            That is the point. Make it illegal, and not something that can be handwaved away by an EULA or TOS.

            • By reverius42, 2026-03-10 22:55

              Or at least make it a liability on the balance sheet rather than an asset. Sure, you can store as much user data as you want. Oh, what's that, if it leaks you owe each user $10,000 under the new law?

        • By rkagerer, 2026-03-11 2:31 (1 reply)

          What about making them put up a hefty bond proportional to the sensitivity and scale of the data collected, which is forfeit to any potentially affected users in the event of a breach.

          • By layla5alive, 2026-03-13 4:23

            How about pay the user whose data has been collected. It's their data. If we are the product, we should get paid for being used! And we should get paid a whole lot more (multiples) for the exposure of a leak.

      • By bdcravens, 2026-03-10 19:00 (1 reply)

        The real riches are in starting a credit monitoring company. Vibe coded, of course, and if you have a data breach, then it's a perpetual motion machine.

        • By Avicebron, 2026-03-10 19:49

          The fact that the average joe can't start their own credit monitoring company as competition and the incumbents get away clean every time they screw up says a lot about "capitalism" as we practice it

      • By everdrive, 2026-03-10 20:40

        I froze all my credit way back in 2016 or so and have never regretted it, not once. I wonder how effective it is, as my credit limit keeps going up.

      • By rkagerer, 2026-03-11 2:29

        Monitoring is a joke. We need legislation with real teeth. Companies which don't protect the user data they've been entrusted with should go bankrupt, to make way for those who actually care.

    • By idealloc_haris, 2026-03-10 15:21 (3 replies)

      I think that's definitely true to a degree, but I think the thing more companies are worried about is the reputational damage from the terrible press. Look at Solarwinds (not a data breach, but similar press around it). It erased hundreds of millions in shareholder value and the company was taken private at pennies on the dollar in the aftermath. There's real risk there.

      • By autoexec, 2026-03-10 23:20 (1 reply)

        > I think the thing more companies are worried about is the reputational damage from the terrible press.

        I don't think companies care all that much about reputational damage from the terrible press. Some of the most profitable wealthy corporations on the planet are also the most hated. We have profitable corporations that have committed serial killings, infanticide, and mass poisonings. There's press about companies whose products and profits come from the use of literal child slaves. There is "terrible press" out there right now explaining how you are currently being hurt by companies who put profit over human life, but they aren't going out of business because of it.

        Do you know how many companies have had bad press about data breaches and security issues, but are still around and making money? I'm pretty sure it's all of them. Including Solarwinds.

        Companies don't care if you like them or not. They care only about money. Until the cost of not securing people's data is likely to be higher than what they'll save ignoring security risks corporations aren't going to bother to give us anything but security theater, promises, and the occasional check for $10 and a year of "identify protection services" after another pointless class action lawsuit.

        • By Terr_, 2026-03-11 16:25

          > Companies don't care if you like them or not. They care only about money.

          To put a slightly finer point on it, many only care about whether investors think their stock price will go up, either by acquiring money despite being hated or else because other investors [0] are going to invest.

          [0] https://en.wikipedia.org/wiki/Greater_fool_theory

      • By kjs3, 2026-03-10 18:40 (2 replies)

        If only.

        For every Solarwinds, there are hundreds of breaches that never get more than a cursory reporting (if that). And Solarwinds is still in business (and some would call "taken private at pennies on the dollar" a feature, not a bug, but I digress), as are vastly more consequential examples (Equifax, anyone?).

        Yes...reputational damage is a thing, but in my experience (sitting in the decision making meetings, as a participant, many, many times in my career) it's a second-tier player at the end of the day. This is especially true of data breaches...I cannot count the number of times (in the last decade particularly) where the decision point was "What reputation damage? Everyone and their mother has had a data breach. No one cares.". I don't think they're wrong.

        This, like many issues of security and risk, is the consequence of the vast majority of the customers not caring. How many users dropped Facebook in 2019, or LinkedIn in 2021 (or 2012)? How many swore off Ticketmaster? Marriott? Adobe? eBay? And that's just ungodly massive breaches. So why would the average business give a steaming crap?

        In my dark little heart of hearts I sometimes think "what would it take for the average person to actually care", and then I realize what that looks like, and I don't sleep well for a couple of nights. Cheers!

        • By jeffwask, 2026-03-10 20:28

          Solarwinds YOY Revenue is up $100 million since then so even Solarwinds didn't take that big of a hit.

        • By twunde, 2026-03-10 21:48 (1 reply)

          For people to care, it would have to be like healthcare. The Change Healthcare breach cost 2B+ and led to a huge loss in market share. Or like AMCA, which went bankrupt after the breach (Labcorp's billing company). If you're a health tech company you can no longer insure your way out of the problem once you reach a certain size.

          The reality is that we need data breaches to be painful but maybe not company-ending events unless it really is sensitive data. As patio11 likes to say, the right level of fraud is not zero. There's a middle ground where we can increase company liability or reduce the damage caused by a breach.

          • By kjs3, 2026-03-11 1:22

            Optum360, still in business. HCA Healthcare, still in business. Excellus Healthcare, still in business after paying something like 50 cents per breached user. AMCA went out of business because their biggest customers said "damage control dictates we cut ties with you so we don't look complacent" (that is, like I said, the customers have to care to make a difference). And did anyone stop going to Labcorp (after their own data breach, not AMCA's) or get a different doctor because the healthcare group they're part of got breached? Not likely. I don't think healthcare is ahead of the game here.

            But yes, until it becomes actually painful to companies and the people who run them, it won't get better. If a corp death penalty is off the table (I don't think it should be), I guess would be either/both proportionate fines (fines equaling a couple of hours of revenue don't cut it) or making some of the leadership personally accountable, a la SOX fines, asset forfeiture and criminal responsibility for responsible C-level execs. Hate on SOX all you want, it sure made finance executives care about what is going on in their organization.

      • By dpoloncsak, 2026-03-10 17:11

        I think it's better to compare data breaches to data breaches, like when Adobe got breached. Or Oracle. Or Rockstar.

        Nothing happened in the grand-scheme of things. Even after Oracle lied and pulled some shady tactics to downplay what happened.

        A few years ago Crowdstrike took down the entire set of corporate computers and everyone still uses Falcon. There is simply no accountability anymore

  • By bandrami, 2026-03-11 2:06

    It's weird that even just a couple of years ago the absolute consensus in the industry was to work for repeatability and secure chain of custody, both of which are basically impossible with an agentic workflow. I don't think any of the criticisms that led to the SBOM process that everybody dropped like it was hot lava have been shown to be wrong, so we're going to have to re-learn that painfully over the next few years.

HackerNews