
Minions are Stripe’s homegrown coding agents, responsible for more than a thousand pull requests merged each week. Though humans review the code, minions write it from start to finish. Learn how they…
How is this already #1 on the front page with 12 upvotes and 9 comments…
The article doesn’t reveal much. It feels like a fluff piece, and I can’t see what sharing “we use AI agents” offers the dev community when there are little to no examples. For a “dev” micro blog, this feels very lackluster. Maybe the Minion could have helped with the technical docs?
EDIT: slightly adjusts tinfoil hat minutes later it’s at #6
It has all the trappings of NIH syndrome.
Reinventing the wheel without explaining why existing tools didn't work
Creating buzzwords ("blueprints", "devboxes") for concepts that are not novel and already have common terms
Yet they embrace MCP of all things as a transport layer: the one part of the common "agentic" stack that genuinely sucks and needs to be reinvented
They mention "Why did we build it ourselves" in the part1 series: https://stripe.dev/blog/minions-stripes-one-shot-end-to-end-...
However, it is also light on material. I would also like to hear more technical details; they're probably intentionally secretive about it.
I do understand, however, that building an agent that is highly optimized for your own codebase/process is possible. In fact, I'm pretty sure many companies do this; it's just not yet in the ether.
Otherwise, one of the most interesting bits from the article was
> Over 1,300 Stripe pull requests (up from 1,000 as of Part 1) merged each week are completely minion-produced, human-reviewed, but containing no human-written code.
"human reviewed"
"LGTM..."
I feel like code review is already hard and underdone; the 'velocity' here is only going to make that worse.
I am also curious how this works when the new crop of junior devs doesn't yet have enough experience to review code, but is no longer getting that experience from writing it.
Time will tell I guess.
Agents can already do the review by themselves. I'd be surprised if they review all of the code by hand. They probably can't mention it due to the regulatory nature of the field itself. But from what I have seen, agentic review tools are already between the 80th and 90th percentile. Out of 10 randomly picked engineers, they will provide more useful comments than most of them.
the problem with LLM code review is that it's good at checking local consistency and minor bugs, but it generally can't tell you if you are solving the wrong problem or if your approach is a bad one for non-technical reasons.
This is an enormous drawback and makes LLM code review more akin to a linter at the moment.
I mean, if the model can reason about making changes to a large-scale repository, then that implies it can also reason about a change somebody else made, no? I kind of agree and disagree with you at the same time, which is why I said most engineers. But I believe we are heading towards models being able to completely autonomously write and review their own changes.
There's a good chance that in the long run LLMs can become good at this, but this would require them e.g. being plugged into the meetings and so on that led to a particular feature request. To be a good software engineer, you need all the inputs that software engineers get.
If you read the Stripe blog thoroughly, you will see that they already feed their model this or similar information. Being plugged into the meetings might just mean feeding the model the meeting minutes, or letting it listen to the meeting and transcribe it. Both seem possible even today.
What are the common terms for those? (I have heard "devbox" across multiple companies, and I'm not in the LLM world enough to know the other parts.)
I was an early MCP hater, but one thing I will say about it is that it's useful as a common interface for secure centralization. I can control auth and policy centrally via a MCP gateway in a way that would be much harder if I had to stitch together API proxies, CLIs, etc to provide capabilities.
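To make the centralization point concrete, here is a minimal sketch of the idea: one gateway that enforces auth and per-caller tool policy before forwarding any tool call. All names here (`Gateway`, `Policy`, the token and tool names) are illustrative assumptions, not the real MCP SDK or any actual gateway product.

```python
# Illustrative sketch: a single choke point for auth + policy over tool calls,
# the property the comment above attributes to an MCP gateway.
from dataclasses import dataclass, field


@dataclass
class Policy:
    # Which tools this caller's token may invoke.
    allowed_tools: set[str] = field(default_factory=set)


class Gateway:
    def __init__(self):
        self.policies = {}  # token -> Policy
        self.tools = {}     # tool name -> handler

    def register_tool(self, name, handler):
        self.tools[name] = handler

    def grant(self, token, allowed_tools):
        # Central policy decision lives here, not in each downstream service.
        self.policies[token] = Policy(set(allowed_tools))

    def call(self, token, tool, *args):
        policy = self.policies.get(token)
        if policy is None:
            raise PermissionError("unknown token")
        if tool not in policy.allowed_tools:
            raise PermissionError(f"token not allowed to call {tool!r}")
        return self.tools[tool](*args)


gw = Gateway()
gw.register_tool("search_tickets", lambda q: [f"ticket matching {q!r}"])
gw.grant("agent-token", ["search_tickets"])

print(gw.call("agent-token", "search_tickets", "refund bug"))
# prints: ["ticket matching 'refund bug'"]
```

The point of the sketch is the shape, not the code: every capability flows through one `call` path, so auth and policy changes happen in one place instead of being stitched across API proxies and CLIs.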
>Reinventing the wheel without explaining why existing tools didn't work
Won't that be the new normal with all these AI agents?
No frameworks, no libraries, just let AI create everything from scratch again
resume driven development
Well, it's very important: now you know the financial code is handled by a bunch of barely supervised AI tools, and you can decide whether or not to use the product based on that.
Stripe was launched through Y Combinator. It makes sense for their stuff to quickly bubble to the top of their news aggregator.
Likely they have whitelisted domain names that go straight to the home page. It would make sense to include all Y Combinator alumni and new startup sites.
Marketing is a major goal of HN, after all.
Or the simpler explanation (which is probably closer to the truth): Stripe is a very popular company on HN, as many people use them; their founders sometimes comment here, and when they share their opinion on something, people pay attention and upvote it.
Or the even simpler explanation: whenever Stripe posts a blog post, they have nine or ten employees waiting to upvote it the moment it goes live.
That doesn't explain how you magically get to the front page with fewer than 20 upvotes.
You only need about 4 upvotes in the first 20 minutes or so to get on the front page. It's the same for every story.
your absolut lee r8
Stripe has invested a lot in dev experience over the years precisely because of how "unique" some of its technology choices were: Mongo, and originally plain Ruby, for a system that mainly deals with money? Without a massive test suite, letting a normal dev make changes without a lot of rails is asking for a sea of incidents. If I recall correctly, the parallelization needed to run the unit tests used to make continuous integration cost more than the rest of the EC2 instances combined. Add the devboxes, since trying to fit a useful test environment onto a laptop became unreasonable, and they already start with a pile of guardrail tooling that other companies never even needed. A hassle for years, but now a boon, as the guardrails help the LLMs.
It'd be nice to get an old-school, Stripey blog post, the kind with a bit less fluff, mostly the data you'd have put in the footnotes of the shipped email. Something that actually talks about the difficulties, instead of non-replicable generalities. After all, if one looks at the stock price, it's not as if competitors have been all that competitive lately, and I don't think it's mainly the details of the AI that make the difference. It'd also be nice to hear what goes on beyond babysitting minions, if there's actually anything else a dev is doing nowadays. AI adoption has changed the day-to-day experience across the industry, and most managers don't seem to know which way is up. So just explaining what days look like today might even work as a recruiting initiative.
This is a devops post. They just brag about the plumbing.
The dark secret of the dark factory is high-quality human input, which takes time and focus to draft up; otherwise the human ends up multi-shotting it and reading through the transcript to tune the input.