My AI skeptic friends are all nuts

2025-06-02 21:09 · fly.io

My smartest friends have bananas arguments about LLM coding.

Image: a psychedelic landscape, by Annie Ruygt.

A heartfelt provocation about AI-assisted programming.

Tech execs are mandating LLM adoption. That’s bad strategy. But I get where they’re coming from.

Some of the smartest people I know share a bone-deep belief that AI is a fad — the next iteration of NFT mania. I’ve been reluctant to push back on them, because, well, they’re smarter than me. But their arguments are unserious, and worth confronting. Extraordinarily talented people are doing work that LLMs already do better, out of spite.

All progress on LLMs could halt today, and LLMs would remain the 2nd most important thing to happen over the course of my career.

A caveat: I’m discussing only the implications of LLMs for software development. For art, music, and writing? I got nothing. I’m inclined to believe the skeptics in those fields. I just don’t believe them about mine.

Bona fides: I’ve been shipping software since the mid-1990s. I started out in boxed, shrink-wrap C code. Survived an ill-advised Alexandrescu C++ phase. Lots of Ruby and Python tooling. Some kernel work. A whole lot of server-side C, Go, and Rust. However you define “serious developer”, I qualify. Even if only on one of your lower tiers.

First, we need to get on the same page. If you were trying and failing to use an LLM for code 6 months ago †, you’re not doing what most serious LLM-assisted coders are doing.

People coding with LLMs today use agents. Agents get to poke around your codebase on their own. They author files directly. They run tools. They compile code, run tests, and iterate on the results. They also:

  • pull in arbitrary code from the tree, or from other trees online, into their context windows,
  • run standard Unix tools to navigate the tree and extract information,
  • interact with Git,
  • run existing tooling, like linters, formatters, and model checkers, and
  • make essentially arbitrary tool calls (that you set up) through MCP.

The code in an agent that actually “does stuff” with code is not, itself, AI. This should reassure you. It’s surprisingly simple systems code, wired to ground truth about programming in the same way a Makefile is. You could write an effective coding agent in a weekend. Its strengths would have more to do with how you think about and structure builds and linting and test harnesses than with how advanced o3 or Sonnet have become.
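
To make that concrete, here's roughly the shape of the loop: a minimal sketch in Python, not any particular product's implementation. The `llm_chat` helper, the message format, and the tool table are hypothetical stand-ins for whatever tool-calling model API and harness you'd actually use.

```python
import subprocess

# Stand-in for any tool-calling LLM API (OpenAI, Anthropic, a local model).
# Takes the conversation so far; returns either a tool request or an answer.
def llm_chat(messages, tools):
    raise NotImplementedError  # hypothetical: swap in a real client here

# The agent's "powers" are ordinary systems code, wired to ground truth.
TOOLS = {
    "read_file":  lambda path: open(path).read(),
    "write_file": lambda path, text: open(path, "w").write(text),
    "run":        lambda cmd: subprocess.run(  # build, test, lint, grep, git
        cmd, shell=True, capture_output=True, text=True
    ).stdout,
}

def agent(task, max_steps=50):
    messages = [{"role": "user", "content": task}]
    for _ in range(max_steps):
        reply = llm_chat(messages, tools=list(TOOLS))
        if reply["type"] == "answer":      # the model thinks it's done
            return reply["content"]
        result = TOOLS[reply["tool"]](**reply["args"])
        # Real output (compiler errors, test failures) goes back in.
        messages.append({"role": "tool", "content": str(result)})
    raise RuntimeError("agent ran out of steps")
```

The model call is the only interesting dependency. Everything else is a dispatch table and a process runner.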

If you’re making requests on a ChatGPT page and then pasting the resulting (broken) code into your editor, you’re not doing what the AI boosters are doing. No wonder you’re talking past each other.

LLMs can write a large fraction of all the tedious code you’ll ever need to write. And most code on most projects is tedious. LLMs drastically reduce the number of things you’ll ever need to Google. They look things up themselves. Most importantly, they don’t get tired; they’re immune to inertia.

Think of anything you wanted to build but didn’t. You tried to home in on some first steps. If you’d been in the limerent phase of a new programming language, you’d have started writing. But you weren’t, so you put it off, for a day, a year, or your whole career.

I can feel my blood pressure rising thinking of all the bookkeeping and Googling and dependency drama of a new project. An LLM can be instructed to just figure all that shit out. Often, it will drop you precisely at that golden moment where shit almost works, and development means tweaking code and immediately seeing things work better. That dopamine hit is why I code.

There’s a downside. Sometimes, gnarly stuff needs doing. But you don’t wanna do it. So you refactor unit tests, soothing yourself with the lie that you’re doing real work. But an LLM can be told to go refactor all your unit tests. An agent can occupy itself for hours putzing with your tests in a VM and come back later with a PR. If you listen to me, you’ll know that. You’ll feel worse yak-shaving. You’ll end up doing… real work.

Are you a vibe coding Youtuber? Can you not read code? If so: astute point. Otherwise: what the fuck is wrong with you?

You’ve always been responsible for what you merge to main. You were five years ago. And you are tomorrow, whether or not you use an LLM.

If you build something with an LLM that people will depend on, read the code. In fact, you’ll probably do more than that. You’ll spend 5-10 minutes knocking it back into your own style. LLMs are showing signs of adapting to local idiom, but we’re not there yet.

People complain about LLM-generated code being “probabilistic”. No it isn’t. It’s code. It’s not Yacc output. It’s knowable. The LLM might be stochastic. But the LLM doesn’t matter. What matters is whether you can make sense of the result, and whether your guardrails hold.

Reading other people’s code is part of the job. If you can’t metabolize the boring, repetitive code an LLM generates: skills issue! How are you handling the chaos human developers turn out on a deadline?

For the last month or so, Gemini 2.5 has been my go-to †. Almost nothing it spits out for me merges without edits. I’m sure there’s a skill to getting a SOTA model to one-shot a feature-plus-merge! But I don’t care. I like moving the code around and chuckling to myself while I delete all the stupid comments. I have to read the code line-by-line anyways.

If hallucination matters to you, your programming language has let you down.

Agents lint. They compile and run tests. If their LLM invents a new function signature, the agent sees the error. They feed it back to the LLM, which says “oh, right, I totally made that up” and then tries again.
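
A minimal sketch of that correction loop, in Python; the `ask_llm` helper is a hypothetical one-shot completion call, and the compiler (here `go build`) does the real work:

```python
import subprocess

def ask_llm(prompt: str) -> str:
    raise NotImplementedError  # hypothetical: any code-generating LLM call

def build_errors() -> str:
    """Compile the tree; return compiler output, empty on success."""
    return subprocess.run(["go", "build", "./..."],
                          capture_output=True, text=True).stderr

def generate_until_it_compiles(task: str, max_tries: int = 5) -> bool:
    prompt = task
    for _ in range(max_tries):
        with open("generated.go", "w") as f:
            f.write(ask_llm(prompt))
        errors = build_errors()
        if not errors:
            return True  # an invented function signature can't survive this
        # "oh, right, I totally made that up"
        prompt = f"{task}\n\nYour last attempt failed to compile:\n{errors}"
    return False
```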

You’ll only notice this happening if you watch the chain of thought log your agent generates. Don’t. This is why I like Zed’s agent mode: it begs you to tab away and let it work, and pings you with a desktop notification when it’s done.

I’m sure there are still environments where hallucination matters. But “hallucination” is the first thing developers bring up when someone suggests using LLMs, despite it being (more or less) a solved problem.

Does an intern cost $20/month? Because that’s what Cursor.ai costs.

Part of being a senior developer is making less-able coders productive, be they fleshly or algebraic. Using agents well is both a skill and an engineering project all its own, of prompts, indices, and (especially) tooling. LLMs only produce shitty code if you let them.

Maybe the current confusion is about who’s doing what work. Today, LLMs do a lot of typing, Googling, test cases †, and edit-compile-test-debug cycles. But even the most Claude-poisoned serious developers in the world still own curation, judgement, guidance, and direction.

Also: let’s stop kidding ourselves about how good our human first cuts really are.

It’s hard to get a good toolchain for Brainfuck, too. Life’s tough in the aluminum siding business.

A lot of LLM skepticism probably isn’t really about LLMs. It’s projection. People say “LLMs can’t code” when what they really mean is “LLMs can’t write Rust”. Fair enough! But people select languages in part based on how well LLMs work with them, so Rust people should get on that †.

I work mostly in Go. I’m confident the designers of the Go programming language didn’t set out to produce the most LLM-legible language in the industry. They succeeded nonetheless. Go has just enough type safety, an extensive standard library, and a culture that prizes (often repetitive) idiom. LLMs kick ass generating it.

All this is to say: I write some Rust. I like it fine. If LLMs and Rust aren’t working for you, I feel you. But if that’s your whole thing, we’re not having the same argument.

Do you like fine Japanese woodworking? All hand tools and sashimono joinery? Me too. Do it on your own time.

I have a basic wood shop in my basement †. I could get a lot of satisfaction from building a table. And, if that table is a workbench or a grill table, sure, I’ll build it. But if I need, like, a table? For people to sit at? In my office? I buy a fucking table.

Professional software developers are in the business of solving practical problems for people with code. We are not, in our day jobs, artisans. Steve Jobs was wrong: we do not need to carve the unseen feet in the sculpture. Nobody cares if the logic board traces are pleasingly routed. If anything we build endures, it won’t be because the codebase was beautiful.

Besides, that’s not really what happens. If you’re taking time carefully golfing functions down into graceful, fluent, minimal functional expressions, alarm bells should ring. You’re yak-shaving. The real work has depleted your focus. You’re not building: you’re self-soothing.

Which, wait for it, is something LLMs are good for. They devour schlep, and clear a path to the important stuff, where your judgement and values really matter.

As a mid-late career coder, I’ve come to appreciate mediocrity. You should be so lucky as to have it flowing almost effortlessly from a tap.

We all write mediocre code. Mediocre code: often fine. Not all code is equally important. Some code should be mediocre. Maximum effort on a random unit test? You’re doing something wrong. Your team lead should correct you.

Developers all love to preen about code. They worry LLMs lower the “ceiling” for quality. Maybe. But they also raise the “floor”.

Gemini’s floor is higher than my own. My code looks nice. But it’s not as thorough. LLM code is repetitive. But mine includes dumb contortions where I got too clever trying to DRY things up.

And LLMs aren’t mediocre on every axis. They almost certainly have a bigger bag of algorithmic tricks than you do: radix tries, topological sorts, graph reductions, and LDPC codes. Humans romanticize rsync (Andrew Tridgell wrote a paper about it!). To an LLM it might not be that much more interesting than a SQL join.

But I’m getting ahead of myself. It doesn’t matter. If truly mediocre code is all we ever get from LLMs, that’s still huge. It’s that much less mediocre code humans have to write.

I don’t give a shit.

Smart practitioners get wound up by the AI/VC hype cycle. I can’t blame them. But it’s not an argument. Things either work or they don’t, no matter what Jensen Huang has to say about it.

So does open source. We used to pay good money for databases.

We’re a field premised on automating other people’s jobs away. “Productivity gains,” say the economists. You get what that means, right? Fewer people doing the same stuff. Talked to a travel agent lately? Or a floor broker? Or a record store clerk? Or a darkroom tech?

When this argument comes up, libertarian-leaning VCs start the chant: lamplighters, creative destruction, new kinds of work. Maybe. But I’m not hypnotized. I have no fucking clue whether we’re going to be better off after LLMs. Things could get a lot worse for us.

LLMs really might displace many software developers. That’s not a high horse we get to ride. Our jobs are just as much in tech’s line of fire as everybody else’s have been for the last 3 decades. We’re not East Coast dockworkers; we won’t stop progress on our own.

Artificial intelligence is profoundly — and probably unfairly — threatening to visual artists in ways that might be hard to appreciate if you don’t work in the arts.

We imagine artists spending their working hours pushing the limits of expression. But the median artist isn’t producing gallery pieces. They produce on brief: turning out competent illustrations and compositions for magazine covers, museum displays, motion graphics, and game assets.

LLMs easily — alarmingly — clear industry quality bars. Gallingly, one of the things they’re best at is churning out just-good-enough facsimiles of human creative work. I have family in visual arts. I can’t talk to them about LLMs. I don’t blame them. They’re probably not wrong.

Meanwhile, software developers spot code fragments seemingly lifted from public repositories on Github and lose their shit. What about the licensing? If you’re a lawyer, I defer. But if you’re a software developer playing this card? Cut me a little slack as I ask you to shove this concern up your ass. No profession has demonstrated more contempt for intellectual property.

The median dev thinks Star Wars and Daft Punk are a public commons. The great cultural project of developers has been opposing any protection that might inconvenience a monetizable media-sharing site. When they fail at policy, they route around it with coercion. They stand up global-scale piracy networks and sneer at anybody who so much as tries to preserve a new-release window for a TV show.

Call any of this out if you want to watch a TED talk about how hard it is to stream The Expanse on LibreWolf. Yeah, we get it. You don’t believe in IPR. Then shut the fuck up about IPR. Reap the whirlwind.

It’s all special pleading anyways. LLMs digest code further than you do. If you don’t believe a typeface designer can stake a moral claim on the terminals and counters of a letterform, you sure as hell can’t be possessive about a red-black tree.

When I started writing a couple days ago, I wrote a section to “level set” to the state of the art of LLM-assisted programming. A bluefish filet has a longer shelf life than an LLM take. In the time it took you to read this, everything changed.

Kids today don’t just use agents; they use asynchronous agents. They wake up, free-associate 13 different things for their LLMs to work on, make coffee, fill out a TPS report, drive to the Mars Cheese Castle, and then check their notifications. They’ve got 13 PRs to review. Three get tossed and re-prompted. Five of them get the same feedback a junior dev gets. And five get merged.

“I’m sipping rocket fuel right now,” a friend tells me. “The folks on my team who aren’t embracing AI? It’s like they’re standing still.” He’s not bullshitting me. He doesn’t work in SFBA. He’s got no reason to lie.

There’s plenty of things I can’t trust an LLM with. No LLM has any access to prod here. But I’ve been first responder on an incident and fed 4o — not o4-mini, 4o — log transcripts, and watched it in seconds spot LVM metadata corruption issues on a host we’d been complaining about for months. Am I better than an LLM agent at interrogating OpenSearch logs and Honeycomb traces? No. No, I am not.

To the consternation of many of my friends, I’m not a radical or a futurist. I’m a statist. I believe in the haphazard perseverance of complex systems, of institutions, of reversions to the mean. I write Go and Python code. I’m not a Kool-aid drinker.

But something real is happening. My smartest friends are blowing it off. Maybe I persuade you. Probably I don’t. But we need to be done making space for bad arguments.

And here I rejoin your company. I read Simon Willison, and that’s all I really need. But all day, every day, a sizable chunk of the front page of HN is allocated to LLMs: incremental model updates, startups doing things with LLMs, LLM tutorials, screeds against LLMs. It’s annoying!

But AI is also incredibly — a word I use advisedly — important. It’s getting the same kind of attention that smartphones got in 2008, and not as much as the Internet got. That seems about right.

I think this is going to get clearer over the next year. The cool kid haughtiness about “stochastic parrots” and “vibe coding” can’t survive much more contact with reality. I’m snarking about these people, but I meant what I said: they’re smarter than me. And when they get over this affectation, they’re going to make coding agents profoundly more effective than they are today.



Comments

  • I think this article is pretty spot on — it articulates something I’ve come to appreciate about LLM-assisted coding over the past few months.

    I started out very sceptical. When Claude Code landed, I got completely seduced — borderline addicted, slot machine-style — by what initially felt like a superpower. Then I actually read the code. It was shockingly bad. I swung back hard to my earlier scepticism, probably even more entrenched than before.

    Then something shifted. I started experimenting. I stopped giving it orders and began using it more like a virtual rubber duck. That made a huge difference.

    It’s still absolute rubbish if you just let it run wild, which is why I think “vibe coding” is basically just “vibe debt” — because it just doesn’t do what most (possibly uninformed) people think it does.

    But if you treat it as a collaborator — more like an idiot savant with a massive brain but no instinct or nous — or better yet, as a mech suit [0] that needs firm control — then something interesting happens.

    I’m now at a point where working with Claude Code is not just productive, it actually produces pretty good code, with the right guidance. I’ve got tests, lots of them. I’ve also developed a way of getting Claude to document intent as we go, which helps me, any future human reader, and, crucially, the model itself when revisiting old code.

    What fascinates me is how negative these comments are — how many people seem closed off to the possibility that this could be a net positive for software engineers rather than some kind of doomsday.

    Did Photoshop kill graphic artists? Did film kill theatre? Not really. Things changed, sure. Was it “better”? There’s no counterfactual, so who knows? But change was inevitable.

    What’s clear is this tech is here now, and complaining about it feels a bit like mourning the loss of punch cards when terminals showed up.

    [0]: https://matthewsinclair.com/blog/0178-why-llm-powered-progra...

    • By wpietri 2025-06-03 13:42 · 5 replies

      One of the things I think is going on here is a sort of stone soup effect. [1]

      Core to Ptacek's point is that everything has changed in the last 6 months. As you and I presume he agree, the use of off-the-shelf LLMs in code was kinda garbage. And I expect the skepticism he's knocking here ("stochastic parrots") was in fact accurate then.

      But it did get a lot of people (and money) to rush in and start trying to make something useful. Like the stone soup story, a lot of other technology has been added to the pot, and now we're moving in the direction of something solid, a proper meal. But given the excitement and investment, it'll be at least a few years before things stabilize. Only at that point can we be sure about how much the stone really added to the soup.

      Another counterfactual that we'll never know is what kinds of tooling we would have gotten if people had dumped a few billion dollars into code tool improvement without LLMs, but with, say, a lot of more conventional ML tooling. Would the tools we get be much better? Much worse? About the same but different in strengths and weaknesses? Impossible to say.

      So I'm still skeptical of the hype. After all, the hype is basically the same as 6 months ago, even though now the boosters can admit the products of 6 months ago sucked. But I can believe we're in the middle of a revolution of developer tooling. Even so, I'm content to wait. We don't know the long term effects on a code base. We don't know what these tools will look like in 6 months. I'm happy to check in again then, where I fully expect to be again told: "If you were trying and failing to use an LLM for code 6 months ago †, you’re not doing what most serious LLM-assisted coders are doing." At least until then, I'm renewing my membership in the Boring Technology Club: https://boringtechnology.club/

      [1] https://en.wikipedia.org/wiki/Stone_Soup

      • By keeda 2025-06-03 16:06 · 4 replies

        > Core to Ptacek's point is that everything has changed in the last 6 months.

        This was actually the only point in the essay with which I disagree, and it weakens the overall argument. Even 2 years ago, before agents or reasoning models, these LLMs were extremely powerful. The catch was, you needed to figure out what worked for you.

        I wrote this comment elsewhere: https://news.ycombinator.com/item?id=44164846 -- Upshot: It took me months to figure out what worked for me, but AI enabled me to produce innovative (probably cutting edge) work in domains I had little prior background in. Yes, the hype should trigger your suspicions, but if respectable people with no stake in selling AI like @tptacek or @kentonv in the other AI thread are saying similar things, you should probably take a closer look.

        • By wpietri 2025-06-03 23:51 · 3 replies

          >if respectable people with no stake in selling AI like @tptacek or @kentonv in the other AI thread are saying similar things, you should probably take a closer look.

          Maybe? Social proof doesn't mean much to me during a hype cycle. You could say the same thing about tulip bulbs or any other famous bubble. Lots of smart people with no stake get sucked in. People are extremely good at fooling themselves. There are a lot of extremely smart people following all of the world's major religions, for example, and they can't all be right. And whatever else is going on here, there are a lot of very talented people whose fortunes and futures depend on convincing everybody that something extraordinary is happening here.

          I'm glad you have found something that works for you. But I talk with a lot of people who are totally convinced they've found something that makes a huge difference, from essential oils to functional programming. Maybe it does for them. But personally, what works for me is waiting out the hype cycle until we get to the plateau of productivity. Those months that you spent figuring out what worked are months I'd rather spend on using what I've already found to work.

          • By tptacek 2025-06-04 0:09 · 2 replies

            The problem with this argument is that if I'm right, the hype cycle will continue for a long time before it settles (because this is a particularly big problem to have made a dent in), and for that entire span of time skepticism will have been the wrong position.

            • By wpietri 2025-06-05 2:00 · 1 reply

              I think it depends a lot on what you think "wrong position" means. I think skepticism only really goes wrong when it refuses to see the truth in what it's questioning long past the point where it's reasonable. I don't think we're there yet. For example, questions like "What is the long term effect on a code base?" require seeing the long term. Or there are legitimate questions about the ROI of learning and re-learning rapidly changing tools. What's worth it to you may not be in other situations.

              I also think hype cycles and actual progress can have a variety of relationships. After Bubble 1.0 burst, there were years of exciting progress without a lot of hype. Maybe we'll get something similar here, as reasonable observers are already seeing the hype cycle falter. E.g.: https://www.economist.com/business/2025/05/21/welcome-to-the...

              And of course, it all hinges on you being right. Which I get you are convinced of, but if you want to be thorough, you have to look at the other side of it.

              • By tptacek 2025-06-05 2:26 · 1 reply

                Well, two things. First, I spent a long time being wrong about this; I definitely looked at the other side. Second, the thing I'm convinced of is kind of objective? Like: these things build working code that clears quality thresholds.

                But none of that really matters; I'm not so much engaging on the question of whether you are sold on LLM coding (come over next weekend though for the grilling thing we're doing and make your case then!). The only thing I'm engaging on here is the distinction between the hype cycle, which is bad and will get worse over the coming years, and the utility of the tools.

                • By wpietri 2025-06-05 12:46

                  Thanks! If I can make it I will. (The pinball museum project is sucking up a lot of my time as we get toward launch. You should come by!)

                  I think that is one interesting question that I'll want to answer before adoption on my projects, but it definitely isn't the only one.

                  And maybe the hype cycle will get worse and maybe it won't. Like The Economist, I'm starting to see a turn. The amount of money going into LLMs generally is unsustainable, and I think OpenAI's recent raise is a good example: round 11, a $40 billion goal, which they're taking in tranches. Already the largest funding round in history, and it's not the last one they'll need before they're in the black. I could easily see a trough of disillusionment coming in the next 18 months. I agree programming tools could well have a lot of innovation over the next few years, but if that happens against a backdrop of "AI" disillusionment, it'll be a lot easier to see what they're actually delivering.

            • By mplanchard 2025-06-04 13:36 · 1 reply

              So? The better these tools get, the easier they will be to get value out of. It seems not unwise to let them stabilize before investing the effort and getting the value out, especially if you’re working in one of the areas/languages where they’re still not as useful.

              Learning how to use a tool once is easy, relearning how to use a tool every six months because of the rapid pace of change is a pain.

              • By tptacek 2025-06-04 15:28 · 1 reply

                This isn't responsive to what I wrote. Letting the tools stabilize is one thing, makes perfect sense. "Waiting until the hype cycle dies" is another.

                • By mplanchard 2025-06-04 17:00 · 1 reply

                  I suspect the hype cycle and the stabilization curves are relatively in-sync. While the tools are constantly changing, there's always a fresh source of hype, and a fresh variant of "oh you're just not using the right/newest/best model/agent/etc." from those on the hype train.

                  • By tptacek 2025-06-04 17:26 · 1 reply

                    This is the thing. I do not agree with that, at all. We can just disagree, and that's fine, but let's be clear about what we're disagreeing about, because the whole goddam point of this piece is that nobody in this "debate" is saying the same thing. I think the hype is going to scale out practically indefinitely, because this stuff actually works spookily well. The hype will remain irrational longer than you can remain solvent.

                    • By mplanchard 2025-06-04 18:38 · 2 replies

                      Well, generally, that’s just not how hype works.

                      A thing being great doesn’t mean it’s going to generate outsized levels of hype forever. Nobody gets hyped about “The Internet” anymore, because novel use cases aren’t being discovered at a rapid clip, and it has well and thoroughly integrated into the general milieu of society. Same with GPS, vaccines, docker containers, Rust, etc., but I mentioned the Internet first since it’s probably on a similar level of societal shift as is AI in the maximalist version of AI hype.

                      Once a thing becomes widespread and standardized, it becomes just another part of the world we live in, regardless of how incredible it is. It’s only exciting to be a hype man when you’ve got the weight of broad non-adoption to rail against.

                      Which brings me to the point I was originally trying to make, with a more well-defined set of terms: who cares if someone waits until the tooling is more widely adopted, easy to use, and somewhat standardized prior to jumping on the bandwagon? Not everyone needs to undergo the pain of being an early adopter, and if the tools become as good as everyone says they will, they will succeed on their merits, and not due to strident hype pieces.

                      I think some of the frustration the AI camp is dealing with right now is because y’all are the new Rust Evangelism Strike Force, just instead of “you’re a bad software engineer if you use a memory-unsafe language,” it’s “you’re a bad software engineer if you don’t use AI.”

                      • By scott_s 2025-06-04 20:30

                        The tools are at the point now that ignoring them is akin to ignoring Stack Overflow posts. Basically any time you'd google for the answer to something, you might as well ask an AI assistant. It has a good chance of giving you a good answer. And given how programming works, it's usually easy to verify the information. Just like, say, you would do with a Stack Overflow post.

                      • By tptacek 2025-06-04 18:51

                        Who you calling y'all? I'm a developer who was skeptical about AI until about 6 months ago, and then used it, and am now here to say "this shit works". That's all. I write Go, not Rust.

                        People have all these feelings about AI hype, and they just have nothing at all to do with what I'm saying. How well the tools work has not much at all to do with the hype level. Usually when someone says that, they mean "the tools don't really work". Not this time.

          • By antifa 2025-06-04 14:00 · 2 replies

            > You could say the same thing about tulip bulbs or any other famous bubble. Lots of smart people with no stake get sucked in.

            While I agree with the skepticism, what specifically is the stake here? Most code assists have usable plans in the $10-$20 range. The investors are apparently taking a much bigger risk than the consumer would be in a case like this.

            Aside from the horror stories about people spending $100 in one day of API tokens for at best meh results, of course.

            • By b3morales 2025-06-08 21:09

              The stakes of changing the way so many people work can't be seen in a short term. Could be good or bad. Probably it will be both, in different ways. Margarine instead of butter seemed like a good idea until we noticed that hydrogenation was worse (in some ways) than the cholesterol problem we were trying to fight.

              AI company execs also pretty clearly have a politico-economic idea that they are advancing. The tools may stand on their own but what is the broader effect of supporting them?

            • By wpietri 2025-06-05 2:03

              The stake they and I were referring to is a financial interest in the success of AI. Related is the reputational impact, of course. A lot of people who may not make money do like being seen as smart and cutting edge.

              But even if we look at your notion of stake, you're missing huge chunks of it. Code bases are extremely expensive assets, and programmers are extremely expensive resources. $10 a month is nothing compared to the costs of a major cleanup or rewrite.

          • By kentonv 2025-06-04 0:29 · 2 replies

            Dude. Claude Code has zero learning curve. You just open the terminal app in your code directory and you tell it what you want, in English. In the time you have spent writing these comments about how you don't care to try it now because it's probably just hype, you could have actually tried it and found out if it's just hype.

            • By lolinder 2025-06-04 13:07 · 1 reply

              I've tried Claude Code repeatedly and haven't figured out how to make it work for me on my work code base. It regularly gets lost, spins out of control, and spends a bunch of tokens without solving anything. I totally sympathize with people who find Claude Code to have a learning curve, and I'm writing this while waiting for Cursor to finish a task I gave it, so it's not like I'm unfamiliar with the tooling in general.

              One big problem with Claude Code vs Cursor is that you have to pay for the cost of getting over the learning curve. With Cursor I could eat the subscription fee and then goof off for a long time trying to figure out how to prompt it well. With Claude Code a bad prompt can easily cost me $5 a pop, which (irrationally, but measurably) hurts more than the one-time monthly fee for Cursor.

              • By kentonv 2025-06-04 14:24

                Claude Code actually has a flat-rate subscription option now, if you prefer that. Personally I've found the API cost to be pretty negligible, but maybe I'm out of touch. (I mean, it's one AI-generated commit, Michael. What could it cost, $5?)

                Anyway, if you've tried it and it doesn't work for you, fair enough. I'm not going to tell you you're wrong. I'm just bothered by all the people who are out here posting about AI being bad while refusing to actually try it. (To be fair, I was one of them, six months ago...)

            • By wpietri 2025-06-05 2:04 · 1 reply

              I could not have, because my standards involve more than a five minute impression from a tool designed to wow people in the first five minutes. Dude.

              • By kentonv 2025-06-05 13:51

                I think you're rationalizing your resistance to change. I've been there!

                I have no reason to care whether you use AI or not. I'm giving you this advice just for your sake: Consider whether you are taking a big career risk by avoiding learning about the latest tools of your profession.

        • By potatolicious 2025-06-03 17:39 · 1 reply

          > "Even 2 years ago, before agents or reasoning models, these LLMs were extremely powerful. The catch was, you needed to figure out what worked for you."

          Sure, but I would argue that the UX is the product, and that has radically improved in the past 6-12 months.

          Yes, you could have produced similar results before, manually prompting the model each time, copy and pasting code, re-prompting the model as needed. I would strenuously argue that the structuring and automation of these tasks is what has made these models broadly usable and powerful.

          In the same way that Apple didn't invent mobile phones nor touchscreens nor OSes, but the specific combination of these things resulted in a product that was different in kind than what came before, and took over the world.

          Likewise, the "putting the LLM into a structured box of validation and automated re-prompting" is huge! It changed the product radically, even if its constituent pieces existed already.

          [edit] More generally I would argue that 95% of the useful applications of LLMs aren't about advancing the SOTA model capabilities and more about what kind of structured interaction environment we shove them into.

          • By keeda 2025-06-03 19:02

            For sure! I mainly meant to say that people should not attribute the "6 more months until it's really good" point as just another symptom of unfounded hype. It may have taken effort to effectively use AI earlier, which somewhat justified the caution, but now it's significantly easier and caution is counter-productive.

            But I think my other point still stands: people will need to figure out for themselves how to fully exploit this technology. What worked for me, for instance, was structuring my code to be essentially functional in nature. This allows for tightly focused contexts which drastically reduces error rates. This is probably orthogonal to the better UX of current AI tooling. Unfortunately, the vast majority of existing code is not functional, and people will have to figure out how to make AI work with that.
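
            To illustrate the kind of restructuring I mean, here's a contrived Python sketch (the Invoice example is hypothetical, not from my actual project):

            ```python
            # A method entangled with object state drags its whole class (and
            # everything that mutates it) into the model's context window:
            class Invoice:
                def total(self):
                    return sum(i.price * i.qty for i in self.items) * (1 + self.tax_rate)

            # A pure function carries its full context in its signature, so the
            # function alone, plus a line or two of types, is a focused prompt:
            def invoice_total(items: list[tuple[float, int]], tax_rate: float) -> float:
                return sum(price * qty for price, qty in items) * (1 + tax_rate)
            ```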

            A lot of that likely plays into your point about the work required to make useful LLM-based applications. To expand a bit more:

            * AI is technology that behaves like people. This makes it confusing to reason about and work with. Products will need to solve for this cognitive dissonance to be successful, which will entail a combination of UX and guardrails.

            * Context still seems to be king. My (possibly outdated) experience has been the "right" context trumps larger context windows. With code, for instance, this probably entails standard techniques like static analysis to find relevant bits of code, which some tools have been attempting. For data, this might require eliminating overfetching.

            * Data engineering will be critical. Not only does it need to be very clean for good results, giving models unfettered access to the data needs the right access controls which, despite regulations like GDPR, are largely non-existent.

            * Security in general will need to be upleveled everywhere. Not only can models be tricked, they can trick you into getting compromised, and so there need to be even more guardrails.

            A lot of these are regular engineering work that is being done even today. Only it often isn't prioritized because there are always higher priorities... like increasing shareholder value ;-) But if folks want to leverage the capabilities of AI in their businesses, they'll have to solve all these problems for themselves. This is a ton of work. Good thing we have AI to help out!

        • By gopher_space 2025-06-03 21:13

          I don't think it's possible to understand what people mean by force multiplier re AI until you use it to teach yourself a new domain and then build something with that knowledge.

          Building a mental model of a new domain by creating a logical model that interfaces with a domain I'm familiar with lets me test my assumptions and understanding in real time. I can apply previous experience by analogy and verify usefulness/accuracy instantly.

          > Upshot: It took me months to figure out what worked for me, but AI enabled me to produce innovative (probably cutting edge) work in domains I had little prior background in. Yes, the hype should trigger your suspicions[...]

          Part of the hype problem is that describing my experience sounds like bullshit to anyone who hasn't gone through the same process. The rate that I pick up concepts well enough to do verifiable work with them is literally unbelievable.

        • By mwarkentin 2025-06-04 11:22

          AI posts (including this one) are all over his employer's blog lately, so there’s some stake (fly MCP, https://fly.io/blog/fuckin-robots/, etc).

      • By xpe 2025-06-03 14:22 · 1 reply

        Almost by definition, one should be skeptical about hype. So we’re all trying to sort out what is being sold to us.

        Different people have different weird tendencies in different directions. Some people irrationally assume that things aren’t going to change much. Others see a trend and irrationally assume that it will continue on a trend line.

        Synthesis is hard.

        Understanding causality is even harder.

        Savvy people know that we’re just operating with a bag of models and trying to choose the right combination for the right situation.

        This misunderstanding is one reason why doomers, accelerationists, and “normies” talk past each other or (worse) look down on each other. (I’m not trying to claim epistemic equivalence here; some perspectives are based on better information, some are better calibrated than others! I’m just not laying out my personal claims at this point. Instead, I’m focusing on how we talk to each other.)

        Another big source of misunderstanding is about differing loci of control. People in positions of influence are naturally inclined to think about what they can do, who they know, and where they want to be. People farther removed feel relatively powerless and tend to hold onto their notions of stability, such as the status quo or their deepest values.

        Historically, programmers have been quite willing to learn new technologies, but now we’re seeing widespread examples where people’s plasticity has limits. Many developers cannot (or are unwilling to) wrap their minds around the changing world. So instead of confronting the reality they find ways to deny it, consciously or subconsciously. Our perception itself is shaped by our beliefs, and some people won’t even perceive the threat because it is too strange or disconcerting. Such is human nature: we all do it. Sometimes we’re lucky enough to admit it.

        • By wpietri 2025-06-03 23:58 · 2 replies

          I think "the reality", at least as something involving a new paradigm, has yet to be established. I'll note that I heard plenty of similar talk about how developers just couldn't adapt six months or more ago. Promoters now can admit those tools were in fact pretty bad, because they now have something else to promote, but at the time those not rawdogging LLMs were dinosaurs under a big meteor.

          I do of course agree that some people are just refusing to "wrap their minds around the changing world". But anybody with enough experience in tech can count a lot more instances of "the world is about to change" than "the world really changed". The most recent obvious example being cryptocurrencies, but there are plenty of others. [1] So I think there's plenty of room here for legitimate skepticism. And for just waiting until things settle down to see where we ended up.

          [1] E.g. https://www.youtube.com/watch?v=b2F-DItXtZs

          • By xpe 2025-06-04 13:10 · 1 reply

            Fair points.

            Generally speaking, I find it suspect when someone points to failed predictions of disruptive changes without acknowledging successful predictions. That is selection bias. Many predicted disruptive changes do occur.

            Most importantly, if one wants to be intellectually honest, one has to engage against a set of plausible arguments and scenarios. Debunking one particular company’s hyperbolic vision for the future might be easy, but it probably doesn’t generalize.

            It is telling to see how many predictions can seem obvious in retrospect from the right frame of reference. In a sense (or more than that under certain views of physics), the future already exists, the patterns already exist. We just have to find the patterns — find the lens or model that will help the messy world make sense to us.

            I do my best to put the hype to the side. I try to pay attention to the fundamentals such as scaling laws, performance over time, etc while noting how people keep moving the goalposts.

            Also wrt the cognitive bias aspect: Cryptocurrencies didn’t threaten to apply significant (if any) downward pressure on the software development labor market.

            Also, even cryptocurrency proponents knew deep down that it was a chicken and the egg problem: boosters might have said adoption was happening and maybe even inevitable, but the assumption was right out there in the open. It also had the warning signs of obvious financial fraud, money laundering, currency speculation, and ponzi scheming.

            Adoption of artificial intelligence is different in many notable ways. Most saliently, it is not a chicken and egg problem: it does not require collective action. Anyone who does it well has a competitive advantage. It is a race.

            (Like Max Tegmark and others, I view racing towards superintelligence as a suicide race, not an arms race. This is a predictive claim that can be debated by assessing scenarios, understanding human nature, and assigning probabilities.)

            • By wpietri 2025-06-05 2:08 · 1 reply

              > Generally speaking, I find it suspect when someone points to failed predictions of disruptive changes without acknowledging successful predictions.

              I specifically said: "But anybody with enough experience in tech can count a lot more instances of 'the world is about to change' than 'the world really changed'." I pretty clearly understand that sometimes the world does change.

              Funnily, I find it suspect when people accuse me of failing to do things I did in the very post they're responding to. So I think this is a fine time for us both to find better ways to spend our time.

              • By xpe 2025-06-05 2:17

                Sorry, I can see why you might take that the wrong way. In my defense, I consciously wrote "generally speaking" in the hopes you wouldn't think I was referring to you in particular. I wasn't trying to accuse you of anything.

                I strive to not criticize people indirectly: my style is usually closer to say New York than San Francisco. If I disagree with something in particular, I try to make that clear without beating around the bush.

          • By deanCommie 2025-06-05 4:42

            > Promoters now can admit those tools were in fact pretty bad

            Relative to what came after, which no one could predict would be guaranteed?

            The Model T was in fact pretty bad relative to what came after...

            > because they now have something else to promote

            something else which is better?

            i don't understand the inherent cynicism here.

      • By spaceman_2020 2025-06-03 16:59 · 1 reply

        I’m an amateur coder and I used to rely on Cursor a lot to code when I was actively working on hobby apps about 6 months ago

        I picked coding again a couple of days back and I’m blown away by how much things have changed

        It was all manual work until a few months back. Suddenly, it’s all agents

        • By wpietri 2025-06-03 23:59 · 2 replies

          > You'll not only never know this, it's IMHO not very useful to think about at all, except as an intellectual exercise.

          I think it's very useful if one wants to properly weigh the value of LLMs in a way that gets beyond the hype. Which I do.

          • By spaceman_2020 2025-06-04 7:20 · 1 reply

            My 80-year-old dad tells me that when he bought his first car, he could pop open the hood and fiddle with things and maybe get it to work after a breakdown

            Now he can't - it's too closed and complicated

            Yet, modern cars are way better and almost never break down

            Don't see how LLMs are any different than any other tech advancement that obfuscates and abstracts the "fundamentals".

            • By wpietri 2025-06-05 2:09

              I definitely believe that you don't see it. We just disagree on what that implies.

          • By wpietri 2025-06-05 2:09

            Oops, this was a reply to somebody else put in the wrong place. Sorry for the confusion.

      • By DannyBee 2025-06-03 20:50 · 1 reply

        "nother counterfactual that we'll never know is what kinds of tooling we would have gotten if people had dumped a few billion dollars into code tool improvement without LLMs, but with, say, a lot of more conventional ML tooling. Would the tools we get be much better? Much worse? About the same but different in strengths and weaknesses? Impossible to say."

        You'll not only never know this, it's IMHO not very useful to think about at all, except as an intellectual exercise.

        I wish i could impress this upon more people.

        A friend similarly used to lament/complain that Kotlin sucked in part because we could have probably accomplished its major features in Java, and maybe without tons of work, or migration cost.

        This is maybe even true!

        as an intellectual exercise, both are interesting to think about. But outside of that, people get caught up in this as if it matters, but it doesn't.

        Basically nothing is driven by pure technical merit alone, not just in CS, but in any field. So my point to him was the lesson to take away from this is not "we could have been more effective or done it cheaper or whatever" but "my definition of effectiveness doesn't match how reality decides effectiveness, so i should adjust my definition".

        As much as people want the definition to be a meritocracy, it just isn't and honestly, seems unlikely to ever be.

        So while it's 100% true that billions of dollars dumped into other tools or approaches or whatever may have generated good, better, maybe even amazing results, they weren't, and more importantly, never would have been. Unknown but maybe infinite ROI is often much more likely to see investment than more known but maybe only 2x ROI.

        and like i said, this is not just true in CS, but in lots of fields.

        That is arguably quite bad, but also seems unlikely to change.

        • By wpietri 2025-06-05 2:10 · 1 reply

          > You'll not only never know this, it's IMHO not very useful to think about at all, except as an intellectual exercise.

          I think it's very useful if one wants to properly weigh the value of LLMs in a way that gets beyond the hype. Which I do.

          • By DannyBee 2025-06-05 13:23

            Sure, and that works in the abstract (ie "what investment would theoretically have made the most sense") but if you are trying to compare in the real world you have to be careful because it assumes the alternative would have ever happened. I doubt it would have.

    • By raxxorraxor 2025-06-03 9:29 · 4 replies

      The better I am at solving a problem, the less I use AI assistants. I use them if I try a new language or framework.

      Busy code I need to generate is difficult to do with AI too. Because then you need to formalize the necessary context for an AI assistant, which is exhausting with an unsure result. So perhaps it is just simpler to write it yourself quickly.

      I understand comments being negative, because there is so much AI hype without too many practical applications yet. Or at least good practical applications. Some of that hype is justified, some of it is not. I enjoyed the image/video/audio synthesis hype more tbh.

      Test cases are quite helpful and comments are decent too. But often prompting is more complex than programming something. And you can never be sure if any answer is usable.

      • By Cthulhu_ 2025-06-03 12:19 · 4 replies

        > But often prompting is more complex than programming something.

        I'd challenge this one; is it more complex, or is all the thinking and decision making concentrated into a single sentence or paragraph? For me, programming something is taking a big, high-level problem and breaking it down into smaller and smaller sections until it's a line of code; the lines of code are relatively low effort / cost little brain power. But in my experience, the problem itself and its nuances are only defined once all code is written. If you have to prompt an AI to write it, you need to define the problem beforehand.

        It's more design and more thinking upfront, which is something the development community has moved away from in the past ~20 years with the rise of agile development and open source. Techniques like TDD have shifted more of the problem definition forwards as you have to think about your desired outcomes before writing code, but I'm pretty sure (I have no figures) it's only a minority of developers that have the self-discipline to practice test-driven development consistently.

        (disclaimer: I don't use AI much, and my employer isn't yet looking into or paying for agentic coding, so it's chat style or inline code suggestions)

        • By sksisoakanan 2025-06-03 13:58 · 1 reply

          The issue with prompting is that English (or any other human language) is nowhere near as rigid or strict a language as a programming language. Almost always an idea can be expressed much more succinctly in code than language.

          Combine that with the fact that when you're reading the code it's often much easier to develop a prototype solution as you go, and you end up with prompting feeling like using 4 men to carry a wheelbarrow instead of having 1 push it.

          • By michaelfeathers 2025-06-03 16:27 · 1 reply

            I think we are going to end up with common design/code specification language that we use for prompting and testing. There's always going to be a need to convey the exact semantics of what we want. If not, for AI then for the humans who have to grapple with what is made.

            • By rerdavies 2025-06-03 17:44 · 1 reply

              Sounds like "Heavy process". "Specifying exact semantics" has been tried and ended up unimaginably badly.

              • By bcrosby95 2025-06-03 18:21 · 2 replies

                Nah, imagine a programming language optimized for creating specifications.

                Feed it to an LLM and it implements it. Ideally it can also verify its solution with your specification code. If LLMs don't gain significantly more general capabilities I could see this happening in the longer term. But it's too early to say.

                In a sense the llm turns into a compiler.

                • By rerdavies 2025-06-04 16:11 · 1 reply

                  It's an interesting idea. I get it. Although I wonder.... do you really need formal languages anymore now that we have LLMs that can take natural language specifications as input?

                  I tried running the idea on a programming task I did yesterday. "Create a dialog to edit the contents of THIS data structure." It did actually produce a dialog that worked the first time. Admittedly a very ugly dialog. But all the fields and labels and controls were there in the right order with the right labels, and were all properly bound to props of a react control, that was grudgingly fit for purpose. I suspect I could have corrected some of the layout issues with supplementary prompts. But it worked. I will do it again, with supplementary prompts next time.

                  Anyway. I next thought about how I would specify the behavior I wanted. The informal specification would be "Open the Looping dialog. Set Start to 1:00, then open the Timebase dialog. Select "Beats", set the tempo to 120, and press the back button. Verify that the Start text edit now contains "30:1" (the same time expressed in bars and beats). Set it to 10:1, press the back button, and verify that the corresponding "Loop" <description of storage for that data omitted for clarity> for the currently selected plugin contains 20.0." I can actually see that working (and I plan to see if I can convince an AI to turn that into test code for me).

                  Any imaginable formal specification for that would be just grim. In fact, I can't imagine a "formal" specification for that. But a natural language specification seems eminently doable. And even if there were such a formal specification, I am 100% positive that I would be using natural language AI prompts to generate the specifications. Which makes me wonder why anyone needs a formal language for that.

                  And I can't help thinking that "Write test code for the specifications given in the previous prompt" is something I need to try. How to give my AI tooling to get access to UI controls though....

                  • By michaelfeathers 2025-06-08 16:17

                    That doesn't sound like the sort of problem you'd use it for. I think it would be used for the ~10% of code you have in some applications that are part of the critical core. UI, not so much.

                • By cess11 2025-06-03 18:40 · 1 reply

                  We've had that for a long, long time. Notably RAD-tooling running on XML.

                  The main lesson has been that it's actually not much of an enabler and the people doing it end up being specialised and rather expensive consultants.

                  • By CamperBob2 2025-06-03 19:59 · 1 reply

                    RAD before transformers was like trying to build an iPhone before capacitive multitouch: a total waste of time.

                    Things are different now.

                    • By cess11 2025-06-03 20:05 · 1 reply

                      I'm not so sure. What can you show me that you think would be convincing?

                      • By CamperBob2 2025-06-03 20:31 · 2 replies

                        I think there are enough examples of genuine AI-facilitated rapid application development out there already, honestly. I wouldn't have anything to add to the pile, since I'm not a RAD kind of guy.

                        Disillusionment seems to spring from expecting the model to be a god or a genie instead of a code generator. Some people are always going to be better at using tools than other people are. I don't see that changing, even though the tools themselves are changing radically.

                        • By cess11 2025-06-04 7:08

                          "Nothing" would have been shorter and more convenient for us both.

                        • By soraminazuki 2025-06-04 4:56 · 1 reply

                          That's a straw man. Asking for real examples to back up your claims isn't overt perfectionism.

                          • By CamperBob2 2025-06-04 5:47

                            If you weren't paying attention to what's been happening for the last couple of years, you certainly won't believe anything I have to say.

                            Trust me on this, at least: I don't need the typing practice.

        • By starlust2 2025-06-0317:591 reply

          A big challenge is that programmers all have unique, ever-changing personal styles and visions that they've never had to communicate before. They also generally "bikeshed" and add undefined, unrequested requirements, because, you know, someday we might need to support 10,000x more users than we have. This is all well and good when the programmer implements something themselves, but it falls apart when it must be communicated to an LLM. Most projects/systems/orgs don't have the necessary level of detail in their documentation, documentation is fragmented across git/jira/confluence/etc., and it's a hodgepodge of technologies without a semblance of consistency.

          I think we'll find that over the next few years the first really big win will be AI tearing down the mountain of tech and documentation debt. Bringing efficiency to corporate knowledge is likely a key element of AI working within corporations.

          • By mlsu 2025-06-0318:522 reply

            Efficiency to corporate knowledge? Absolutely not, no way. My coworkers are beginning to use AI to write PR descriptions and git commits.

            I notice, because the amount of text has increased tenfold while the amount of information has stayed exactly the same.

            This is a torrent of shit coming down on us that we are all going to have to deal with. The vibe coders will be gleefully putting up PRs with 12 paragraphs of "descriptive" text. Thanks, no thanks!

            • By starlust2 2025-06-0415:13

              Well, I'm certainly not saying that AI should generate more corporate spam. That's part of the problem! And also a strawman argument!

            • By aloisdg 2025-06-0323:00

              Use an LLM to summarize the PR /j

        • By bcrosby95 2025-06-0318:191 reply

          I design and think upfront but I don't write it down until I start coding. I can do this for pretty large chunks of code at once.

          The fastest way I can transcribe a design is with code or pseudocode. Converting it into English can be hard.

          It reminds me a bit of the discussion of whether you have an inner monologue. I don't, and turning thoughts into English takes work, especially if you need to be specific about what you want.

          • By averageRoyalty 2025-06-0322:31

            I also don't have an inner monologue and can relate somewhat. However I find that natural language (usually) allows me to be more expressive than pseudocode in the same period of time.

            There's also an intangible benefit of having someone to "bounce off". If I'm using an LLM, I am tweaking the system prompt to slow it down, make it ask questions, and bug me before making changes. Even without that, writing out the idea quickly exposes potential logic or approach flaws - much faster than writing pseudocode, in my experience.

        • By algorithmsRcool 2025-06-0316:101 reply

          > It's more design and more thinking upfront, which is something the development community has moved away from in the past ~20 years with the rise of agile development and open source.

          I agree, but even smaller than thinking in agile is just a tight iteration loop when I'm exploring a design. My ADHD makes upfront design a challenge for me, and I am personally much more effective starting with a sketch of what needs to be done and then iterating on it until I get a good result.

          The loop of prompt->study->prompt->study... is disruptive to my inner loop for several reasons, but a big one is that the machine doesn't "think" like I do. So the solutions it scaffolds commonly make me say "huh?", and I have to change my thought process to interpret them and then study them for mistakes. My intuition and iteration are, for the time being, more effective than this machine-assisted loop for the really "interesting" code I have to write.

          But i will say that AI has been a big time saver for more mundane tasks, especially when I can say "use this example and apply it to the rest of this code/abstraction".

          • By samsepi01 2025-06-088:35

            > "The loop of prompt->study->prompt->study... is disruptive to my inner loop for several reasons, but a big one is that the machine doesn't "think" like i do. So the solutions it scaffolds commonly make me say "huh?" and i have to change my thought process to interpet them and then study them for mistakes. My intution and iteration is, for the time being, more effective than this machine assisted loop..."

            My thoughts exactly as an ADHD dev.

            Was having trouble describing my main issue with LLM-assisted development...

            Thank you for giving me the words!

      • By avemuri 2025-06-0311:423 reply

        I agree with your points, but I'm also reminded of one of my bigger learnings as a manager: the stuff I'm best at is the hardest, but most important, to delegate.

        Sure, it was easier to do it myself. But putting in the time to train, give context, develop guardrails, learn how to monitor, etc. ultimately taught me the skills needed to delegate effectively and multiply the team's output massively as we added people.

        It's early days but I'm getting the same feeling with LLMs. It's as exhausting as training an overconfident but talented intern, but if you can work through it and somehow get it to produce something as good as you would do yourself, it's a massive multiplier.

        • By johnmaguire 2025-06-0313:012 reply

          I don't totally understand the parallel you're drawing here. As a manager, I assume you're training more junior (in terms of their career or the company) engineers up so they can perform more autonomously in the future.

          But you're not training LLMs as you use them really - do you mean that it's best to develop your own skill using LLMs in an area you already understand well?

          I'm finding it a bit hard to square your comment about it being exhausting to catherd the LLM with it being a force multiplier.

          • By avemuri 2025-06-0315:52

            No, I'm talking about my own skills: how I onboard, structure 1-on-1s, run meetings, create and reuse certain processes, manage documentation (a form of org memory), check in on status, devise metrics and other indicators of system health. All of these compound and provide leverage even if the person leaves and a new one enters. The 30th person I onboarded and managed was orders of magnitude easier (for both of us) than the first.

            With LLMs the better I get at the scaffolding and prompting, the less it feels like catherding (so far at least). Hence the comparison.

          • By wpietri 2025-06-0313:47

            Great point.

            Humans really like to anthropomorphize things. Loud rumbles in the clouds? There must be a dude on top of a mountain somewhere who's in charge of it. Impressed by that tree? It must have a spirit that's like our spirits.

            I think a lot of the reason LLMs are enjoying such a huge hype wave is that they invite that sort of anthropomorphization. It can be really hard to think about them in terms of what they actually are, because both our head-meat and our culture have so much support for casting things as other people.

        • By GoblinSlayer 2025-06-0313:162 reply

          Do LLMs learn? I had the impression that you borrow a pretrained LLM that handles each query starting from the same initial state.

          • By simonw 2025-06-0313:342 reply

            No, LLMs don't learn - each new conversation effectively clears the slate and resets them to their original state.

            If you know what you're doing you can still "teach" them, but it's on you to do that - you need to keep iterating on things like the system prompt you are using and the context you feed into the model.

            • By runarberg 2025-06-0321:10

              This sounds like trying to glue on supervised learning post-hoc.

              Makes me wonder whether, if there had been equal investment in specialized tools that used more fine-tuned statistical methods (like supervised learning), we would have something much better than LLMs.

              I keep thinking about spell checkers and auto-translators, which have been using machine learning for a while, with pretty impressive results (unless I’m mistaken I think most of those use supervised learning models). I have no doubt we will start seeing companies replacing these proven models with an LLM and a noticeable reduction in quality.

            • By rerdavies 2025-06-0317:532 reply

              That's mostly, but not completely true. There are various strategies to get LLMs to remember previous conversations. ChatGPT, for example, remembers (for some loose definition of "remembers") all previous conversations you've had with it.

              • By runarberg 2025-06-0321:271 reply

                I think if you use a very loose definition of learning - a stimulus which alters subsequent behavior - you can claim this is learning. But if you tell a human to replace the word "is" with "are" in the next two sentences, this could hardly be considered learning; rather, it is just following commands, even though it meets the previous loose definition. This is why in psychology we usually include some timescale for how long the altered behavior must last for it to be considered learning. A short-term altered behavior is usually called priming. But even then I wouldn't consider "following commands" to be either priming or learning; I would simply call it obeying.

                If an LLM learned something when you gave it commands, it would probably be reflected in some adjusted weights in its operational matrix. This is true of human learning: we strengthen some neural connection, and when we receive a similar stimulus in a similar situation sometime in the future, the new stimulus will follow a slightly different path along its neural pathway and result in an altered behavior (or at least have a greater probability of an altered behavior). For an LLM to "learn", I would like to see something similar.

                • By rerdavies 2025-06-0416:271 reply

                  I think you have an overly strict definition of what "learning" means. ChatGPT now has memory that lasts beyond the lifetime of its context buffer, and so has at least medium-term memory. (Actually, I'm not entirely sure that they are not just using long persistent context buffers, but anyway.)

                  Admittedly, you have to wrap LLMs with stuff to get them to do that. If you want to rewrite the rules to exclude that, then I will have to revise my statement that it is "mostly, but not completely true".

                  :-P

                  • By runarberg 2025-06-0418:51

                    You also have to alter some neural pathways in your brain to follow commands. That doesn't make it learning. Learned behavior is usually (but not always) reflected in long-term changes to neural pathways outside of the language centers of the brain, and outside of short-term memory. Once you forget the command and still apply the behavior, that is learning.

                    I think SRS (spaced repetition) schedulers are a good example of a machine-learning algorithm that learns from its previous interactions. If you run the optimizer you will end up with a different weight matrix, and flashcards will be scheduled differently. It has learned how well you retain these cards. But an LLM that is simply following orders has not learned anything, unless you feed the previous interaction back into the system to alter future outcomes, regardless of whether it "remembers" the original interactions. With the SRS, your review history is completely forgotten about. You could delete it, but the weight matrix keeps the optimized weights. If you delete your chat history with ChatGPT, it will not behave any differently based on the previous interaction.

              • By simonw 2025-06-0323:34

                I'd count ChatGPT memory as a feature of ChatGPT, not of the underlying LLM.

                I wrote a bit about that here - I've turned it off: https://simonwillison.net/2025/May/21/chatgpt-new-memory/

          • By bodegajed 2025-06-0314:05

            Yes, with few-shot prompting: you need to provide at least two examples of similar instructions and their corresponding solutions. But when you have to build the few-shot examples every time you prompt, it feels like you're doing the work already.

            Edit: grammar
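
            To make that concrete, a minimal sketch of what "building few shots" looks like: the worked examples just get prepended to the conversation before the real request. (The message shape below is the common chat-completion format; the task and examples are made up for illustration, not any specific vendor's API.)

                // Two worked instruction/solution pairs, then the actual request.
                type ChatMessage = { role: "system" | "user" | "assistant"; content: string };

                const messages: ChatMessage[] = [
                  { role: "system", content: "Convert plain-English rules to SQL WHERE clauses." },
                  // Shot 1
                  { role: "user", content: "orders over $100 from Canada" },
                  { role: "assistant", content: "WHERE total > 100 AND country = 'CA'" },
                  // Shot 2
                  { role: "user", content: "customers inactive for 90 days" },
                  { role: "assistant", content: "WHERE last_seen < NOW() - INTERVAL '90 days'" },
                  // The real task, answered in the same style as the shots.
                  { role: "user", content: "refunded orders from last month" },
                ];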

        • By conartist6 2025-06-0313:021 reply

          But... But... the multiplier isn't NEW!

          You just explained how your work was affected by a big multiplier. At the end of training an intern you get a trained intern -- potentially a huge multiplier. ChatGPT is like an intern you can never train and will never get much better.

          These are the same people who would no longer create or participate deeply in OSS (a +100x multiplier) bragging about the +2x multiplier they got in exchange.

          • By conartist6 2025-06-0313:131 reply

            The first person you pass your knowledge onto can pass it onto a second. ChatGPT will not only never build knowledge, it will never turn from the learner to the mentor passing hard-won knowledge on to another learner.

      • By brulard 2025-06-0311:351 reply

        > But often prompting is more complex than programming something.

        It may be more complex, but it is in my opinion better long term. We need to get good at communicating with AIs to get the results we want. Forgive me for assuming that you probably didn't use these assistants long enough to get good at using them. I've been a web developer for 20 years already, and AI tools are multiplying my output even on problems I'm very good at. And they are getting better very quickly.

        • By GoblinSlayer 2025-06-0312:50

          Yep, it looks like LLMs are used as fast typists. And coincidentally, in webdev, typing speed is the most important bottleneck: when you need to add cookie consent, spinners, dozens of ad providers, tracking pixels, Twitter metadata, Google metadata, manual rendering, button web components with Material Design and React, hover panels, Font Awesome, reCAPTCHA - and that's only 1% of modern web boilerplate - it's easy to see how a fast typist can help you.

      • By echelon 2025-06-0316:092 reply

        > The better I am at solving a problem, the less I use AI assistants.

        Yes, but you're expensive.

        And these models are getting better at solving a lot of business-relevant problems.

        Soon all business-relevant problems will be bent to the shape of the LLM because it's cost-effective.

        • By onemoresoop 2025-06-0319:303 reply

          You're forgetting how much money is being burned in keeping these LLMs cheap. Remember when Uber was a fraction of the cost of a cab? Yeah, those days didn't last.

          • By averageRoyalty 2025-06-0322:37

            > Remember when Uber was a fraction of the cost of a cab? Yeah, those days didn't last.

            They're still much cheaper where I am. But regardless, why not take the Uber while it's cheaper?

            There's the argument of the taxi industry collapsing (it hasn't yet). Is your concern some sort of long term knowledge loss from programmers and a rug pull? There are many good LLM options out there, they're getting cheaper and the knowledge loss wouldn't be impactful (and rug pull-able) for at least a decade or so.

          • By ido 2025-06-046:501 reply

            Even at 100x the cost (currently $20/month for most of these via subscriptions) it’s still cheaper than an intern, let alone a senior dev.

            • By nipah 2025-06-1114:16

              I'm sorry, but 2,000 USD per month is MUCH more than an engineer costs in a third-world country; it can basically pay for a senior where I live. Even 200 USD is sufficient for an intern here. The problem with your point is that it doesn't account for the fact that this work can be done all over the world.

          • By a4isms 2025-06-0321:13

            I have been in this industry since the mid 80s. I can't tell you how many people worry that I can't handle change because as a veteran, I must cling to what was. Meanwhile, of course, the reason I am still in the industry is because of my plasticity. Nothing is as it was for me, and I have changed just about everything about how I work multiple times. But what does stay the same all this time are people and businesses and how we/they behave.

            Which brings me to your comment. The comparison to Uber drivers is apt, and to use a fashionable word these days, the threat to people and startups alike is "enshittification." These tools are not sold, they are rented. Should a few behemoths gain effective control of the market, we know from history that we won't see these tools become commodities and nearly free, we'll see the users of these tools (again, both people and businesses) squeezed until their margins are paper-thin.

            Back when articles by Joel Spolsky regularly hit the top page of Hacker News, he wrote "Strategy Letter V:" https://www.joelonsoftware.com/2002/06/12/strategy-letter-v/

            The relevant takeaway was that companies try to commoditize their complements, and for LLM vendors, every startup is a complement. A brick-and-mortar metaphor is that of a retailer in a mall. If you as a retailer are paying more in rent than you're making, you are "working for the landlord," just as if you are making less than 30% of profit on everything you sell or rent through Apple's App Store, you're working for Apple.

            I once described that as "Sharecropping in Apple's Orchard," and if I'm hesitant about the direction we're going, it's not anything about clinging to punch cards and ferromagnetic RAM, it's more the worry that it's not just a question of programmers becoming enshittified by their tools, it's also the entire notion of a software business "Sharecropping the LLM vendor's fields."

            We spend way too much time talking about programming itself and not enough about whither the software business if its leverage is bound to tools that can only be rented on terms set by vendors.

            --------

            I don't know for certain where things will go or how we'll get there. I actually like the idea that a solo founder could create a billion-dollar company with no employees in my lifetime. And I have always liked the idea of software being "Wheels for the Mind," and we could be on a path to that, rather than turning humans into "reverse centaurs" that labour for the software rather than the other way around.

            Once upon a time, VCs would always ask a startup, "What is your Plan B should you start getting traction and then Microsoft decides to compete with you/commoditize you by giving the same thing away?" That era passed, and Paul Graham celebrated it: https://paulgraham.com/microsoft.html

            Then when startups became cheap to launch—thank you increased tech leverage and cheap money and YCombinator industrializing early-stage venture capital—the question became, "What is your moat against three smart kids launching a competitor?"

            Now I wonder if the key question will bifurcate:

            1. What is your moat against somebody launching competition even more cheaply than smart kids with YCombinator's backing, and;

            2. How are you insulated against the cost of load-bearing tooling for everything in your business becoming arbitrarily more expensive?

        • By soraminazuki 2025-06-048:18

          Actually, I agree. It won't be long before businesses handle software engineering like Google does "support." You know, that robotic system that sends out passive-aggressive mocking emails to people who got screwed over by another robot that locks them out of their digital lives for made up reasons [1]. It saves the suits a ton of cash while letting them dodge any responsibility for the inevitable harm it'll cause to society. Mediocrity will be seen as a feature, and the worst part is, the zealots will wave it like a badge of honor.

          [1]: https://news.ycombinator.com/item?id=26061935

    • By fsloth 2025-06-037:052 reply

      I totally agree. The ”hard to control mech suit” is an excellent analogy.

      When it works it’s brilliant.

      There is a threshold point on the learning curve where you realize you are in a pile of spaghetti code and think it actually saves no time to use an LLM assistant.

      But then you learn to avoid the bad parts - so they don't take your time anymore - and the good parts start paying back the time spent learning, in heaps.

      They are not zero effort tools.

      There is a non-trivial learning cost involved.

      • By teaearlgraycold 2025-06-038:131 reply

        The issue is we’re too early in the process to even have a solid education program for using LLMs. I use them all the time and continue to struggle finding an approach that works well. It’s easy to use them for documentation look up. Or filling in boilerplate. Sometimes they nail a transformation/translation task, other times they’re more trouble than they’re worth.

        We need to understand what kind of guard rails to put these models on for optimal results.

        • By fsloth 2025-06-038:342 reply

          ” we’re too early in the process to even have a solid education program for using LLMs”

          We don’t even have a solid education program for software engineering - possibly for the same reason.

          The industry loves to run on the bleeding edge, rather than just think for a minute :)

          • By baq 2025-06-0311:491 reply

            when you stop to think, your fifteen (...thousand) competitors will all attempt a different version of the thing you're thinking about, and one of them will be about the thing you'll come up with, except it'll be built.

            it might be ok since what you were thinking about is probably not a good idea in the first place for various reasons, but once in a while stars align to produce the unicorn, which you want to be if you're thinking about building something.

            caveat: maybe you just want to build in a niche, it's fine to think hard in such places. usually.

            • By fsloth 2025-06-0311:57

              Fwiw a legion of wishful app developers is not "the industry". It's fine for individuals to move fast.

              Institution scale lack of deep thinking is the main issue.

          • By soraminazuki 2025-06-048:531 reply

            > We don’t even have a solid education program for software engineering - possibly for the same reason.

              There's an entire field called computer science. ACM provides curricular recommendations that it updates every few years. People spend years learning it. The same can't be said about the field of prompting.

            • By fsloth 2025-06-0415:551 reply

              But nobody seems to trust any formally specified education, hence practices like whiteboarding as part of job interviews.

              How do we know a software engineer is competent? We can't tell, and damned if we trust that MSc he holds.

              Computer science, while fundamental, is of very little help with the emergent large-scale problems that "software engineering" tries to tackle.

              The key problem is converting capital investment into working software with given requirements, and this is quite unpredictable.

              We don’t know how to effectively train software engineers so that software projects would be predictable.

              We don’t know how to train software engineers so that employers would trust their degrees as a strong signal of competence.

              If there is a university program that, for example, FAANGM (or whatever letters form the pinnacle of the market) companies respect as a clear signal of obvious competence as a software engineer, I would like to know what it is.

              • By soraminazuki 2025-06-0420:351 reply

                That says more about the industry than about the quality of formal education. After all, it's the very same industry that's hailing mediocre robots as replacements for human software engineers. Even the article has this to say:

                > As a mid-late career coder, I’ve come to appreciate mediocrity.

                Then there's also the embrace of anti-intellectualism. "But I don't want to spend time learning X!" is a surprisingly common comment on, er, Hacker News.

                So yeah, no surprise that formal education is looked down on. Doesn't make it right though.

                • By nipah 2025-06-1114:23

                  I don't think two wrongs make a right, tho. Looking down on formal education is not the same as embracing anti-intellectualism. And while I admit the software industry is absolutely full of bullshit anti-intellectualism, I also don't believe formal education is or should be the standard for education; formalism is not automatically better than alternative means of learning.

                  The problem of anti-intellectualism in SE is just a consequence of the field being more "democratized". Or, to put it in other words, the mass is stupid and the mass-man is stupider and more primitive.

      • By jes5199 2025-06-0318:01

        also, the agents are actually pretty good at cleaning up spaghetti if you do it one module at a time and use unit tests. And some of the models are smart enough to suggest good organization schemes!

    • By tptacek 2025-06-037:184 reply

      For what it's worth: I'm not dismissive of the idea that these things could be ruinous for the interests of the profession. I don't automatically assume that making applications drastically easier to produce is just going to make way for more opportunities.

      I just don't think the interests of the profession control. The travel agents had interests too!

      • By hostyle 2025-06-0314:452 reply

        For a long time there has been back chatter about how to turn programming into a more professional field, more like actual engineering: where, when something goes wrong, actual people and companies take security seriously, get held accountable for their mistakes, and start to actually earn their high salaries.

        Getting AI to hallucinate its way into secure and better-quality code seems like the antithesis of this. Why don't we have AI and robots working for humanity on the boring menial tasks - mowing lawns, filing taxes, washing dishes, driving cars - instead of attempting to take on our more critical and creative outputs - image generation, movie generation, book writing, and even website building?

        • By tptacek 2025-06-0319:011 reply

          The problem with this argument is that it's not what's going to happen. In the trajectory I see of LLM code generation, security quality between best-practices well-prompted (ie: not creatively well prompted, just people with a decent set of Instructions.md or whatever) and well trained human coders is going to be a wash. Maybe in 5 years SOTA models will clearly exceed human coders on this, but my premise is all progress stops and we just stick with what we have today.

          But the analysis doesn't stop there, because after the raw quality wash, we have to consider things LLMs can do profoundly better than human coders can. Codebase instrumentation, static analysis, type system tuning, formal analysis: all things humans can do, spottily, on a good day but that empirically across most codebases they do not do. An LLM can just be told to spend an afternoon doing them.

          I'm a security professional before I am anything else (vulnerability research, software security consulting) and my take on LLM codegen is that they're likely to be a profound win for security.

          • By pxc 2025-06-0323:481 reply

            Isn't formal analysis exactly the kind of thing LLMs can't do at all? Or do you mean an LLM invoking a proof assistant or something like that?

            • By tptacek 2025-06-040:08

              Yes, I mean LLMs generating proof specs and invoking assistants, not that they themselves do any formal modeling.
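
              For a flavor of what that looks like: a toy Lean 4 sketch of the kind of spec an LLM might draft - the human reviews that the statement is the right invariant, and the proof itself is discharged by automation. (Illustrative only, not from any real codebase.)

                  -- Hypothetical agent-drafted spec: reversing a list preserves its length.
                  theorem reverse_preserves_length (xs : List Nat) :
                      xs.reverse.length = xs.length := by
                    simp  -- closed by the standard simp lemma List.length_reverse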

        • By epiccoleman 2025-06-0319:24

          > Why don't we have AI and robots working for humanity with the boring menial tasks - mowing laws, filing taxes, washing dishes, driving cars

          I mean, we do have automation for literally all of those things, to varying degrees of effectiveness.

          There's an increasing number of little "roomba" style mowers around my neighborhood. I file taxes every year with FreeTaxUSA and while it's still annoying, a lot of menial "form-filling" labor has been taken away from me there. My dishwasher does a better job cleaning my dishes than I would by hand. And though there's been a huge amount of hype-driven BS around 'self-driving', we've undeniably made advances in that direction over the last decade.

      • By ivape 2025-06-037:415 reply

        As soon as the world realized it didn't need a website and could just have a FB/Twitter page, a huge percentage of freelance web development gigs just vanished. We have to get real about what's about to happen. The app economy filled the gap, and the only optimistic case is that the AI app industry is what's going to fill the gap going forward. I just don't know about that. There's a certain end-game vibe I'm getting, because we're talking about self-building and self-healing software. More so, a person can ask the AI to role-play anything, even an app.

        • By tptacek 2025-06-037:43

          Sure. And before the invention of the spreadsheet, the world's most important programming language, individual spreadsheets were something a programmer had to build for a business.

        • By Earw0rm 2025-06-0316:021 reply

          Except that FB/Twitter are rotting platforms. I don't pretend that freelance web dev is a premium gig, but setting up Wordpress sites for local flower shops etc. shouldn't require a higher level of education/sophistication than e.g. making physical signs for the same shops.

          Technical? Yes. Hardcore expert premium technical, no. The people who want the service can pay someone with basic to moderate skills a few hundred bucks to spend a day working on it, and that's all good.

          Could I get an LLM to do much of the work? Yes, but I could also do much of the work without an LLM. Someone who doesn't understand the first principles of domains, Wordpress, hosting and so on, not so much.

          • By ivape 2025-06-040:47

            > Except that FB/Twitter are rotting platforms.

            They were not rotting platforms when they evaporated jobs at that particular moment, about 10-15 years ago. There's no universe anymore where people are making money making websites. A while ago, before Twitter/FB pages, one could easily collect multiple thousands of dollars per month just making websites on the side. There is a long history to web development.

            Also, the day of the website has been over for quite a while, so I don't even buy the claim that social media is a rotting platform.

        • By daveguy 2025-06-0312:40

          None of the LLM models are self-building, self-healing, or even self-thinking or self-teaching. They are static models (+RAG, but that's a bolt-on). Did you have a specific tech in mind?

        • By soraminazuki 2025-06-049:00

          > We have to get real about what's about to happen.

          Or maybe we shouldn't enthusiastically repeat the destruction of the open web in favor of billionaire-controlled platforms for surveillance and manipulation.

        • By rustcleaner 2025-06-041:12

          Start getting to be friends with some billionaire (or... shh... trillionaire) families, Elysium is coming!

      • By nonameiguess 2025-06-0315:072 reply

        It's kind of ironic to me that this is so often the example trotted out. Look at the BLS data sheet for job outlook: https://www.bls.gov/ooh/sales/travel-agents.htm#tab-6

        > Employment of travel agents is projected to grow 3 percent from 2023 to 2033, about as fast as the average for all occupations.

        The last year for which there is data claims 68,800 people employed as travel agents in the US. It's not a boom industry by any means, but it doesn't appear they experienced the apocalypse that Hacker News believes they did, either.

        I don't know how to easily find historical data, unfortunately. BLS publishes the excel sheets, but pulling out the specific category would have to be done manually as far as I can tell. There's this, I guess: https://www.travelagewest.com/Industry-Insight/Business-Feat...

        It appears at least that what happened is, though it may be easier than ever to plan your own travel, there are so many more people traveling these days than in the past that the demand for travel agents hasn't crashed.

      • By soraminazuki 2025-06-049:23

        Let's be real. Software engineers are skeptical right now not because they believe robots are better than them. Quite the opposite. The suits will replace software engineers with the technology despite its mediocrity.

        It was just 2 weeks ago that the utter incompetence of these robots was on full public display [1]. But none of that will matter to greedy corporate executives, who will prioritize short-term cost savings. They will hop from company to company, personally reaping the benefits while undermining essential systems that users and society rely on with robot slop. That's part of the reason why the C-suites are overhyping the technology. After all, no rich executive has faced consequences for behaving this way.

        It's not just software engineering jobs that will take a hit. Society as a whole will suffer from the greedy recklessness.

        [1]: https://news.ycombinator.com/item?id=44050152

    • By chinchilla2020 2025-06-0315:151 reply

      The reason I remain in the "skeptical" camp is that I am experiencing the same thing you are - I keep oscillating between being impressed and being disappointed.

      Ultimately the thing that impresses me is that LLMs have replaced google search. The thing that disappoints me is that their code is often convincing but wrong.

      Coming from a hard-engineering background, anything that is unreliable is categorized as bad. If you come from the move-fast-break-things world of tech, then your tolerance for mistakes is probably a lot higher.

      • By saltcured 2025-06-0315:403 reply

        This is a bit tangential, but isn't that partly because google search keeps evolving into a worse resource due to the SEO garbage race?

        • By didibus 2025-06-0322:11

          It is. AI lets you have an ad-free web browsing experience, which is a huge part of it as well.

        • By wussboy 2025-06-0316:29

          And are LLMs immune to that same SEO garbage race?

        • By pxc 2025-06-0317:59

          LLM-generated blogspam is also accelerating this process

    • By belter 2025-06-037:272 reply

      > What fascinates me is how negative these comments are — how many people seem closed off to the possibility that this could be a net positive for software engineers rather than some kind of doomsday.

      I tried the latest Claude for a very complex wrapper around the AWS Price APIs, which are not easy to work with. Deep in a 2,000-line code file, I found Claude faking some API returns by creating hard-coded values - a pattern I have seen professional developers get caught on while under pressure to deliver.

      This will be a boon for skilled human developers, who will be hired at $900 an hour to fix bugs of a subtlety never seen before.

      • By rollcat 2025-06-0319:15

        More or less this. Maybe a job opportunity, but many decision makers won't see the real problem until they get hit by that AWS bill. Ironic, though, if the business can't hire you because it already went out of business?

      • By DontchaKnowit 2025-06-0311:172 reply

        I mean, that bug doesn't seem very subtle.

    • By osigurdson 2025-06-0314:561 reply

      I have been using Windsurf for a few months and ChatGPT for a couple of years. I don't feel Windsurf is a massive game changer personally. It is good if you are very tired or working in a new area (also good for exploring UI ideas, as the feedback loop is tight), but still not a real game changer over ChatGPT. Waiting around for it to do its thing ("we've encountered an error - no credits used") is boring and flow-destroying. If you know exactly what you are doing, the productivity is probably 0.5x vs just typing the code in yourself. Sorry, I'm not going to bang around in Windsurf all day just to help with the training so that "v2" can be better. They should be paying me for this, realistically.

      Of course, in aggregate AI makes me capable in a far broader set of problem domains. It would be tough to live without it at this stage, but needs to be used for what it is actually good at, not what we hope it will be good at.

      • By ketzo 2025-06-0315:041 reply

        Have you tried Cursor or Zed? I find they’re both significantly better in their “agent” modes than Windsurf.

        • By osigurdson 2025-06-0315:06

          I used Cursor before Windsurf but I have not used Zed.

    • By osigurdson 2025-06-0312:11

      The arguments seem to come down to tooling. The article suggests that ChatGPT isn't a good way to interact with LLMs but I'm not so sure. If the greatest utility is "rubber ducking" and editing the code yourself is necessary then tools like Cursor go too far in a sense. In my own experience, Windsurf is good for true vibe coding where I just want to explore an idea and throw away the code. It is still annoying though as it takes so long to do things - ruining any kind of flow state you may have. I am conversing with ChatGPT directly much more often.

      I haven't tried Claude Code yet, however. Maybe that approach is more on point.

    • By eleveriven 2025-06-037:10

      Totally agree with "vibe debt". Letting an LLM off-leash without checks is a fast track to spaghetti. But with tests, clear prompts, and some light editing, I’ve shipped a lot of real stuff faster than I could have otherwise.

    • By throwawayffffas 2025-06-039:52

      I generally agree with the attitude of the original post as well. But I get stuck on one point. It definitely doesn't cost 20 dollars a month; cursor.ai might, and I don't know how good it is, but Claude Code costs hundreds of dollars a month - still cheaper than a junior dev, though.

    • By munificent 2025-06-0317:161 reply

      > Did Photoshop kill graphic artists? Did film kill theatre?

      To a first approximation, the answer to both of these is "yes".

      There is still a lot of graphic design work out there (though generative AI will be sucking the marrow out of it soon), but far less than there used to be before the desktop publishing revolution. And the kind of work changed. If "graphic design" to you meant sitting at a drafting table with pencil and paper, those jobs largely evaporated. If that was a kind of work that was rewarding and meaningful to you, that option was removed for you.

      Theatre even more so. Yes, there are still some theatres. But the number of people who get to work in theatrical acting, set design, costuming, etc. is a tiny tiny fraction of what it used to be. And those people are barely scraping together a living, and usually working side jobs just to pay their bills.

      > it feels a bit like mourning the loss of punch cards when terminals showed up.

      I think people deserve the right to mourn the loss of experiences that are meaningful and enjoyable to them, even if those experiences turn out to no longer be maximally economically efficient according to the Great Capitalistic Moral Code.

      Does it mean that we should preserve antiquated jobs and suffer the societal effects of inefficiency without bound? Probably not.

      But we should remember that the ultimate goal of the economic system is to enable people to live with meaning and dignity. Efficiency is a means to that end.

      • By pvg 2025-06-0318:062 reply

        > But the number of people who get to work in theatrical acting, set design, costuming

        I think this ends up being recency bias and terminology hairsplitting, in the end. The number of people working in theatre mask design went to nearly zero quite a while back but we still call the stuff in the centuries after that 'theatre' and 'acting'.

        • By munificent 2025-06-0318:402 reply

          I'm not trying to split hairs.

          I think "theatre" is a fairly well-defined term to refer to live performances of works that are not strictly musical. Gather up all of the professions necessary to put those productions on together.

          The number of opportunities for those professions today is much smaller than it was a hundred years ago before film ate the world.

          There are only so many audience members and a night they spend watching a film or watching TV or playing videogames is a night they don't spend going to a play. The result is much smaller audiences. And with fewer audiences, there are fewer plays.

          Maybe I should have been clearer that I'm not including film and video production here. Yes, there are definitely opportunities there, though acting for a camera is not at all the same experience as acting for a live audience.

          • By rightbyte 2025-06-0319:34

            > I think "theatre" is a fairly well-defined term to refer to live performances of works

            Doesn't it mean cinema too? edit: Even though it was clear from context you meant live theatre.

          • By pvg 2025-06-0318:511 reply

            Right, but modern theatre is pretty new itself. The number of people involved in performance for the enjoyment of others has spiked, err, dramatically. My point is that making this type of argument seems to invariably involve picking some narrow thing and elevating it to a true and valuable artform deserving special consideration and mourning. Does it have a non-special-pleading variety?

            • By munificent 2025-06-0322:431 reply

              Well, I didn't pick theatre and Photoshop as narrow things, the parent comment did.

              I'm saying an artform that is meaningful to its participants and allows them to make a living wage while enriching the lives of others should not be thoughtlessly discarded in service to the almighty god of economic efficiency. It's not special pleading, because I'd apply this to all artforms and all sorts of work that bring people dignity and joy.

              I'm not a reactionary luddite saying that we should still be using oil streetlamps so we don't put the lamplighters out of work. But at the same time I don't think we should automatically and carelessly accept the decimation of human meaning and dignity at the altar of shareholder value.

              • By pvg 2025-06-041:551 reply

                > I'm not a reactionary luddite saying that we should still be using oil streetlamps so we don't put the lamplighters out of work.

                No doubt. A few years ago there was some HN post with a video of the completely preposterous process of making diagrams for Crafting Interpreters. I didn't particularly need the book, nor do I have room for it, but I bought it there and then to support the spirit of all-consuming wankery. So I'm not here from Mitch & Murray & Dark Satanic Mills, Inc. either. At the same time, I'm not sold on the idea that niche art is the source of human dignity that needs societal protection - not because I'm some ogre, but because I'm not convinced that's how actual art actually arts or provides meaning or evolves.

                Like another Thomas put it

                Not for the proud man apart

                From the raging moon I write

                On these spindrift pages

                Nor for the towering dead

                With their nightingales and psalms

                But for the lovers, their arms

                Round the griefs of the ages,

                Who pay no praise or wages

                Nor heed my craft or art.

                • By munificent 2025-06-042:26

                  > the spirit of all-consuming wankery.

                  Haha, a good way to describe it. :)

                  > the idea niche art is the source of human dignity that needs societal protection

                  I mean... have you looked around at the world today? We've got to protect at least some sources of human dignity, because there seem to be fewer and fewer left.

        • By BobbyJo 2025-06-0320:081 reply

          Sitting in a moving car and sitting on a moving horse are both called "riding", but I think we can all appreciate how useless it is to equate the two.

          • By pvg 2025-06-0320:211 reply

            They aren't, broadly speaking, interesting forms of expression so the fact you can draw some trivial string match analogy doesn't seem worth much discussion.

            • By BobbyJo 2025-06-0418:35

              That was my point. The fact that we call both people wearing CGI suits hopping around a green room and people on stage playing a character for a crowd "acting" doesn't account for the fact that being able to do one doesn't mean you can do the other.

    • By throw310822 2025-06-037:125 reply

      > Did Photoshop kill graphic artists?

      No, but AI did.

      • By rerdavies 2025-06-039:382 reply

        In actual fact, Photoshop did kill graphic arts. There was an entire industry filled with people who had highly developed skillsets that suddenly became obsolete. Painters, for example. Before Photoshop, I had to go out of house to get artwork done; now I just do it myself.

        • By hmcq6 2025-06-0310:05

          No, it didn’t.

          It changed the skill set but it didn’t “kill the graphic arts”

          Rotoscoping in Photoshop is rotoscoping. Superimposing an image on another in Photoshop is the same as with film; it's just faster and cheaper to try again. Digital painting is painting.

          AI doesn’t require an artist to make “art”. It doesn’t require skill. It’s different than other tools

        • By hiddenfinance 2025-06-0310:041 reply

          Even worse!!! What is considered artwork nowadays is whatever can be made in some vector-based program. This also really stifles creativity, pigeonholing what is considered creative or artistic into something that can be used for machine learning.

          Whatever can be replaced by AI will be, because it is easier for business people to deal with than real people.

          • By hmcq6 2025-06-0310:48

            Most of the vector art I see is minimalism. I can’t see this as anything but an argument that minimalism “stifles creativity”

            > vector art pigeonholes art into something that can be used for machine learning

            Look around, AI companies are doing just fine with raster art.

            The only thing we agree on is that this will hurt workers

      • By tptacek 2025-06-037:195 reply

        This, as the article makes clear, is a concern I am alert and receptive to. Ban production of anything visual from an LLM; I'll vote for it. Just make sure they can still generate Mermaid charts and Graphviz diagrams, so they still apply to developers.

        • By hatefulmoron 2025-06-037:252 reply

          What is unique about graphic design that warrants such extraordinary care? Should we just ban technology that approaches "replacement" territory? What about the people, real or imagined, that earn a living making Graphviz diagrams?

          • By omnimus 2025-06-038:494 reply

            It’s more question of how it does what it does. By making statistical model out of work of humans that it now aims to replace.

            I think graphic designers would be a lot less angry if AIs were trained on licensed work… that's how the system worked up until now, after all.

            • By fennecfoxy 2025-06-039:211 reply

              I don't think most artists would be any less angry or scared if AI were trained on licensed work. The rhetoric would just shift from mostly "they're breaching copyright!" to more of the "machine art is soulless and lacks true human creativity!" line.

              I have a lot of artist friends but I still appreciate that diffusion models are (and will be with further refinement) incredibly useful tools.

              What we're seeing is just the commoditisation of an industry, in the same way we've seen many, many times before through the industrial era, etc.

              • By omnimus 2025-06-0313:541 reply

                It actually doesn't matter how they would feel. In the currently accepted copyright framework, if the works were licensed they couldn't do much about it. But right now they can be upset, because suddenly the new normal is massive copyright violation. It's very clear that without the massive amount of unlicensed work, the LLMs simply wouldn't work well. The AI industry is just trying to run with it, hoping nobody will notice.

                • By Amezarak 2025-06-0316:142 reply

                  It isn’t clear at all that there’s any infringement going on at all, except in cases where AI output reproduces copyrighted content or content that is sufficiently close to copyrighted content to constitute a derivative work. For example, if you told an LLM to write a Harry Potter fanfic, that would be infringement - fanfics are actually infringing derivative works that usually get a pass because nobody wants to sue their fanbase.

                  It’s very unlikely simply training an LLM on “unlicensed” work constitutes infringement. It could possibly be that the model itself, when published, would represent a derivative work, but it’s unlikely that most output would be unless specifically prompted to be.

                  • By omnimus 2025-06-0321:021 reply

                    I am not sure why you would think so. AFAIK we will see more of what courts think later in 2025, but judging from what was ruled in Delaware in February... it is actually very likely that LLMs' use of material is not "fair use", because besides how transformed the work is, one important part of "fair use" is that the output does not compete with the initial work. LLMs not only compete... they are specifically sold as a replacement for the work they have been trained on.

                    This is why the whole lobby now pushes governments not to allow any regulation of AI, even if courts disagree.

                    IMHO what will happen anyway is that at some point the companies will "solve" the licensing by training models purely on older synthetic LLM output that will be "public research" (which of course will carry the "human" weights, but they will claim it doesn't matter).

                    • By Amezarak 2025-06-041:391 reply

                      What you are describing is the output of the LLM, not the model. Can you link to the case where a model itself was determined to be infringing?

                      It’s important that copyright applies to copying/publishing/distributing - you can do whatever you to copyrighted works by yourself.

                      • By omnimus 2025-06-0412:141 reply

                        I don't follow. The artists are obviously complaining about the output that LLMs create. If you create an LLM and don't use it, then yeah, nobody would have a problem with it, because nobody would know about it…

                        • By Amezarak 2025-06-0414:181 reply

                          In that case, public services can continue to try to fine tune outputs to not generate anything infringing. They can train on any material they want.

                          Of course, that still won’t make artists happy, because they think things like styles can be copyrighted, which isn’t true.

                          • By omnimus 2025-06-0419:391 reply

                            Any LLM output created with unlicensed sources is tainted. It doesn't matter if the output does not look like anything in the dataset; if you take out the unlicensed sources, you simply won't get the same result. And since the results directly compete with the source, it's not "fair use".

                            If we believe that authors should be able to decide how their work is used, then they can for sure say no to machine learning. If we don't believe in intellectual property, then anything is up for grabs. I am OK with it, but the corps are not.

                            • By Amezarak 2025-06-053:14

                              That’s not how copyright law works, but it might be how it should work.

                  • By nogridbag 2025-06-0317:171 reply

                    I'm interpreting what you described as a derivative work to be something like:

                    "Create a video of a girl running through a field in the style of Studio Ghibli."

                    There, someone has specifically prompted the AI to create something visually similar to X.

                    But would you still consider it a derivative work if you replaced the words "Studio Ghibli" with a few sentences describing their style that ultimately produces the same output?

                    • By Amezarak 2025-06-041:41

                      Derivative work is a legal term. Art styles cannot be copyrighted.

            • By hatefulmoron 2025-06-038:581 reply

              I get where you're coming from, but given that LLMs are trained on every available written word regardless of license, there's no meaningful distinction. Companies training LLMs for programming and writing show the same disregard for copyright as they do for graphic design. Therefore, graphic designers aren't owed special consideration that the author is unwilling to extend to anybody else.

              • By omnimus 2025-06-0313:47

                Of course I think the same about text, code, sound, or any other LLM output. The author is wrong if they are unwilling to apply the same measure to everything. The fact that this is the new normal now for everything does not make it right.

            • By samcat116 2025-06-0314:191 reply

              FWIW Adobe makes a lot of noise about how their specific models were indeed trained on only licensed work. Not sure if that really matters however

              • By omnimus 2025-06-0419:41

                Yes, Adobe and Shutterstock/Getty might be in a position to do this.

                But there is a reason why nobody cares about Adobe AI and everybody uses Midjourney…

            • By ff317 2025-06-0413:11

              I like this argument, but it does somewhat apply to software development as well! The only real difference is that the bulk of the "licensed work" the LLMs are consuming to learn to generate code happened to use some open source license that didn't specifically exclude use of the code as training data for an AI.

              For some of the free-er licenses this might mostly be just a lack-of-attribution issue, but in the case of some stronger licenses like GPL/AGPL, I'd argue that training a commercial AI codegen tool (which is then used to generate commercial closed-source code) on licensed code is against the spirit of the license, even if it's not against the letter of the license (probably mostly because the license authors didn't predict this future we live in).

          • By tptacek 2025-06-037:271 reply

            The article discusses this.

            • By hatefulmoron 2025-06-037:322 reply

              Does it? It admits at the top that art is special for no given reason, then it claims that programmers don't care about copyright and deserve what's coming to them, or something...

              "Artificial intelligence is profoundly — and probably unfairly — threatening to visual artists"

              This feels asserted without any real evidence

              • By tptacek 2025-06-037:357 reply

                LLMs immediately and completely displace the bread-and-butter, replacement-tier illustration and design work that makes up much of that profession, and they do so by effectively counterfeiting creative expression. A coding agent writes a SQL join or a tree traversal. The two things are not the same.

                Far more importantly, though, artists haven't spent the last quarter century working to eliminate protections for IPR. Software developers have.

                Finally, though I'm not stuck on this: I simply don't agree with the case being made for LLMs violating IPR.

                I have had the pleasure, many times over the last 16 years, of expressing my discomfort with nerd piracy culture and the coercive might-makes-right arguments underpinning it. I know how the argument goes over here (like a lead balloon). You can agree with me or disagree. But I've earned my bona fides here. The search bar will avail.

                • By fennecbutt 2025-06-039:32

                  >bread-and-butter replacement-tier

                  How is creative expression required for such things?

                  Also, I believe that we're just monkey meat bags and not magical beings, so the whole human creativity thing can easily be reproduced with enough data plus a sprinkle of randomness. This is why you see trends in supposedly thought-provoking art across many artists.

                  Artists draw from imagination, which is drawn from lived experience, and most humans have roughly the same lives on average; cultural/country barriers probably produce more of a difference.

                  Many of the flourishes any artist may use in their work are also likely used by many other artists.

                  If I commission "draw a mad scientist, use creative license" from several human artists, I'm telling you now that they'll all mostly look the same.

                • By thanksgiving 2025-06-038:091 reply

                  > Far more importantly, though, artists haven't spent the last quarter century working to eliminate protections for IPR. Software developers have.

                  I think the case we are making is that there is no such thing as intellectual property to begin with, and that the whole thing is a scam created by duct-taping a bunch of different concepts together when they should not be grouped together at all.

                  https://www.gnu.org/philosophy/not-ipr.en.html

                  • By rfrey 2025-06-0312:501 reply

                    That's exactly the point: it's hard to see how someone could hold that view and still pillory AI companies for slurping up proprietary code.

                    You probably don't have those views. But I think Thomas' point is that the profession as a whole has been crying "information wants to be free" for so many years, when what they meant was "information I don't want to pay for wants to be free", and the hostile response to AI training on private data underlines that.

                    • By b3morales 2025-06-0820:50

                      Because it's rules for us and not for them. If I take Microsoft's code and "transform" it I get sued. If Microsoft takes everyone else's code and "transforms" it (and sells it back to us) well, that's just business, pal. Thomas's argument is completely missing this point.

                      EDIT to add, I said this more completely a while ago: https://news.ycombinator.com/item?id=34381996

                • By Jensson 2025-06-0313:42

                  > LLMs immediately and completely displace the bread-and-butter replacement-tier illustration and design work that makes up much of that profession, and does so by effectively counterfeiting creative expression. A coding agent writes a SQL join or a tree traversal. The two things are not the same.

                  In what way are these two not the same? It isn't like icons or UI panels are more original than the code that runs the app.

                  Or are you saying only artists are creating things of value and it is fine to steal all the work of programmers?

                • By oompty 2025-06-0310:47

                  What about ones trained on fully licensed art, like Adobe Firefly (based on their own stock library) or F-Lite by Freepik & Fal (also claimed to be copyright safe)?

                • By hatefulmoron 2025-06-037:431 reply

                  > LLMs immediately and completely displace the bread-and-butter replacement-tier illustration and design work that makes up much of that profession

                  And so what? Tell it to the Graphviz diagram creators, entry level Javascript programmers, horse carriage drivers, etc. What's special?

                  > ... and does so by effectively counterfeiting creative expression

                  What does this actually mean, though? ChatGPT isn't claiming to have "creative expression" in this sense. Everybody knows that it's generating an image using mathematics executed on a GPU. It's creating images. Like an LLM creates text. It creates artwork in the same sense that it creates novels.

                  > Far more importantly, though, artists haven't spent the last quarter century working to eliminate protections for IPR. Software developers have.

                  Programmers are very particular about licenses, in opposition to your theory. Copyleft licensing leans heavily on enforcing copyright. Besides, I hear artists complain about the duration of copyright frequently. Pointing to some subset of programmers who are against IPR is just nutpicking in any case.

                  • By tptacek 2025-06-037:491 reply

                    Oh, for sure. Programmers are very particular about licenses. For code.

                    • By hatefulmoron 2025-06-038:092 reply

                      I get it, you have an axe to grind against some subset of programmers who are "nerds" in a "piracy culture". Artists don't deserve special protections. It sucks for your family members, I really mean that, but they will have to adapt with everybody else.

                      • By mwcampbell 2025-06-038:252 reply

                        I disagree with you on this. Artists, writers, and programmers deserve equal protection, and this means that tptacek is right to criticize nerd piracy culture. In other words, we programmers should respect artists and writers too.

                        • By hatefulmoron 2025-06-038:42

                          To be clear, we're not in disagreement. We should all respect each other. However, it's pretty clear that the cat's out of the bag, and trying to claw back protections for only one group of people is stupid. It really betrays the author's own biases.

                        • By victorbjorklund 2025-06-0312:05

                          Doubt it is the same people. I doubt anyone argues that paintings deserve no protection while code does.

                      • By tptacek 2025-06-0318:571 reply

                        I do have an axe to grind, and that part of the post is axe-grindy (though it sincerely informs how I think about LLMs). I knew that going into it (unanimous feedback from reviewers!), and I own it.

                        • By marcusb 2025-06-0323:391 reply

                          I generally agree with your post. Many of the arguments against LLMs being thrown around are unserious, unsound, and a made-for-social-media circle jerk that don't survive any serious adversarial scrutiny.

                          That said, this particular argument you are advancing isn't getting so much heat here because of an unfriendly audience that just doesn't want to hear what you have to say, or that is defensive because of hypocrisy and past copyright transgressions. It is being torn apart because the argument that artists deserve protection but software engineers don't is unsound special pleading of the kind you criticize in your post.

                          Firstly, the idea that programmers are uniquely hypocritical about IPR is hyperbole unsupported by any evidence you've offered. It is little more than a vibe. As I recall, when Photoshop was sold with a perpetual license, it was widely pirated. By artists.

                          Secondly, the idea, which you dance around but don't state outright, that programmers should be singled out for punishment since "we" put others out of work is absurd and naive. "We" didn't do that. It isn't the capital owners over at Travelocity who are going to pay the price for LLM displacement of software engineers; it is the junior engineer making $140k/year with a mortgage.

                          Thirdly, if you don't buy into LLM usage as violating IPR, then what exactly is your argument against LLM use for the arts? Just a policy edict that thou shalt not use LLMs to create images because it puts some working artists out of business? Is there a threshold of job destruction that has to occur for you to think we should ban LLM use, case by case? Are there any other outlaws or scarlet-letter-bearers, in addition to programmers, who will never receive any policy protection in this area because of real or perceived past transgressions?

                          • By tptacek 2025-06-040:061 reply

                            Adobe is one of the most successful corporations in the history of commerce; the piracy technologists enabled wrecked most media industries.

                            Again, the argument I'm making regarding artists is that LLMs are counterfeiting human art. I don't accept the premise that structurally identical solutions in software counterfeit their originals.

                            • By marcusb 2025-06-0423:151 reply

                              > Adobe is one of the most successful corporations in the history of commerce; the piracy technologists enabled wrecked most media industries.

                              I guess that makes it OK, then, for artists to pirate Adobe's product. Also, I live in a music industry hub, Nashville; you'll have to forgive me if I don't take the RIAA at their word that the music industry is in shambles, what with my lying eyes and all.

                              > Again, the argument I'm making regarding artists is that LLMs are counterfeiting human art. I don't accept the premise that structurally identical solutions in software counterfeit their originals.

                              I'm aware of the argument you are making. I imagine most of the people here understand the argument you are making. It's just a really asinine argument, propped up by all manner of special pleading (but art is different; programmers are all naughty pirates who deserve to be punished) and appeals to authority (check my post history: I've established my bona fides).

                              There simply is no serious argument to be made that LLMs reproducing one work product and displacing labor is better or worse than LLMs reproducing a different work product and displacing labor. Nobody is going to display some ad graphic from the local botanical garden's flyer for their spring gala at The Met; that's what is getting displaced by LLMs. Banksy isn't being put out of business by Stable Diffusion. The person making the ad for the botanical garden's flyer has market value because they know how to draw things that people like to see in ads. A programmer has value because they know how to write software that a business is willing to pay for. It is as elitist as it is incoherent to say that one person's work product deserves to be protected but another person's does not because of "creativity."

                              Your argument holds no more water and deserves to be taken no more seriously than some knucklehead on Mastodon or Bluesky harping about how LLMs are going to cause global warming to triple and that no output LLMs produce has any value.

                              • By tptacek 2025-06-057:391 reply

                                Well, I disagree with you. For the nth time, though, I also don't grant the premise that LLMs are violative of the IPR of programmers. But more importantly than anything else, I just don't want to hear any of this from developers. That's not "your arguments are wrong and I have refuted them". It's "I'm not going to hear them from you".

                                • By marcusb 2025-06-0519:53

                                  > For the nth time, though, I also don't grant the premise that LLMs are violative of the IPR of programmers.

                                  I wish you all the best waiting for a future where the legislature and courts decide that LLM output is violative of copyright law only in the visual arts.

                                  > I just don't want to hear any of this from developers.

                                  Well, you seem to have posted about the wrong topic in the wrong forum then. But you’ve heard what you’ve wanted to hear in the discussion related to this post, so maybe that doesn’t really matter.

                • By ivape 2025-06-038:381 reply

                  > counterfeiting creative expression

                  This is the only piece of human work left in the long run, and that's providing training data on taste. Once we hook up A/B testing on AI creative outputs, the LLM will know how to be creative and not just duplicative. The AI will never have innate taste, but we can feed it taste.

                  We can also starve it of taste, but that’s impossible because humans can’t stop providing data. In other words, never tell the LLM what looks good and it will never know. A human in the most isolated part of the world can discern what creation is beautiful and what is not.

                  • By fennecbutt 2025-06-039:341 reply

                    Everything is derivative, even all human work. I don't think "creativity" is that hard to replicate; for humans, it's about lived experience. For a model, it would need the data that impacts its decisions. At the moment, models are trained for a neutral, overall result.

                    • By hmcq6 2025-06-0310:39

                      Your premise is an axiom that I don’t think most would accept.

                      Is The Matrix a ripoff of The Truman Show? Is Oldboy derivative of Oedipus?

                      Saying everything is derivative is reductive.

                • By GoblinSlayer 2025-06-0314:32

                  Modern flat graphic style requires basically zero skill; I drew one myself even though I'm absolutely incompetent at proper drawing.

              • By palmfacehn 2025-06-038:42

                >This feels asserted without any real evidence

                Things like this are expressions of preference. The discussion will typically devolve into restatements of the original preference and appeals to special circumstances.

        • By speleding 2025-06-039:26

          Hasn't that ship sailed? How would any type of ban work when the user can just redirect the banned query to a model in a different jurisdiction, for example DeepSeek? I don't think this genie is going back into the bottle; we're going to have to learn to live with it.

        • By victorbjorklund 2025-06-0312:04

          Why not the same for texts? Why is shitty visual art worth more than the best texts from beloved authors? And what about cooking robots? Should we not protect the culinary arts?

        • By throw310822 2025-06-039:08

          > Ban production of anything visual from an LLM

          That's a bit beside the point, which is that AI will not be just another tool, it will take ALL the jobs, one after another.

          I do agree it's absolutely great, though, and being against it is dumb, unless you want to actually ban it, which is impossible.

        • By GoblinSlayer 2025-06-0314:43

          On the other hand, it can revive dead artists. How about AI-generated content going GPL 100 days after release?

      • By Hoasi 2025-06-038:47

        Well, this is only partially true. My optimistic take is that it will redefine the field. There is still a future for resourceful, attentive, and prepared graphic artists.

      • By ttyyzz 2025-06-037:302 reply

        AI didn't kill creativity or intuition. Rather, it lacks those things completely. Artists can make use of AI, but they can't make themselves obsolete just yet.

        • By rvnx 2025-06-039:184 reply

          With AI anyone can be an artist, and this is a good thing.

          • By Sohcahtoa82 2025-06-0316:29

            Prompting Midjourney or ChatGPT to make an image does not make you an artist.

          • By python-b5 2025-06-0316:021 reply

            Using AI makes you an artist about as much as commissioning someone else to make art for you does. Sure, you provided the description of what needed to be done, and likely gave some input along the way, but the real work was done by someone else. There are faster iteration times with AI, but you are still not the one making the art. That is what differentiates generative models from other kinds of tools.

            • By iszomer 2025-06-0612:36

              Imagine when the commissioned artist uses AI themselves; that goes deep down the rabbit hole of who gets the spread on potential attribution of said "work".

          • By hmcq6 2025-06-0310:591 reply

            AI can't make anyone a painter. It can generate a digital painting for you, but it can't give you the skills to transfer an image from your mind into the real world.

            AI currently can't reliably make 3D objects, so AI can't make you a sculptor.

            • By Flemlo 2025-06-0313:30

              We now have wall printers based on UV paint.

              3D models can be generated quite well already. Good enough for a sculpture.

          • By emptyfile 2025-06-0316:38

            [dead]

        • By throw310822 2025-06-039:121 reply

          > AI didn't kill creativity or intuition. Rather, it lacks those things completely

          Quite the opposite: I'd say that's what it has most. What are "hallucinations" if not a display of immense creativity and intuition? "Here, I'll make up this API call that I haven't read about anywhere but sounds right."

          • By ttyyzz 2025-06-0310:262 reply

            I disagree. AI is good at pattern recognition but still struggles to grasp causal relationships. These made-up API calls are just a pattern in the large data set. Don't confuse that with creativity.

            • By throw310822 2025-06-0310:33

              I would definitely confuse that with "intuition", which I would describe as seeing and using weak, unstated relationships, a.k.a. patterns. That's my intuition, at least.

              As to creativity, that's something I know too little about to define, but it seems reasonable that it's even more "fuzzy" than intuition. By contrast, causal relationships are closer to hard logic, which is what LLMs struggle with, as humans do too.

            • By MrScruff 2025-06-0312:28

              A lot of art is about pattern recognition. We represent or infer objects or ideas through some indirection or abstraction. The viewer or listener's brain (depending on their level of sophistication) fills in the gaps, and the greater the level of indirection (or complexity of pattern recognition required) the greater the emotional payoff. This also applies to humour.

      • By ZaoLahma 2025-06-039:282 reply

        It will not.

        I'm an engineer through and through. I can ask an LLM to generate images just fine, but for a given target audience, for a certain purpose? I would have no clue. None whatsoever. Ask me to generate an image to use in an advertisement for Nuka Cola, targeting tired parents? I genuinely have no idea where to even start. I have absolutely no understanding of the advertising domain, and I don't know what tired parents find visually pleasing, or what they would "vibe" with.

        My feeble attempts would be absolute trash compared to a professional artist who uses AI to express their vision. The artist would be able to prompt so much more effectively and correct the things that they know from experience will not work.

        It's the exact same as coding with an AI: it will be trash unless you understand the hows and the whys.

        • By throw310822 2025-06-039:42

          > Ask me to generate an image to use in advertisement for Nuka Cola, targeting tired parents? I genuinely have no idea of where to even start.

          I believe you. Did you try asking ChatGPT or Claude, though?

          You can ask them for a list of highest-level themes and requirements and refine further from there.

        • By fennecbutt 2025-06-039:36

          Have you seen modern advertisements, lmao? Most of the time the ad has nothing to do with the actual product; it's an absolute shitshow.

          Although I've seen a few American TV ads before, and that shit's basically radioactively coloured, same as your fizzy drinks.

    • By didibus 2025-06-0322:093 reply

      I agree with the potential of AI. I use it daily for coding and other tasks. However, there are two fundamental issues that make this different from the Photoshop comparison.

      The models are trained primarily on copyrighted material and code written by the very professionals who now must "upskill" to remain relevant. This raises complex questions about compensation and ownership that didn't exist with traditional tools. Even if current laws permit it, the ethical implications are different from Photoshop-like tools.

      Previous innovations created new mediums and opportunities. Photoshop didn't replace artists; it enabled new art forms. Film reduced theater jobs but created an entirely new industry where skills could mostly transfer. Manufacturing automation made products like cars accessible to everyone.

      AI is fundamentally different. It's designed to produce identical output to human workers, just more cheaply and/or faster. Instead of creating new possibilities, it's primarily focused on substitution. Say AI could eliminate 20% of coding jobs and reduce wages by 30%:

          * Unlike previous innovations, this won't make software more accessible
          * Software already scales essentially for free (build once, used by many)
          * Most consumer software is already free (ad-supported)
      
      The primary outcome appears to be increased profit margins rather than societal advancement. While previous technological revolutions created new industries and democratized access, AI seems focused on optimizing existing processes without providing comparable societal benefits.

      This isn't an argument against progress, but we should be clear-eyed about how this transition differs from historical parallels, and why it might not repeat the same historical outcomes. I'm not claiming it won't, only that there are some significant differences that justify skepticism that the same creation of new jobs, or improvement to human lifestyle and capabilities, will emerge as with, say, film or Photoshop.

      AI can also be used to achieve things we could not do without it; that's the good use of AI: things like cancer detection, self-driving cars, and so on. I'm speaking specifically of the use of AI to automate and reduce the cost and turnaround time of white-collar work like software development.

      • By throw234234234 2025-06-0323:41

        For me, this is the "issue" I have with AI. Unlike, say, the internet, mobile, and other tech revolutions, where I could see new use cases or optimisations of existing ones spring up all the time (new apps, new ways of interacting, more efficiency than physical systems, etc.), AI seems focused more on efficiency and substitution of labour than on pushing the frontier of "quality of life". Maybe this will change, but the buzz is around job replacement at the moment.

        It's why it is impacting so many people while making very small changes to everyday "quality of life" metrics (e.g. the ability to eat, communicate, live somewhere, etc.). It is arguably more about enabling greater inequality and the gatekeeping of wealth to capital, in a future world where intelligence and merit matter less. For most people it's hard to see where the positives are for them long term in this story; most everyday folks don't believe the utopia story is in any way probable.

      • By nmgycombinator 2025-06-065:52

        > The primary outcome appears to be increased profit margins rather than societal advancement. While previous technological revolutions created new industries and democratized access, AI seems focused on optimizing existing processes without providing comparable societal benefits.

        This is the thing that worries me the most about AI.

        The author's ramblings dovetail with this a bit in their "but the craft" section. They vaguely attack the idea of code-golfing and coding for the craft as essentially incompatible with the corporate model of programming work. And perhaps they're right. If they are, though, this AI wave/hype being mostly about process-streamlining seems to be a distillation of that fact.

      • By GoblinSlayer 2025-06-047:56

        Maybe it's like automation that makes webdev accessible to anyone: you take a week-long AI coaching course, talk to an AI, let it throw together a website in an hour, then self-host it.

    • By whazor 2025-06-038:072 reply

      The key is that manual coding for a normal task takes one or two weeks, whereas if you configure all your prompts/agents correctly you could do it in a couple of hours. As you highlighted, it brings many new issues (code quality, lack of tests, tech debt), and you need to carefully create prompts and review the code to tackle those. But in the end, you can save significant time.

      • By mdavid626 2025-06-0310:002 reply

        I disagree. I think this notion comes from the idea that creating software is about coding: automate or improve coding, and you have software at the end.

        This might be how one looks at it in the beginning, with no experience or idea about coding. With time, one realizes it's more about creating the correct mental model of the problem at hand than about the activity of coding itself.

        Once this is realized, AI can't "save" you days of work, as coding is the least time-consuming part of creating software.

        • By rerdavies 2025-06-0310:142 reply

          The actual most time-consuming part of creating software (I think) is reading documentation for the APIs and libraries you're using. Probably the biggest productivity boost I get from my coding assistant is attributable to that.

          e.g. MUI, TypeScript:

             // make the checkbox label appear before the checkbox.
          
          Tab. Done. Delete the comment.

          vs. about 2 minutes wading through the perfectly excellent but very verbose online documentation to find that I need to set the "labelPlacement" attribute to "start".
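
          (For the curious, the change it lands on is a one-prop edit, along the lines of this sketch; labelPlacement is the real MUI prop, while the surrounding component is purely illustrative.)

              import { Checkbox, FormControlLabel } from "@mui/material";

              // Checkbox whose label renders before (to the left of) the control.
              export function NotifyToggle() {
                return (
                  <FormControlLabel
                    control={<Checkbox />}
                    label="Email notifications"
                    labelPlacement="start"
                  />
                );
              }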

          Or the tedious minutiae that I am perfectly capable of handling myself, but that are time-consuming and error-prone:

              // execute a SQL update
          
          Tab tab tab tab ... Done, with all bindings and fields filled in, based on the structure that's passed as a parameter to the method and the tables and field names that were created in source code above the current line. (Love that one.)
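
          (The completion tends to come out shaped like this sketch; better-sqlite3 and the users table are assumptions here, since the original schema isn't shown.)

              import Database from "better-sqlite3";

              // Hypothetical record type matching a `users` table created earlier.
              interface User { id: number; name: string; email: string; }

              const db = new Database("app.db");

              // execute a SQL update
              function updateUser(user: User): void {
                db.prepare("UPDATE users SET name = ?, email = ? WHERE id = ?")
                  .run(user.name, user.email, user.id);
              }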

          • By ACS_Solver 2025-06-0312:54

            Yes, I currently lean skeptical, but agentic LLMs excel at this sort of task. I had a great use for one just yesterday.

            I have an older MediaWiki install that's been overrun by spam. It's on a server I have root access on. With Claude, I was able to rapidly get some Python scripts that work against the wiki database directly and can clean spam in various ways: by article ID, by title regex, by certain other patterns. Then I wanted to delete all spam users, defined here as users registered after a certain date whose only edit is to their own user page, and Claude made a script for that very quickly. It even deployed it with scp when I told it where to.

            Looking at the SQL that ended up in the code, there are non-obvious things, such as user pages being pages where page_namespace = 2. The query involves the user, page, actor, and revision tables. I checked afterwards: MediaWiki has good documentation for its database tables. Sure, I could have written the SQL myself based on that documentation, but certainly not have the query wrapped in Python and ready to run in under a minute.
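
            (For reference, the query was shaped roughly like the sketch below, reconstructed from the description above; the column names are my assumption of the modern MediaWiki schema, so verify them before running anything like it.)

                // Spam users: registered after a cutoff, and every revision
                // they made is to their own user page (page_namespace = 2).
                const spamUserQuery = `
                  SELECT u.user_id, u.user_name
                  FROM user u
                  JOIN actor a    ON a.actor_user = u.user_id
                  JOIN revision r ON r.rev_actor  = a.actor_id
                  JOIN page p     ON p.page_id    = r.rev_page
                  WHERE u.user_registration > '20240101000000'
                  GROUP BY u.user_id, u.user_name
                  HAVING SUM(p.page_namespace = 2
                         AND p.page_title = REPLACE(u.user_name, ' ', '_'))
                         = COUNT(*)
                `;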

          • By andybp85 2025-06-0311:302 reply

            What are you using for this? One thing I can't wrap my head around is how anyone's idea of fun is poking at an LLM until it generates something possibly passable and then figuring out what the hell it did and why, but this sounds like something I'd actually use.

            • By rerdavies 2025-06-0317:31

              Yes. VSCode/Copilot/Claude Sonnet 4. The choice of AI may make a significant difference. It used to: the GPT AIs, particularly, were useless. I haven't tried GPT-4.1 yet.

            • By calvinmorrison 2025-06-0311:581 reply

              vscode?

              • By andybp85 2025-06-0312:201 reply

                vscode comes with that out of the box?

                • By calvinmorrison 2025-06-0312:421 reply

                  Pretty much; the plugin is called Copilot. Click a button and install it.

                  • By andybp85 2025-06-0313:031 reply

                    That's a whole different piece of software that it doesn't come with out of the box lol.

                    Copilot was what I was looking for, thank you. I have it installed in WebStorm already, but I haven't messed with this side of it.

                    • By calvinmorrison 2025-06-0313:501 reply

                      idk, I click "Copilot" and it adds it. It took maybe 2 minutes.

                      • By rerdavies 2025-06-0316:05

                        And if you don't add it, VSCode nags you incessantly until you do. :-P

        • By 01100011 2025-06-0317:32

          I think for some folks their job really is just about coding. For me that was rarely true. I've written very little code in my career. Mostly design work, analyzing broken code, making targeted fixes...

      I think these days coding is 20% of my job, maybe less. But HN is a diverse audience. You have the full range, from web programmers and data scientists all the way to systems engineers and people writing for bare metal. Someone cranking out one-off Python and JavaScript is going to have a different opinion on AI coding vs. a C/C++ systems engineer, and they're going to yell at each other in comments until they realize they don't have the same job, the same goals, or the same experiences.

      • By drited 2025-06-038:111 reply

        Would you have any standard prompts you could share which ask it to make a draft with what you'd want (e.g. unit tests, etc.)?

        • By rerdavies 2025-06-0310:021 reply

              C++, Linux: write an audio processing loop for ALSA
              reading audio input, processing it, and then outputting
              audio on ALSA devices. Include code to open and close
              the ALSA devices. Wrap the code up in a class. Use
              CamelCase naming for C++ methods.
              Skip the explanations.

          Run it through Grok: https://grok.com/

          When I ACTUALLY wrote that code the first time, it took me about two weeks to get it right. (Horrifying documentation set, with inadequate sample code.)

          Typically, I'll edit code like this from top to bottom to get it to conform to my preferred coding idioms. And I will, of course, submit the code to the same sort of review that I would give my own first-cut code. And the way initialization parameters are passed in needs work (a follow-on prompt would probably fix that). This is not a fire-and-forget sort of activity. Hard to say whether that code is right or not; but even if it's not, it would have saved me at least 12 days of effort.

          Why did I choose that prompt? Because I have learned through use that AIs do well with these sorts of coding tasks. I'm still learning, and making new discoveries every day. Today's discovery: it is SO easy to implement a SQLite database in C++ using an AI when you go at it the right way!

          • By skydhash 2025-06-0316:101 reply

            That relies heavily on your mental model of ALSA to write a prompt like that. For example, I believe macOS's audio stack is node-based, like PipeWire. For someone who is knowledgeable about the domain, it's easy enough to get some base output to review and iterate upon, especially if there was enough training data or you constrain the output with the context. So there's no actual time saving, because you have to take into account the time you spent learning about the domain.

            That is why some people don't find AI that essential: if you have the knowledge, you already know how to find the specific part of the documentation to refresh your memory of the semantics, and the time saved is minuscule.

            • By rerdavies 2025-06-0316:281 reply

              Fer goodness sake. Eyeroll.

                 Write an audio processing loop for pipewire. Wrap the code up in a 
                 C++ class. Read audio data, process it and output through an output 
                 port. Skip the explanations. Use CamelCase names for methods.
                 Bundle all the configuration options up into a single
                 structure.
              
              Run it through Grok. I'd actually use VSCode Copilot with Claude Sonnet 4; Grok is used here so that people who do not have access to a coding AI can see what they would get if they did.

              I'd use that code as a starting point despite having zero knowledge of pipewire. And probably fill in other bits using AI as the need arises. "Read the audio data, process it, output it" is hardly deep domain knowledge.

    • By hiddenfinance 2025-06-0310:07

      The question is: can I self-host this "mech suit"? If not, I would much rather not use some API hosted by another party.

      SaaS just seems very much like a terminator-seed situation in the end.

    • By dogcomplex 2025-06-037:22

      "Mech suit" is apt. Gonna use that now.

      Having plenty of initial discussion and distilling that into requirements documents aimed at modularized components which can all be easily tackled separately is key.

    • By Jordanpomeroy 2025-06-0317:35

      This is my experience as well.

      I'd add that Excel didn't kill the engineering field. It made engineers more effective, and maybe companies will need fewer of them. But it also means more startups and smaller shops can make use of an engineer. The change is hard, and an equilibrium will be reached.

    • By giancarlostoro 2025-06-0316:231 reply

      > Then something shifted. I started experimenting. I stopped giving it orders and began using it more like a virtual rubber duck. That made a huge difference.

      This is how I use it mostly. I also use it for boilerplate, like "What would a database model look like that handles the following?" You never want it to do everything. There are tools that can and will, and they're impressive, but then, when you have a true production issue, your inability to respond quickly will be a barrier.

      • By nashashmi 2025-06-0316:30

        That's all great news: if you know how to use an LLM, it works wonders for you. But LLMs are changing so fast; can it really be sustainable for me to "learn" one, only for it to change and go backwards the next month? (I am thinking about how terrible Google became.)

    • By conradev 2025-06-0315:361 reply

      I’m learning live how to use these things better, and I haven’t seen practical guides like:

      - Split things into small files, today’s model harnesses struggle with massive files

      - Write lots of tests. When the language model messes up the code (it will), it can use the tests to climb out. Tests are the best way to communicate behavior (see the sketch after this list).

      - Write guides and documentation for complex tasks in complex codebases. Use a language model for the first pass if you’re too lazy. Useful for both humans and LLMs

      It's really: make your codebase welcoming for junior engineers.
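
      A tiny illustration of the tests point, using Node's built-in test runner (slugify is a hypothetical module under test): each test names one behavior, so a failure tells the agent exactly which contract it broke.

          import { test } from "node:test";
          import assert from "node:assert/strict";
          import { slugify } from "./slugify.js"; // hypothetical module under test

          test("slugify lowercases and hyphenates spaces", () => {
            assert.equal(slugify("Hello World"), "hello-world");
          });

          test("slugify drops characters that are unsafe in URLs", () => {
            assert.equal(slugify("a/b?c"), "abc");
          });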

      • By gs17 2025-06-0316:011 reply

        > it can use the tests to climb out

        Or not. I watched Copilot's agent mode get stuck in a loop for most of an hour (to be fair, I was letting it continue to see how it handles this failure case) trying to make a test pass.

        • By conradev 2025-06-0316:23

          Yeah! When that happens, I usually stop it and tap in a bigger model to "think" and get out of the loop (or fix it myself).

          I’m impressed with this latest generation of models: they reward hack a lot less. Previously they’d change a failing unit test, but now they just look for reasonable but easy ways out in the code.

          I call it reward hacking, and laziness is not the right word, but “knowing what needs to be done and not doing it” is the general issue here. I see it in junior engineers occasionally, too.

    • By H1Supreme 2025-06-0516:22

      > Did Photoshop kill graphic artists?

      Desktop publishing software killed many jobs. I worked for a publication where I had colleagues who used to typeset, place images, and use a camera to build pages by hand. That required a team of people. Once QuarkXPress and the like hit the scene, one person could do it all, faster.

      In terms of illustration, the tools moved from pen and paper to Adobe Illustrator and Aldus/Macromedia FreeHand, which I'd argue was more of a sideways move: you still needed an illustrator's skill set to use these tools.

      The difference between what I just described and LLM image generation is that the earlier tooling changed to streamline an existing skill set, while LLMs replace all of it. Just type something and here's your picture; no art or design skill necessary. Obviously, there's no guarantee that the LLM-generated image will be any good. So I'm not sure the Photoshop analogy works here.

    • By timcobb 2025-06-0318:24

      > Did film kill theatre?

      Relatively speaking, I would say that film and TV did kill theater

    • By richardw 2025-06-0310:29

      Photoshop et al. are still just tools. They can't beat us at what has always set us apart: thinking. LLMs are the closest; while they're not close, they're directionally correct. They're general-purpose, not like chess engines. And they improve. It's hard to predict a year out, never mind ten.

    • By MattGrommes 2025-06-0318:15

      I agree; this article is basically what I've been thinking as I play with these things over time. They've gotten a ton better, but the hot takes are still from 6-12 months ago.

      One thing I wish he had talked about, though, is maintenance. My only real qualm with my LLM agent buddy is its tendency to just keep adding code if the first pass didn't work. Eventually it works, sometimes with my manual help. But the resulting code is harder to read and reason about, which makes maintenance and adding features or behavior changes harder. Until you're ready to just hand the code off to the LLM and not make your own changes to it, it's definitely something to keep in mind, at minimum.

    • By jlaternman 2025-06-0422:57

      Yes! It needs, and seems to want, the human to be a deep collaborator. If you take that approach, it is effectively a second senior developer you can work with. You need to push it and explain the complexities in detail to get the fuller rewards, and get it to document everything important it learns from each session's context. It wants to collaborate to make you a 10x coder, not to do your work for you while you laze. That is the biggest breakthrough I have found. They basically react like human brains, with the same kinds of motives. Their output can vary dramatically based on the input you provide.

    • By timeon 2025-06-0320:27

      > Did Photoshop kill graphic artists? Did film kill theatre? Not really. Things changed, sure. Was it “better”?

      My obligatory comment that analogies are not good for arguments: there is already discussion here suggesting that film (etc.) may have killed theatre.

    • By sim7c00 2025-06-0418:41

      I love your views and the way you express them; spot on. I feel similarly in some ways. I hated AI, loved AI, hated it again, and love it again. I still feel the code is unusable for my main problems, but I increasingly realize it's my arrogance that causes it: I can't formulate solutions eloquently enough, and I blame the AI for bad code.

      AI has helped me pick up my pencil and paper again and realize my flawed knowledge, skills, and even flawed approach to AI.

      Now I've instructed it to never give me code :) Not because the code is bad, but because my attempts to extract code from it are based more in laziness than efficiency. They are easy to confuse, after all ;(

      I have tons of fun learning with AI, exploring, going on adventures into new topics. Then when I want to really do something, I try to use it for the things I know I am bad at due to laziness, not lack of knowledge. The thing I fell for first...

      It helps me explore a space; then, when I think of or am inspired toward some creation, it helps me structure and plan. When I ask it out of laziness to give me the code, it helps me overcome my laziness by explaining what I need to do, so I can see why asking for the code was the wrong approach in the first place.

      Now, that might be different for you. But I have learned I am not some god-tier hacker from the Sprawl, so I realized I need to learn and get better. Perhaps you are at the level where you can ask it for code and it just works. Hats off in that case ;) (I do hope you tested well!)

    • By caycep 2025-06-0316:512 reply

      I think the key is also: don't call it AI, because it's not. It's LLM-assisted query parsing and code generation. Semantically, if you call it AI, the public expects a cognitive equivalent of a human, which this is not and, from what @tptacek describes, is not meant to be; the reasoning and other code bits that create agents seem to be developed specifically for code generation, programming assistance, and other such tasks. Viewed through that lens, the article is correct: it is by all means a major step forward.

      • By digianarchist 2025-06-0317:22

        I agree but that battle is lost. Someone was calling Zapier workflows AI on X.

      • By bbarnett 2025-06-0318:12

        AGI vs. AI is how to separate the two these days.

    • By notindexed 2025-06-0314:481 reply

      The irony of the ChatGPT em dashes ;3

      • By bytesandbots 2025-06-0422:351 reply

        The entire comment feels way too long, too structured, and convincing in a way that can only be written by an AI. I just hope that once the em-dashes are "fixed", we'll still be able to detect such text. I fear a future when human text is sparse, even here at HN. It is depressing to see such a comment take the top spot.

        • By volkk 2025-06-0514:06

          Lol, it even reads with the same exact tone as AI. For those who use it often, it's so easy to spot now. The luddites on HN who fear AI end up affected the most, because they have no idea how to see it.

    • By bytesandbots 2025-06-0422:281 reply

      I am pretty sure this comment is also AI-generated. Just a guess, but so many em-dashes are suspicious, and the overall structure of the argument feels uncanny.

      If this is true, can you share the initial draft that you asked the AI to rewrite? Am I not right that the initial draft is more concise and better conveys your actual thought, even if it's not as convincing?

    • By billy99k 2025-06-0317:24

      I use LLMs daily, from helping me write technical reports (not 100%; mostly making things sound better after I have a first draft) to mapping APIs (documentation, etc.).

      I can only imagine what this technology will be like in 10 years. But I do know that it's not going anywhere and it's best to get familiar with it now.

    • By taylodl 2025-06-0313:451 reply

      I treat AI as my digital partner in pair programming. I've learned how to give it specific and well-defined tasks, and it gets them done. The narrower the scope and the more specific the task, the more success you'll have.

      • By jes5199 2025-06-0314:451 reply

        There's a sweet spot in there, and it's not "as narrow as possible": the most productive thing is to assign the largest possible tasks that are just short of the limit where the agents become stupid. This is hard to hit, and a moving target!

        • By svachalek 2025-06-0317:40

          Exactly. When you get a new tool or a new model, ask it for things the previous one failed at until you find the new ceiling.

    • By nipah 2025-06-1113:55

      > What’s clear is this tech is here now, and complaining about it feels a bit like mourning the loss of punch cards when terminals showed up.

      Just stop with this; it's bullshitty. There is nothing linking LLMs to the migration from punch cards to terminals, nor to Photoshop, nor to film versus theatre, literally [nothing]. This is a pretty underwhelming way of suggesting that this technology's critics are akin to nostalgic people who "miss the good old days", when there are more than enough pertinent reasons to disagree with this tech in this case. It basically calls opposing people irrational.

      I'm not talking about doom or software dev dying or any bullshit like that; I'm just saying that the kind of point you make at the end is not reasonable.

    • By brianjking 2025-06-040:56

      Love all of this.

      Most importantly, I'll embrace the change and hope for the possible abundance.

    • By beloch 2025-06-0319:30

      LLMs are self-limiting rather than self-reinforcing, and that's the big reason why they're not the thing, good or bad, that some people think they are.

      "Garbage in, garbage out" is still the rule for LLMs. If you don't spend billions training them, or if you let them feed on their own tail too much, they produce nonsense. E.g., some LLMs currently produce better general search results than Google. This is mainly a product of many billions being spent on expert trainers for those LLMs, while Google neglects (or actively enshittifies) its search algorithms shamefully. It's humans, not LLMs, producing these results. How good will LLMs be at search once the money has moved somewhere else and neglect sets in?

      LLMs aren't going to take everyone's jobs and trigger a singularity, precisely because they fall apart if they try to feed on their own output. They need human input at every stage. They are going to take some people's jobs and create new ones for others, though it will probably be more of the former than the latter, or billionaires wouldn't be betting on them.

    • By ljsprague 2025-06-0319:21

      Yes, film killed theatre.

    • By 0points 2025-06-037:024 reply

      > Then I actually read the code.

      This is my experience in general. People seem to be impressed by the LLM output until they actually comprehend it.

      The fastest way to get someone to break out of this illusion is to tell them to chat with the LLM about their own expertise. They will quickly start to notice errors in the output.

      • By wiseowise 2025-06-0311:351 reply

        You know who also does that? Humans. I read shitty, broken, amazing, useful code every day, but you don't see me complaining online that people who earn 100-200k salaries don't produce ideal output right away. And believe me, I spend way more time fixing their shit than the LLM's.

        If I can reduce this even by 10% for 20 dollars, it's a bargain.

        • By ehutch79 2025-06-0312:035 reply

          But no one is hyping the fact that Bob the mediocre coder is going to replace us.

          • By code_for_monkey 2025-06-0314:22

            What no one is reckoning with right here: the AI skeptics are mostly correctly reacting to the AI hypists, who are usually shitty LinkedIn-influencer-type dudes crowing about how they never have to pay anyone again. It's very natural, even intelligent, not to trust this now that it's filling the same bubble as NFTs did a few years ago. I think it's okay to stay skeptical and see where the chips fall in a few years at this point.

          • By capiki 2025-06-0312:172 reply

            But Bob isn’t getting better every 6 months

          • By datadrivenangel 2025-06-0313:16

            Offshoring / nearshoring has been here for decades!

          • By wiseowise 2025-06-0319:29

            /0

          • By p3rls 2025-06-0312:261 reply

            [flagged]

            • By ehutch79 2025-06-0312:511 reply

              If that were true, why would we be told LLMs will replace us? Wouldn't our having already been replaced mean we weren't working?

              Also, this is kind of racist.

              • By p3rls 2025-06-0421:31

                Do you know what the word "most" means?

      • By tptacek 2025-06-037:203 reply

        That has not been my experience at all with networking and cryptography.

        • By jhanschoo 2025-06-037:41

          Your comment is ambiguous; what exactly do you refer to by "that"?

        • By 0points 2025-06-037:436 reply

          [flagged]

          • By illiac786 2025-06-039:241 reply

            You put people into nice little drawers: the skeptics and the non-skeptics. It is reductive and, most of all, polarizing. This is what US politics has become, and we should avoid that here.

            • By luffy-taro 2025-06-0310:56

              Yeah, putting labels on people is not very nice.

          • By Xmd5a 2025-06-038:353 reply

            [flagged]

            • By jrvarela56 2025-06-0310:36

              A 10-month-old account talking like that to the village elder.

            • By foldr 2025-06-0311:47

              In fairness, the article is a lot more condescending and insulting to its readers than the comment you're replying to.

            • By rvnx 2025-06-038:406 reply

              An LLM is essentially the world's information packed into a very compact format. It is the modern equivalent of the Library of Alexandria.

              Claiming that your own knowledge is better than all the compressed consensus of the books of the universe is very optimistic.

              If you are not sure about the result given by an LLM, it is your task as a human to cross-verify the information, the exact same way that information in books is not 100% accurate and Google results are not always telling the truth.

              • By fennecfoxy 2025-06-039:13

                >An LLM is essentially the world's information packed into a very compact format.

                No, it's the world's information distilled to the various parts and details that training deemed important. Do not pretend for one second that it's not an incredibly lossy compression method, which is why LLMs hallucinate constantly.

                This is why training is only useful for teaching the LLM how to string words together to convey hard data. That hard data should always be retrieved via RAG, with an independent model or code verifying that the contents of the response are correct per the hard data. Even 4o hallucinates constantly if it doesn't do a web search, and sometimes even when it does.
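
                (A sketch of that RAG-plus-verifier shape; every function here is a hypothetical stand-in for whatever retrieval, generation, and checking your stack actually provides.)

                    type Doc = { id: string; text: string };

                    // Hypothetical stand-ins: wire these to your search index and models.
                    async function searchIndex(q: string, k: number): Promise<Doc[]> { return []; }
                    async function generate(prompt: string): Promise<string> { return ""; }
                    async function isSupported(answer: string, docs: Doc[]): Promise<boolean> { return false; }

                    // Answer only from retrieved context, then verify before returning.
                    async function answerWithRag(question: string): Promise<string> {
                      const docs = await searchIndex(question, 5);
                      const context = docs.map(d => d.text).join("\n---\n");
                      const answer = await generate(
                        `Answer using only this context:\n${context}\n\nQ: ${question}`);
                      return (await isSupported(answer, docs))
                        ? answer
                        : "Not answerable from the retrieved sources.";
                    }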

              • By TheEdonian 2025-06-038:461 reply

                  Well, let's not forget that it's an opinionated source. There's also the point that if you ask it about a topic, it will (often) give you the answer that has the most content about it (or the easiest-to-access information).

                • By illiac786 2025-06-039:162 reply

                  Agree.

                  I find that, for many, LLMs are addictive, a magnet, because they offer to do your work for you, or so it appears. Resisting this temptation is impossibly hard for children, for example, and many adults succumb.

                  A good way to maintain a healthy dose of skepticism about its output, and to keep checking that output, is to ask the LLM about something that happened after the training cut-off.

                  For example, I asked if lidar could damage phone lenses, and the LLM very convincingly argued it was highly improbable, because that only recently made the news as a danger for phone lenses and wasn't part of the training data.

                  This helps me stay sane and resist the temptation of just accepting LLM output =)

                  On a side note, the Kagi assistant is nice for kids, I feel, because it links to its sources.

                  • By dale_glass 2025-06-0310:541 reply

                    LIDAR damaging the lens is extremely unlikely. A lens is mostly glass.

                    What it can damage is the sensor, which is actually not at all the same thing as a lens.

                    When asking questions, it's important to ask the right question.

                  • By criley2 2025-06-0311:521 reply

                    I asked ChatGPT o3 if lidar could damage phone sensors and it said yes https://chatgpt.com/share/683ee007-7338-800e-a6a4-cebc293c46...

                    I also asked Gemini 2.5 pro preview and it said yes. https://g.co/gemini/share/0aeded9b8220

                    I find it interesting to always test for myself when someone suggests to me that a "LLM" failed at a task.

                    • By illiac786 2025-06-0312:111 reply

                      I should have been more specific, but I believe you missed my point.

                      I tested this at the time on Claude 3.7 Sonnet, which has an earlier cut-off date, and I just tested again with this prompt: "Can the lidar of a self-driving car damage a phone camera sensor?" The answer is still wrong in my test.

                      I believe the issue is the training cut-off date; that's my point. LLMs seem smart, but they have limits, and when asked about something discovered after the training cut-off date, they will sometimes confidently be wrong.

                      • By criley2 2025-06-0312:461 reply

                        I didn't miss your point; rather, I wanted you to realize some deeper points I was trying to make.

                        - Not all LLMs are the same, and not identifying your tool is problematic, because "LLMs can't do a thing" is very different from "the particular model I used failed at this thing". I demonstrated that by showing that many LLMs get the answer right. It puts the onus of correctness entirely on the category of technology, and not on the tool used or the skill of the tool user.

                        - Training data cutoffs are only one part of the equation: tool use by LLMs allows them to search the internet and run arbitrary code (among many other things).

                        In both of my cases, the training data did not include the results either. Both used a tool call to search the internet for data.

                        Not realizing that modern AI tools are more than an LLM with training data, but rather have tool calling and full internet access, and can access and reason about a wide variety of up-to-date data sources, demonstrates a fundamental misunderstanding of modern AI tools.

                        Having said that:

                        Claude Sonnet 4.0 says "yes": https://claude.ai/share/001e16f8-20ea-4941-a181-48311252bca0

                        Personally, I don't use Claude for this kind of thing because while it's proven to be a very good at being a coding assistant and interacting with my IDE in an "agentic" manner, it's clearly not designed to be a deep research assistant that broadly searches the internet and other data sources to provide accurate and up to date information. (This would mean that ai/model selection is a skill issue and getting good results from AI tools is a skill, which is borne out by the fact that I get the right answer every time I try, and you can't get the right answer once).

                        • By illiac786 2025-06-0419:451 reply

                          Still not getting it, I think.

                          My point is: LLMs sound very plausible and very confident when they are wrong.

                          That's it. And I was just offering a trick to help remember this, to keep checking their output; nothing else.

              • By ivape 2025-06-038:481 reply

                This is a pre-Covid HN thread on work from home:

                https://news.ycombinator.com/item?id=22221507

                It’s eerie. It’s historical. These threads from the past two years about what the future of AI will be will read like ghost stories, like Rose having flashbacks of the Titanic. It’s worth documenting. We honestly could be having the most ominous discussion of what’s to come.

                We sit around and complain about dips in hiring; that’s nothing. The iceberg just hit. We’ve got 6 hours left.

              • By Pamar 2025-06-039:214 reply

                Partially OT:

                Yesterday I asked ChatGPT which city is the Japanese twin city for Venice (Italy). This was just a quick offhand question because I needed the answer for a post on IG, so not exactly a life-or-death situation.

                Answer: Kagoshima. It also added that the "twin status" was officially set in 1965, and that Kagoshima was the starting point for the Jesuit missionary Alessandro Valignano in his attempt to proselytize the Japanese people (to Catholicism, and also about European culture).

                I had never heard of Kagoshima, so I googled it. And discovered it is the twin city of Naples :/

                So I then googled "Venice Japanese Twin City" and got: Hiroshima. I double-checked this, then went back to ChatGPT and wrote:

                "Kagoshima is the Twin City for Neaples.".

                This triggered a websearch and finally it wrote back:

                "You are right, Kagoshima is Twin City of Neaples since 1960."

                Then it added "Regarding Venice instead, the twin city is Hiroshima, since 2023".

                So yeah, a Library of Alexandria that you can count on, as long as you have another couple of libraries to double-check whatever you get from it. Note also that this was a very straightforward question; there was nothing to "analyze" or "interpret" or "reason about". And yet the answer was completely wrong: the first date was incorrect even for Naples (the ceremony was actually in May 1960), and the extra bits about Alessandro Valignano are not reported anywhere else. Valignano was indeed a Jesuit and he visited Japan multiple times, but Kagoshima is never mentioned when you google him or check his Wikipedia page.

                You may understand how I remain quite skeptical about any application which I consider "more important than an IG title".

                • By rvnx 2025-06-039:271 reply

                  Claude 4 Opus:

                  > Venice, Italy does not appear to have a Japanese twin city or sister city. While several Japanese cities have earned the nickname "Venice of Japan" for their canal systems or waterfront architecture, there is no formal sister city relationship between Venice and any Japanese city that I could find in the available information

                  I think GPT-4o got it wrong in your case because it searched Bing, and then read only fragments of the page ( https://en.wikipedia.org/wiki/List_of_twin_towns_and_sister_... ) to save costs for processing "large" context

                  • By Pamar 2025-06-0311:40

                    I am Italian, and I have some interest in Japanese history/culture.

                    So when I saw a completely unknown city, I googled it, because I was wondering what it actually had in common with Venice (I mean, a Japanese version of Venice would be a cool place to visit next time I go to Japan, no?).

                    If I wanted to know, I dunno, "What is the Chinese Twin City for Buenos Aires" (to mention two countries I do not really know much about, and do not plan to visit in the future) should I trust the answer? Or should I go looking it up somewhere else? Or maybe ask someone from Argentina?

                    My point is that even as a "digital equivalent of the Library of Alexandria", LLMs seem to be extremely unreliable. Therefore - at least for now - I am wary about using them for work, or for any other area where I really care about the quality of the result.

                • By richardw 2025-06-0310:40

                  If I want facts that I would expect the top 10 Google results to have, I turn search on. If I want a broader view of a well known area, I turn it off. Sometimes I do both and compare. I don’t rely on model training memory for facts that the internet wouldn’t have a lot of material for.

                  4o for quick. 4o plus search for facts. o4-mini-high plus search for “mini deep research”, where it’ll hit more pages, structure, and summarise.

                  And I still check the facts and sources, to be honest. But it’s not valueless. I’ve searched an area for a year and then had deep research find things I hadn’t.

                • By croemer 2025-06-0321:06

                  o3 totally nailed it first shot. Hiroshima since 2023. Provides authoritative source (Venetian city press release): https://chatgpt.com/share/683f638a-3ce0-8005-91d6-3eb1df9f19...

                • By meowface 2025-06-039:542 reply

                  What model?

                  People often say "I asked ChatGPT something and it was wrong", and then you ask them the model and they say "huh?"

                  The default model is 4.1-mini, which is much worse than 4.1 and much, much worse than o3 at many tasks.

                  • By TeMPOraL 2025-06-0310:30

                    Yup. The difference is particularly apparent with o3, which does bursts of web searches on its own whenever it feels it'll be helpful in solving a problem, and uses the results to inform its own next steps (as opposed to just picking out parts to quote in a reply).

                    (It works surprisingly well, and feels mid-way between Perplexity's search and OpenAI's Deep Research.)

                  • By Pamar 2025-06-0311:312 reply

                    I asked "What version/model are you running, atm" (I have a for-free account on OpenAI, what I have seen so far will not justify a 20$ monthly fee - IN MY CASE).

                    Answer: "gpt-4-turbo".

                    HTH.

                    • By bavell 2025-06-0312:21

                      Don't ask the model, just look at the model selection drop-down (wherever that may be in your UI)

                    • By meowface 2025-06-0312:44

                      >I have a free account on OpenAI; what I have seen so far will not justify a $20 monthly fee - IN MY CASE

                      4.1-mini definitely is not worth $20/month. o3 probably is (and is available in the $20/month plan) for many people.

              • By sethammons 2025-06-0311:121 reply

                No, don't think libraries, think "the Internet."

                The Internet thinks all kinds of things that are not true.

                • By mavhc 2025-06-0311:491 reply

                  Just like books then, except the internet can be updated

                  • By rvnx 2025-06-0311:55

                    We all remember those teachers who said the internet cannot be trusted, and that the only source of truth is in books.

              • By rsynnott 2025-06-039:221 reply

                Even if this were true (it is not; that’s not how LLMs work), well, there was a lot of complete nonsense in the Library of Alexandria.

                • By rvnx 2025-06-039:581 reply

                  It's a compressed statistical representation of text patterns, so it is absolutely true. You lose information during the process, but the quality is similar to the source data. Sometimes it's even better, as there is consensus when information is repeated across multiple sources.

                  • By brahma-dev 2025-06-0310:48

                    It's amazing how it goes from all the knowledge in the world to ** terms and conditions apply, all answers are subject to market risks, please read the offer documents carefully.........

          • By jgrahamc 2025-06-0310:131 reply

            As someone who has followed Thomas' writing on HN for a long time... this is the funniest thing I've ever read here! You clearly have no idea about him at all.

            • By tptacek 2025-06-0318:561 reply

              Especially coming from you I appreciate that impulse, but I had the experience of running across someone else whom the Internet (or Bsky, at least) believed I had no business not knowing about, and I did not enjoy it; so I'm now an activist for the cause of "people don't need to know who I am". I should have written more clearly above.

              • By jgrahamc 2025-06-048:29

                That is a very good cause!

          • By exe34 2025-06-038:28

            One would hope the experience leads to the position, and not vice-versa.

          • By rfrey 2025-06-0312:41

            ... you think tptacek has no expertise in cryptography?

          • By wickedsight 2025-06-038:281 reply

            That is no different from pretty much any other person in the world. If I interview people to catch them on mistakes, I will be able to do exactly that. Sure, there are some exceptions, like if you were to interview Linus about Linux. Other than that, you'll always be able to find a gap in someone's knowledge.

            None of this makes me 'snap out' of anything. Accepting that LLMs aren't perfect means you can just keep that in mind. For me, they're still a knowledge multiplier, and they allow me to be more productive in many areas of life.

            • By tecleandor 2025-06-038:563 reply

              Not at all. Useful or not, LLMs will almost never say "I don't know". They'll happily call a function in a library that never existed. They'll tell you "Incredible idea! You're on the correct path! And you can easily do that with so-and-so software", and you'll be like "wait, what? that software doesn't do that", and they'll answer "Ah, yeah, you're right, of course."

              • By ignoramous 2025-06-039:081 reply

                TFA says hallucinations are why "gyms" will be important: language tooling (compiler, linter, language server, domain-specific static analyses, etc.) that feeds back into the agent, so it'll know to redo.
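
                A minimal sketch of that feedback loop, in Python (illustrative only: `llm_generate` is a hypothetical stand-in for whatever model call you use, and pyflakes plays the role of the "gym"):

                    import subprocess

                    def code_with_gym(llm_generate, task, max_attempts=5):
                        feedback = ""
                        for _ in range(max_attempts):
                            code = llm_generate(task, feedback)  # model proposes code
                            with open("attempt.py", "w") as f:
                                f.write(code)
                            # Ground truth: run real tooling against the proposal.
                            result = subprocess.run(
                                ["python", "-m", "pyflakes", "attempt.py"],
                                capture_output=True, text=True,
                            )
                            if result.returncode == 0:
                                return code  # the tooling is satisfied
                            feedback = result.stdout + result.stderr  # feed errors back
                        return None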

                • By rvnx 2025-06-039:13

                  Sometimes asking in a loop: "are you sure ? think step-by-step", "are you sure ? think step-by-step", "are you sure ? think step-by-step", "are you sure ? think step-by-step", "verify the result" or similar, you may end up with "I'm sure yes", and then you know you have a quality answer.

              • By yujzgzc 2025-06-039:041 reply

                No, there are many techniques now to curb hallucinations. Not perfect, but no longer so egregiously overconfident.

              • By rvnx 2025-06-039:11

                The most infuriating are the emojis everywhere

      • By KoolKat23 2025-06-0311:18

        That proves nothing with respect to the LLM's usefulness; all it means is that you are still useful.

    • By tuyiown 2025-06-039:172 reply

      [flagged]

      • By sgt 2025-06-0311:082 reply

        [flagged]

      • By rerdavies 2025-06-039:402 reply

        And in the meantime, the people you are competing with in the job market have already become 2x more productive.

        • By johnecheck 2025-06-0312:38

          Oh no! I'll get left behind! I can't miss out! I need to pay $100/month to an AI company or I'll be out of a job!

        • By continuational 2025-06-0316:251 reply

          Hype hype hype! Data please.

          • By jf22 2025-06-0316:432 reply

            What kind of "data" do you want? I'm submitting twice as many PRs with twice as much complexity. Changes that would take weeks take minutes. I generated 50 new endpoints for our API in a week. Last time we added 20 it took a month.

            • By continuational 2025-06-0317:481 reply

              What kind of data would you want?

                I'm 100x as productive now that I eat breakfast while reciting the Declaration of Independence.

              You really should try it. Don't give up, it works if you do it just right.

              Edit: I realize this line of argument doesn't really lead anywhere. I leave you with this Carl Sagan quote:

              > Extraordinary claims require extraordinary evidence

              (or perhaps we can just agree that claims require evidence, and not the anecdotal kind)

              • By jf22 2025-06-0318:121 reply

                  This is an immature, bad-faith interpretation.

                The thing is there are loads of people making these claims and they aren't that extraordinary.

                  The claims about coding productivity increases were extraordinary 12 months ago; now they are fully realized and in use every day.

                • By continuational 2025-06-0320:102 reply

                  Please refrain from ad hominem attacks.

                  • By jf22 2025-06-0320:22

                    I'm not sure which part of my comment used an aspect of you to undermine the credibility of your argument.

                    So many people get ad hominem wrong.

                    I'm intrigued that someone who made a bad faith comparison like you would be so aghast at an ad hominem.

                  • By pxc 2025-06-040:061 reply

                    That's not what ad hominem means

                    • By continuational 2025-06-044:022 reply

                      Calling someone "immature" can be considered an ad hominem attack if it is used as a way to dismiss or undermine their argument without addressing the substance of what they are saying.

                      • By pxc 2025-06-0413:31

                        > Calling someone "immature"

                        That didn't actually happen in this thread, though. A comment was characterized as "immature". That might be insulting, but it's not an ad hominem attack.

                        Saying "this argument sucks" just isn't attacking an argument by attacking the credibility of the arguer. Using terms that can describe cognitive temperament or ability to directly characterize an argument might seem more ambiguous, but they don't really qualify either, for two reasons:

                        1. Humans are fallible, so it doesn't take an immature person to make an immature argument; all people are capable of becoming impatient or flippant. (The same is true for characterizations of arguments in terms of intelligence or wisdom, whatever.)

                        2. More crucially, even if you do read it as implying a personal insult, a direct assertion that an argument, claim, or comment is <insert insulting term here> is inverted from an ad hominem attack: the argument itself is asserted to evince the negative trait. In an ad hominem, the negative trait is purportedly established about the person first and then used to undermine the argument, with no connection asserted between the trait and the argument other than that both belong to the arguer.

                        You might call this a baseless insult, if you think the person saying it fails to show or suggest why the comment in question is "immature" or whatever, but it's not an ad hominem attack; it's just a different rhetorical strategy.

                        But it's pretty clear that it's more a form of scolding (and maybe even pleading with someone to "step up" in some sense) than something primarily intended to insult. Maybe that's useless, maybe it's needlessly insulting, maybe it involves an unargued assertion, but "ad hominem" just ain't what it is.

                        --

                        Fwiw, after writing all that, I kinda regret it even though I stand by the contents. Talk of fallacies here has the problems of being technical and impersonal, and perhaps invites tangents like this.

                      • By jf22 2025-06-0413:49

                        I said your interpretation was immature, not you.

                        This is not a mature thing to say: " I eat breakfast while reciting the declaration of independence."

            • By vitaflo 2025-06-0317:163 reply

              Unless you’re being paid on a fixed bid it doesn’t matter. Someone else is reaping the rewards of your productivity not you. You’re being paid the same as your half-as-productive coworkers.

              • By sokoloff 2025-06-0318:36

                I'm a manager of software engineers. I have absolutely no intention of paying engineer B the same as half-as-productive engineer A.

                Figuring out the best way to determine productivity is still a hard problem, but I think it's a category error to think that productivity gains go exclusively to the company.

                If all (or even most) of the engineers on your team who were previously your equal become durably twice as productive and your productivity remains unchanged, your income prospects will go down, quite possibly to zero.

              • By jf22 2025-06-0318:10

                I think it matters very much if the person next to me is 2x as productive and I'm not.

              • By continuational 2025-06-0318:10

                Isn't this the case regardless of whether you use AI or not? Is this an argument for being less productive?

    • By dcow 2025-06-0313:223 reply

      [flagged]

      • By bccdee 2025-06-0314:352 reply

        Really? I feel like the article pointedly skirted my biggest complaint.

        > ## but the code is shitty, like that of a junior developer

        > Does an intern cost $20/month? Because that’s what Cursor.ai costs.

        > Part of being a senior developer is making less-able coders productive, be they fleshly or algebraic. Using agents well is both a skill and an engineering project all its own, of prompts, indices, and (especially) tooling. LLMs only produce shitty code if you let them.

        I hate pair-programming with junior devs. I hate it. I want to take the keyboard away from them and do it all myself, but I can't, or they'll never learn.

        Why would I want a tool that replicates that experience without the benefit of actually helping anyone?

        • By jononor 2025-06-0319:44

          You are helping the companies train better LLMs... both by paying their expenses and by providing training data. That may or may not be something one considers a worthwhile contribution. Certainly it is less valuable than helping a person grow their intellectual capacity.

        • By kylereeve 2025-06-0315:513 reply

          > Does an intern cost $20/month? Because that’s what Cursor.ai costs.

          This stuck out to me. How long will it continue to be so cheap? I would assume some of the low cost is subsidized by VC money which will dry up eventually. Am I wrong here?

          • By simonw 2025-06-0316:03

            Prices have been dropping like a stone over the past two years, due to a combination of new efficiencies being developed for serving plus competition from many vendors: https://simonwillison.net/2024/Dec/31/llms-in-2024/#llm-pric...

            I'm not seeing any evidence yet of that trend stopping or reversing.

          • By svachalek 2025-06-0317:38

            Training frontier models is expensive. Running inference on them is pretty cheap and for solving the same level of problem, will continue to get cheaper. The more use they get out of a model, the less overhead we need to pay on that inference to subsidize the training.

          • By overflow897 2025-06-0316:17

            At the rate they're going it'll just get cheaper. The cost per token continues to drop while the models get better. Hardware is also getting more specialized.

            Maybe the current batch of startups will run out of money but the technology itself should only get cheaper.

      • By chinchilla2020 2025-06-0317:42

        The article provides no solid evidence that "AI is working" for the author.

        At the end of the day this article is nothing but another piece of conjecture on hackernews.

        Actually assessing the usefulness of AI would require measurements and controls. Nothing has been proven or disproven here

      • By permo-w 2025-06-0313:27

        the irony with AI sceptics is that their opinions usually sound like they've been stolen from someone else

  • By cesarb 2025-06-0223:2524 reply

    This article does not touch on the thing which worries me the most with respect to LLMs: the dependence.

    Unless you can run the LLM locally, on a computer you own, you are now completely dependent on a remote centralized system to do your work. Whoever controls that system can arbitrarily raise the prices, subtly manipulate the outputs, store and do anything they want with the inputs, or even suddenly cease to operate. And since, according to this article, only the latest and greatest LLM is acceptable (and I've seen that exact same argument six months ago), running locally is not viable (I've seen, in a recent discussion, someone mention a home server with something like 384G of RAM just to run one LLM locally).

    To those of us who like Free Software because of the freedom it gives us, this is a severe regression.

    • By aaron_m04 2025-06-0223:372 reply

      Yes, and it's even worse: if you think LLMs may possibly make the world a worse place, you should not use any LLMs you aren't self-hosting, because your usage information is being used by the creators to make LLMs better.

      • By MetaWhirledPeas 2025-06-0318:232 reply

        > you should not use any LLMs you aren't self-hosting, because your usage information is being used by the creators to make LLMs better

        This sounds a bit like bailing out the ocean.

        • By aaron_m04 2025-06-0320:07

          > This sounds a bit like bailing out the ocean.

          If it's one individual doing this, sure. I am posting this in the hopes that others follow suit.

        • By mrtesthah 2025-06-0322:06

          So does voting.

      • By inadequatespace 2025-06-0313:141 reply

        I think that’s a bit of a leap; if you think LLMs make the world a worse place, there are many actions that you might take or not take to try to address that.

        • By aaron_m04 2025-06-0320:081 reply

          It's true that there could be other more impactful actions. I'd love to hear your thoughts on what else can be done.

          • By inadequatespace 2025-06-0714:54

            To be fair, I don't have any great specific ideas, but "Work Without the Worker" for example talks about how a lot of LLMs are fueled by neo-colonialist exploitation.

            So I guess broadly speaking there could be strategies involving attempting to influence governmental policy rather than by consumer choice.

            Or more radically, trying to change the structure of the government in general such that the above influences actually are more tractable for the common person.

    • By eleveriven 2025-06-037:192 reply

      It's also why local models, even if less powerful, are so important. The gap between "state of the art" and "good enough for a lot of workflows" is narrowing fast

      • By mplanchard 2025-06-0414:00

        Yeah I am very excited for local models to get good enough to be properly useful. I’m a bit of an AI skeptic I’ll admit, but I’m much more of a SV venture-backed company skeptic. The idea of being heavily reliant on such a company, plus needing to be online, plus needing to pay money just to get some coding done is pretty unpalatable to me.

      • By dabockster 2025-06-0319:10

        Especially with MCP programs that can run in Docker containers.

    • By dabockster 2025-06-0319:072 reply

      You can get 90%+ of the way there with a tiny “coder” LLM running on the Ollama backend with an extension like RooCode and a ton of MCP tools.

      In fact, MCP is so groundbreaking that I consider it the actual meat and potatoes of coding AIs. Large models are too monolithic, and knowledge is forever changing. Better to just use a small 14b model (or even 8b in some cases!) with some MCP search tools, a good knowledge graph for memory, and a decent front end for everything. Let it teach itself based on the current context.

      And all of that can run on an off the shelf $1k gaming computer from Costco. It’ll be super slow compared to a cloud system (like HDD vs SSD levels of slowness), but it will run in the first place and you’ll get *something* out of it.
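
      For a sense of the plumbing, here's a minimal sketch of calling a local model through Ollama's HTTP API (the model tag is illustrative; substitute whatever coder model you've pulled):

          import requests

          # Assumes `ollama serve` is running locally and a small coder model
          # has been pulled, e.g. `ollama pull qwen2.5-coder:14b`.
          resp = requests.post(
              "http://localhost:11434/api/generate",
              json={
                  "model": "qwen2.5-coder:14b",
                  "prompt": "Write a Python function that reverses a linked list.",
                  "stream": False,
              },
          )
          print(resp.json()["response"])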

      • By esaym 2025-06-0320:521 reply

        Why don't you elaborate on your setup then?

        • By xandrius 2025-06-0322:13

            Because you can easily look it up. Jan, GPT4All, etc.

          It's not black magic anymore.

      • By macrolime 2025-06-0320:271 reply

        Which MCPs do you recommend?

        • By FinalDestiny 2025-06-0518:34

          DesktopCommander and Taskmaster are great to start with. With just these, you may start to see why OP recommends a memory MCP too (I don’t have a recommendation for that yet)

    • By underdeserver 2025-06-035:035 reply

      You can also make this argument to varying degrees about your internet connection, cloud provider, OS vendor, etc.

      • By simoncion 2025-06-035:313 reply

        I'm not the OP but:

        * Not even counting cellular data carriers, I have a choice of at least five ISPs in my area. And if things get really bad, I can go down to my local library to politely encamp myself and use their WiFi.

        * I've personally no need for a cloud provider, but I've spent a lot of time working on cloud-agnostic stuff. All the major cloud providers (and many of the minors) provide compute, storage (whether block, object, or relational), and network ingress and egress. As long as you don't deliberately tie yourself to the vendor-specific stuff, you're free to choose among all available providers.

        * I run Linux. Enough said.

        • By underdeserver 2025-06-0311:291 reply

          * You might have a choice of carriers or ISPs, but many don't.

          * Hmm, what kind of software do you write that pays your bills?

          * And your setup doesn't require any external infrastructure to be kept up to date?

          • By simoncion 2025-06-070:36

            > ...but many don't.

            And many do. The US isn't the entire world, you know.

            > ...what kind of software do you write that pays your bills?

            B2B software that allows anyone to run their workloads with most any cloud provider, and most any on-prem "cloud". The entire point of this software is to abstract out the underlying infrastructure so that businesses can walk away from a particular vendor if that vendor gets too stroppy.

            > ...your setup doesn't require any external infrastructure...

            It's Gentoo Linux, so it runs largely on donated infra (and infra paid for with donations). But -unlike Windows or OS X users- if I get sick of what the Gentoo steering committee are doing, I can go to another distro (or just fucking roll my own should things get truly dire). That's the point of my comment.

        • By apitman 2025-06-0318:18

          How about your web browser?

        • By Flemlo 2025-06-0313:551 reply

          Just this week a library got deprecated.

          Open source of course.

          So what's my response to that deprecation? Maintaining it myself? Nope, finding another library.

          You always depend on something...

          • By edude03 2025-06-0314:031 reply

            > Maintaining it myself?

            You say that like it's an absurd idea, but in fact this is what most companies would do.

            • By Flemlo 2025-06-0314:58

              I can maintain basic code, no issue, but not if it becomes complex or security-relevant.

              And I have worked in plenty of companies; I'm the open source guy in these companies, and neither I nor my teams ever had the capacity to do so.

      • By ku1ik 2025-06-0315:351 reply

        Well, you can’t really self-host your internet connection anyway :)

        • By rollcat 2025-06-0319:032 reply

          Of course you can. It's called an AS (autonomous system); I think all you need is an IP address range, a physical link to someone willing to peer with you (another AS), some hardware, some paperwork, etc., and bam, you're your own ISP.

          My company has set this up for one of our customers (I wasn't involved).

          • By computably 2025-06-0421:02

            > all you need is an IP address range, a physical link to someone willing to peer with you (another AS), some hardware, some paperwork, etc; and bam you're your own ISP.

            I'm pretty sure the connotation of "self-host" entails a substantially smaller scope than starting your own ISP.

            Finding someone willing to peer with you also defeats the purpose. You are still fundamentally dependent on established ISPs.

          • By sieabahlpark 2025-06-0419:46

            [dead]

      • By ang_cire 2025-06-0522:07

        This is why I run a set of rackmount servers at home, that have the media and apps that I want to consume. If my ISP bites the dust tomorrow, I've literally got years worth of music, books, tv, movies, etc. Hell, I even have a bunch of models on ollama, and an offline copy of wikipedia running (minus media, obv) via kiwix.

        It's not off-grid, but that's the eventual dream/goal.

      • By EFreethought 2025-06-0314:25

        > You can also make this argument to varying degrees about your internet connection, cloud provider, OS vendor, etc.

        True, but I think wanting to avoid yet another dependency is a good thing.

      • By 0x1ceb00da 2025-06-0316:49

        ... search engine

    • By 0j 2025-06-0318:411 reply

      I don't feel like being dependent on LLM coding tools is much of an issue; you can very easily switch between different vendors. And I hope that open-weight models will be "good enough" by the time anyone gets a monopoly. In any case, even if you are afraid of getting too dependent on AI tools, I think everyone needs to stay up to date on what is happening. Things are changing very quickly right now, so whatever argument you may have against LLMs, it may just not be valid anymore in a few months.

      • By mplanchard 2025-06-0414:02

        > I think everyone needs to stay up to date on what is happening. Things are changing very quickly right now, so no matter what argument you may have against LLMs, it may just not be valid any more in a few months

        This actually to me implies the opposite of what you’re saying here. Why bother relearning the state of the art every few months, versus waiting for things to stabilize on a set of easy-to-use tools?

    • By rsanheim 2025-06-0319:41

      We will have the equivalent of Claude Sonnet 4 in a local LLM that can run well on a modern Mac with 36+ GB of RAM in a year or two. Maybe faster. The local/open models are developing very fast in terms of quantization and how well they can run on consumer hardware.

      Folks who run local LLMs every day now will probably say you can basically emulate at least Sonnet 3.7 for coding if you have a real AI workstation. Which may be true, but the time, effort, and cost involved are substantial.

    • By underdeserver 2025-06-035:003 reply

      Good thing it's a competitive market with at least 5 serious, independent players.

      • By nosianu 2025-06-039:24

        That will work until a lot of infrastructure and third-party software has been built up around one particular player.

        See the Microsoft ecosystem as an example. Nothing they do could not be replicated, but the network effects they achieved are strong. Too much glue, and 3rd party systems, and also training, and what users are used to, and what workers you could hire are used to, now all point to the MS ecosystem.

        In this early mass-AI-use phase you still can easily switch vendors, sure. Just like in the 1980s you could still choose some other OS or office suite (like Star Office - the basis for OpenOffice, Lotus, WordStar, WordPerfect) without paying that kind of ecosystem cost, because it did not exist yet.

        Today too much infrastructure and software relies on the systems from one particular company to change easily, even if the competition were able to provide a better piece of software in one area.

      • By shaky-carrousel 2025-06-0311:19

        Until they all merge, or form a cartel.

      • By rpigab 2025-06-0314:29

        Good thing it's funded by generous investors or groups who are okay with losing money on every sale (they'll make it up in volume), and never stop funding, and never raise prices, insert ads or enshittify.

    • By rvnx 2025-06-039:242 reply

      With the Mac Studio you get 512 GB of unified memory (shared between CPU and GPU), which is enough to run some exciting models.

      In 20 years, memory capacity has grown 32x.

      It means that we could have 16 TB memory computers in 2045.

      It can unlock a lot of possibilities, even if 1 TB is not enough by then (better architectures, more compact representations of data, etc.).
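
      (Checking the arithmetic: 32x is five doublings, and 512 GB x 32 = 16,384 GB, i.e. 16 TB, which is where the 2045 figure comes from.)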

      • By fennecbutt 2025-06-039:411 reply

        Yeah, for £10,000. And you get 512GB of bandwidth-starved memory.

        Still, I suppose that's better than what nvidia has on offer atm (even if a rack of gpus gives you much, much higher memory throughput).

        • By theshrike79 2025-06-0310:332 reply

          AKCSHUALLY the M-series CPU memory upgrades are expensive because the memory is on-chip and the bandwidth is a lot higher than on comparable PC hardware.

          In some cases it's more cost effective to get M-series Mac Minis vs nVidia GPUs

          • By lolinder 2025-06-0311:56

            They know that, but all accounts I've read acknowledge that the unified memory is worse than dedicated VRAM. It's just much better than running LLMs on CPU and the only way for a regular consumer to get to 64GB+ of graphical memory.

          • By hu3 2025-06-0314:471 reply

            It's still an order of magnitude slower than AI GPUs.

            And with $10k I could pay for 40 years of a Claude subscription: a much smarter and faster model.

            • By onemoresoop 2025-06-0316:45

              A Claude subscription is $20, but is that all you'd be using? If you want to get the best of Claude, it'll cost you a few hundred+ a month.

      • By hajile 2025-06-040:05

        Memory scaling has all but stopped. Current RAM cells hold just 40,000 or so electrons (that's when the charge is first stored; it degrades from there until refreshed). Going smaller is almost impossible due to physics, noise, and the problem of needing to amplify that tiny charge into something usable.

        For the past few years, we've been "getting smaller" by getting deeper. The diameter of the cell shrinks, but the depth of the cell goes up. As you can imagine, that doesn't scale very well: cutting the cylinder diameter in half quadruples the depth needed for the same volume.
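
        (Cylinder volume is V = pi x (d/2)^2 x h, so at constant volume the depth scales as 1/d^2: halving the diameter means four times the depth.)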

        If you try to put the cells closer together, you start to get quantum tunneling where electrons would disappear from one cell and appear in another cell altering charges in unexpected ways.

        The times of massive memory shrinks are over. That means we have to reduce production costs and put more chips in each computer, or find a new kind of memory that is mass-producible.

    • By Hilift 2025-06-030:15

      That's going full speed ahead though. Every major cloud provider has an AI offering, and there are now multiple AI-centric cloud providers. There is a lot of money and speculation. Now Nvidia has their own cloud offering that "democratize access to world-class AI infrastructure. Sovereign AI initiatives require a new standard for transparency and performance".

    • By amadeuspagel 2025-06-0310:443 reply

      I can't run google on my computer on my own, but I'm totally dependent on it.

      • By _heimdall 2025-06-0311:582 reply

        Is your entire job returning google results?

        The point being made here is that a developer that can only do their primary job of coding via a hosted LLM is entirely dependent on a third party.

        • By scotty79 2025-06-0314:563 reply

          How much useful programming work are you able to do without Google? I don't think I've even tried to do any for the last 20 years.

          You make a good point, of course, that independence is important. But primo, this ship sailed long ago; secundo, more than one party provides the service you depend on. If one fails, you still have at least some alternatives.

          • By ARandumGuy 2025-06-0315:33

            I guess it depends on how you define "using google", but as I've progressed as a developer, I've found myself spending less time googling problems, and more time just looking at official documentation (maybe GitHub issues if I'm doing something weird). And yeah, I guess I technically use google to get to the documentation instead of typing in the URL, but that feels like splitting hairs.

            And it's not like people weren't able to develop complicated software before the internet. They just had big documentation books that cost money and could get dated quickly. To be clear, having that same info a quick google search away is an improvement, and I'm not going to stop using google while it's available to me. But that doesn't mean we'd all be screwed if google stopped existing tomorrow.

          • By horsawlarway 2025-06-0315:181 reply

            Takes a little adjustment, but you can do quite a bit of good work without Google (or any search).

            Spoken from a fair bit of experience doing software development in closed rooms with strict control of all digital devices (from your phone to your watch) and absolutely no external connections.

            There are moments that are painful still, because you'll be trying to find a thing in a manual and you know a search can get it faster - but it's silly to imply this isn't possible.

            • By scotty79 2025-06-0315:31

              I know it's possible with proper preparation. But I never had the strong need to prepare.

          • By _heimdall 2025-06-0315:09

            I can't say I've ever tried to intentionally not use google when working, unless I want to work offline entirely.

            That said, I only find google results somewhat helpful. It's a lot like LLM code (not surprising, given how they're trained): I may find 5 answers online and only one or two have a small piece of what I need. Ultimately that may save me a bit of time or give me an idea for something I hadn't thought of, but it isn't core to my daily work by any stretch.

        • By a_wild_dandan 2025-06-0315:111 reply

          Which developer jobs aren't dependent on many third parties?

          • By _heimdall 2025-06-0319:13

            There's a difference in being dependent on parts of the job versus the entire job.

            I mostly write JS today and it either runs in browsers (dependencies) or a host like AAwS (dependencies). I use VS Codium and a handful of plugins (dependencies).

            These all help me work efficiently when I'm coding, or help me avoid infrastructure issues that I don't want to deal with. Any one part is replaceable though, and more importantly any one part isn't responsible for doing my entire job of creating and shipping code.

      • By whobre 2025-06-0312:231 reply

        I did code before Google, and I was fine. Yes, it's really convenient, and an LLM would be even more convenient if I could trust it just a little bit more, but it's quite possible to do effective software development without Google.

        • By teeray 2025-06-0315:03

          In 8th grade, I had a little PHP 4 pocket reference book. In classes I didn’t care about, I would have this open inside the book for that class and I would write code for my forums on loose leaf (in a shorthand). I also had printed copies of Mercury Board source code to refer to in the back of my binder. Then I’d get home later, type it in, debug a bit, and have new features :) It’s an entirely alien analog process to modern students, I’m sure, but it was really effective!

      • By zelphirkalt 2025-06-0310:561 reply

        There are many alternatives though. It is not like Google has a search monopoly or office product monopoly, or e-mail provider monopoly. It is quite possible to cut out a lot of Google from one's life, and not even complicated to do that.

        • By pkilgore 2025-06-0311:541 reply

          Is your argument that there are no LLM alternatives?

          • By zelphirkalt 2025-06-0321:49

            Not really, no. Though I would argue that if Google disappeared tomorrow, as a private person you would probably do mostly fine. The point being that your dependence is most likely not that strong, actually. Unless you have important mail arriving only at your gmail mailbox - that would be dangerous. I lost several good accounts on other websites that way in the past. Now I don't register anything important to gmail addresses any longer; in fact, I don't actually use gmail any longer, except for some old accounts that I still haven't migrated away from out of laziness.

    • By 79a6ed87 2025-06-030:58

      >To those of us who like Free Software because of the freedom it gives us, this is a severe regression.

      It's fair to be worried about depending on LLMs. But I find the dependence on things like AWS or Azure more problematic, if we are talking about centralized and proprietary systems.

    • By Aeolun 2025-06-033:112 reply

      It's not like the code is suddenly elsewhere, right? If the LLM disappears I'll be annoyed, not helpless.

      • By nessbot 2025-06-033:33

        Not if the only way you know how to code is vibe coding.

      • By brailsafe 2025-06-0311:00

        Well, I'd think of it like being car-dependent. Sure, plenty of suburbanites know how to walk, they still have feet, but they live somewhere that's designed to only be practically traversable by car. While you've lived that lifestyle, you may have gained weight and lost muscle mass, or developed an intolerance for discomfort to a point where it poses real problems. If you never got a car, or let yourself adapt to life without one, you have to work backwards from that constraint. Likewise with the built environment around us; the cities many people under the age of 40 consider to be "good" are the ones that didn't demolish themselves in the name of highways and automobiles, in which a car only rarely presents what we'd think of as useful technology.

        There are all kinds of trades that the car person and the non-car person makes for better or worse depending on the circumstance. The non-car person may miss out on a hobby, or not know why road trips are neat, but they don't have the massive physical and financial liabilities that come with them. The car person meanwhile—in addition to the aforementioned issues—might forget how to grocery shop in smaller quantities, or engage with people out in the world because they just go from point A to B in their private vessel, but they may theoretically engage in more distant varied activities that the non-car person would have to plan for further in advance.

        Taking the analogy a step further, each party gradually sets different standards for themselves that push the two archetypes into diametrically opposed positions. The non-car owner's life doesn't just not depend on cars, but is often actively made worse by their presence. For the car person, the presence of people, especially those who don't use a car, gradually becomes over-stimulating; cyclists feel like an imposition, people walking around could attack at any moment, even other cars become the enemy. I once knew someone who'd spent his whole life commuting by car, and when he took a new job downtown, had to confront the reality that not only had he never taken the train, he'd become afraid of taking it.

        In this sense, the rise of LLMs does remind me of the rise of frontend frameworks, bootcamps that started with React or React Native, high-level languages, and even things like having great internet; the only people who ask what happens in a less ideal case are the ones who've either dealt with those constraints first-hand, or have tried to simulate them. If you've never been to the countryside, or a forest, or a hotel, you might never consider how your product responds in a poor-connectivity environment, and these are the people who wind up getting lost on basic hiking trails, having assumed that their online map would produce relevant information and always be there.

        Edit: To clarify, in the analogy, it's clear that cars are not intrinsically bad tools and are worthwhile inventions, but had excitement for them been tempered during their rise in commodification and popularity, the feedback loops that ended up all but forcing people to use them in certain regions could have been broken more easily.

    • By keutoi 2025-06-034:311 reply

      I think the same argument could be made about search engines. Most people are not too worried about them.

      • By thayne 2025-06-035:091 reply

        Maybe they should be.

        • By rerdavies 2025-06-0310:261 reply

          You could stop using them, I suppose.

          • By schaefer 2025-06-0317:04

            "Say, have you seen any good websites lately?" Is a conversation I've literally never started.

            And it feels strange, because I am constantly asking people what books they're reading.

    • By marcofloriano 2025-06-0617:47

      Best observation so far. Especially the cost side of using all those APIs... I pay in dollars but earn in reais (Brazil); the cost scares me.

    • By benced 2025-06-0319:06

      You can run LLMs locally pretty easily, especially if you have a Mac (the unified memory architecture of Macs is really good at this). It's a niche thing but caring about Free Software is niche.

    • By rco8786 2025-06-0320:31

      > Unless you can run the LLM locally, on a computer you own, you are now completely dependent on a remote centralized system to do your work.

      To be fair, the entire internet is basically this already.

    • By sanex 2025-06-0322:26

      You think an LLM provider has a bigger moat than an IDE (say, pre-VS Code, for a better parallel)? MSDN and JetBrains licenses are far more expensive than Cursor or Windsurf.

    • By nkotov 2025-06-0317:05

      The truth is that the majority of people do not care about this. It's why AWS exists. It's why Fly.io exists.

    • By ImaCake 2025-06-0223:29

      >the dependence.

      Sure, but that is not the point of the article. LLMs are useful. The fact that you are dependent on someone else is a different problem, like being dependent on Microsoft for your office suite.

    • By Flemlo 2025-06-0313:481 reply

      I think 384 GB of RAM is surprisingly reasonable, tbh.

      $200-300/month is already $7-11k over 3 years.

      And I do expect some hardware, chip-based models in a few years, like a GPU: an 'AI PU' where you can replace the hardware AI chip.

      • By BoiledCabbage 2025-06-0316:493 reply

        > I think 384 GB of RAM is surprisingly reasonable, tbh.

        > $200-300/month is already $7-11k over 3 years.

        Except at current crazy rates of improvement, cloud-based models will in reality likely be ~50x better, and you'll still have the same system.

        • By simonw 2025-06-0317:43

          I've had the same system (M2 64GB MacBook Pro) for three years.

          2.5 years ago it could just about run LLaMA 1, and that model sucked.

          Today it can run Mistral Small 3.1, Gemma 3 27B, Llama 3.3 70B - same exact hardware, but those models are competitive with the best available cloud-hosted model from two years ago (GPT-4).

          The best hosted models (o3, Claude 4, Gemini 2.5 etc) are still way better than the best models I can run on my 3-year-old laptop, but the rate of improvements for those local models (on the same system) has been truly incredible.

        • By Flemlo 2025-06-0317:34

          I'm surprised that it's even possible to run big models locally.

          I agree, we will see how this plays out, but I hope models might start to become more efficient, and then it might not matter that much to run some parts locally for certain things.

          I could imagine an LLM with far fewer natural languages, optimized for one programming language. Like "generate your own model".

        • By bravesoul2 2025-06-0321:07

          Yes, LLMs are a funny workload. They require high amounts of processing but are very bursty.

          Therefore using your own bare metal means a lot of expensive redundancy.

          A cloud provider can keep the GPU utilised to make it pay. They can also subsidise it with VC money :)

    • By imhoguy 2025-06-0310:502 reply

      Even FOSS-based development depends on walled gardens, as is evident every time GitHub is down.

      • By zelphirkalt 2025-06-0310:53

        Sensibly hosted FOSS doesn't go to GitHub for hosting though. There are other options for people who care. I personally like Codeberg.

      • By neop1x 2025-06-0312:461 reply

        IMO GitHub doesn't matter for FOSS because you have a lot of local clones; it won't disappear forever if GitHub goes down or deletes the repo there. Self-hosted alternatives are not 100% up either. And I actually find the collaboration functions / easy PR contribution on GitHub highly beneficial. At the same time, I hate the friction of all those private GitLabs, Giteas, or, God forbid, gitweb.

        • By skydhash 2025-06-0320:43

          > And I actually find collaboration functions / easy PR contribution on Github highly beneficial. At the same time I hate the friction of all those private Gitlabs, Giteas or, God forbid, gitweb.

          FOSS is more about:

          1. Finding some software you can use for your problem

          2. Have an issue for your particular use case

          3. Download the code and fix the issue.

          4. Clean up the patch and send a proposal to the maintainer. A PR is easy, but email is OK. You can even use a pastebin service and post it on a forum (suckless does that in part).

          5. The maintainer merges the patch and you can revert to the official version, or they don't and you decide to go with your fork.

    • By peab 2025-06-0320:04

      Do you use GitHub? VS Code? GCP or AWS? The internet, perhaps? All work is dependent on other services; that's the modern world.

    • By mrheosuper 2025-06-032:512 reply

      I disagree.

      Self-hosting has always had a lot of drawbacks compared with commercial solutions. I bet my self-hosted file server has worse reliability than Google Drive, and my self-hosted git server handles fewer concurrent users than GitHub.

      It's one thing you must accept when self-hosting.

      So when you self-host an LLM, you must either accept a drop in output quality or spend a small fortune on hardware.

      • By kortilla 2025-06-033:28

        Those aren’t good analogies because it costs nearly nothing to make that availability tradeoff and run things on your computer for your own fun.

        Raspberry Pi was a huge step forward; the move to LLMs is two steps back.

    • By wiseowise 2025-06-036:103 reply

      Wake up, you’re already dependent on everything, unless you stick exclusively to the Python standard library and no outside batteries.

      Maven Central is gone and you have no proxy set up, or your local cache is busted? Poof, you’re fucking gone; all your Springs, Daggers, Quarkuses, and every third-party crap that makes up your program is gone. The same applies to the bazillion JS and Rust libraries.

      • By pxnicksm 2025-06-037:06

        There are multiple organizations with mirrors for packages, and I doubt the cost of a mirror is the same as the cost of a 384 GB memory server.

        A guy here says you need 4 TB for a PyPI mirror, 285 GB for npm:

        https://stackoverflow.com/questions/65995150/is-it-possible-...

      • By wolvesechoes 2025-06-037:391 reply

        If PyPI goes down and I cannot use NumPy, I can still roll out my own implementation of a linear algebra library, because I've got the required knowledge. And I've got it because I had to learn it instead of relying on LLMs.
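
        For the basics, that's genuinely not much code; a minimal, unoptimized matrix multiply in pure Python is something like this sketch:

            def matmul(a, b):
                # Naive triple loop; assumes len(a[0]) == len(b).
                rows, inner, cols = len(a), len(b), len(b[0])
                return [[sum(a[i][k] * b[k][j] for k in range(inner))
                         for j in range(cols)]
                        for i in range(rows)]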

        • By roywiggins 2025-06-0314:01

          I definitely couldn't, because I just use NumPy. Maybe I could roll my own matrix multiplication, but anything more would require cracking open a textbook that I haven't looked at for a decade. And this was true before I touched an LLM.

      • By miloignis 2025-06-037:43

        Panamax works great for mirroring all of crates.io in 300-400GB, which is big but easily small enough for enthusiasts. I've got it on an external USB drive myself, and it's saved my bacon a few times.

        We're not yet at that same point for the performance of local LLM models, afaict, though I do enjoy messing around with them.

  • By gdubs 2025-06-0221:1819 reply

    One thing that I find truly amazing is just the simple fact that you can now be fuzzy with the input you give a computer, and get something meaningful in return. Like, as someone who grew up learning to code in the 90s, it always seemed like science fiction that we'd get to a point where you could give a computer some vague, human-level instructions and have it more or less do what you want.

    • By forgotoldacc 2025-06-032:328 reply

      There's the old quote from Babbage:

      > On two occasions I have been asked, 'Pray, Mr. Babbage, if you put into the machine wrong figures, will the right answers come out?' I am not able rightly to apprehend the kind of confusion of ideas that could provoke such a question.

      This has been an obviously absurd question for two centuries now. Turns out the people asking that question were just visionaries ahead of their time.

      It is kind of impressive how I'll ask for some code in the dumbest, vaguest, sometimes even wrong way, but as long as I have the proper context built up, I can get something pretty close to what I actually wanted. Though I still have problems where I can ask as precisely as possible and get things not even close to what I'm looking for.

      • By kibwen 2025-06-0312:582 reply

        > This has been an obviously absurd question for two centuries now. Turns out the people asking that question were just visionaries ahead of their time.

        This is not the point of that Babbage quote, and no, LLMs have not solved it, because it cannot be solved: "garbage in, garbage out" is a fundamental observation about the limits of logic itself, having more to do with the laws of thermodynamics than with programming. The output of a logical process cannot be more accurate than the inputs to that process; you cannot conjure information out of the ether. The LLM isn't the logical process in this analogy, it's one of the inputs.

        • By rcxdude 2025-06-0313:182 reply

          At a fundamental level, yes, and even in human-to-human interaction this kind of thing happens all the time. The difference is that humans are generally quite good at resolving most ambiguities and contradictions in a request correctly and implicitly (sometimes surprisingly bad at doing so explicitly!). Which is why human language tends to be more flexible and expressive than programming languages (but bad at precision). LLMs basically can do some of the same thing, so you don't need to specify all the 'obvious' implicit details.

          • By kibwen 2025-06-0313:571 reply

            The Babbage anecdote isn't about ambiguous inputs, it's about wrong inputs. Imagine wanting to know the answer to 2+2, so you go up to the machine and ask "What is 3+3?", expecting that it will tell you what 2+2 is.

              Adding an LLM as input to this process (along with an implicit acknowledgement that you're uncertain about your inputs) might produce a response "Are you sure you didn't mean to ask what 2+2 is?", but that's because the LLM is a big ball of likelihoods and it's more common to ask for 2+2 than for 3+3. But it's not magic; the LLM cannot operate on information that it was not given. Rather, a lot of the information that it has was given to it during training. It's no more a breakthrough of fundamental logic than Google showing you results for "air fryer" when you type in "air frier".

            • By simonask 2025-06-0315:491 reply

              I think the point they’re making is that computers have traditionally operated with an extremely low tolerance for errors in the input, where even minor ambiguities that are trivially resolved by humans by inferring from context can cause vastly wrong results.

              We’ve added context, and that feels a bit like magic coming from the old ways. But the point isn’t that there is suddenly something magical, but rather that the capacity for deciphering complicated context clues is suddenly there.

              • By skydhash 2025-06-0320:58

                > computers have traditionally operated with an extremely low tolerance for errors in the input

                That's because someone has gone out of their way to mark those inputs as errors, because they make no sense. The CPU itself has no qualms doing 'A' + 10, because what it actually sees is 01000001 (65) and 00001010 (10) as the inputs to its 8-bit adder circuit. It will output 01001011 (75), which will be displayed as 75 or 'K' or whatever, depending on the code that runs afterwards. But generally the operation is nonsense, so someone will mark it as an error somewhere.
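
                To make that concrete, here's a throwaway Go sketch (purely illustrative, not from the original comment):

                  package main

                  import "fmt"

                  func main() {
                      c := 'A' + 10          // 65 + 10: to the adder, both are just integers
                      fmt.Println(c)         // 75
                      fmt.Println(string(c)) // "K": the same bits reinterpreted as a character
                  }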

                So errors are a way to let you know that what you're asking is nonsense according to the rules of the software. Like removing a file you do not own. Or accessing a web page that does not exist. But as you've said, we can now rely on more accurate heuristics to propose alternative solutions. The issue is when the machine goes off and actually computes the wrong information.

          • By Kye 2025-06-0313:43

            Handing an LLM a file and asking it to extract data out of it with no further context or explanation of what I'm looking for with good results does feel a bit like the future. I still do add context just to get more consistent results, but it's neat that LLMs handle fuzzy queries as well as they do.

        • By make3 2025-06-0616:22

          in this case the LLM uses context clues and commonality priors to find the closest correct input, which is definitely relevant

      • By CobrastanJorji 2025-06-033:441 reply

        We wanted to check the clock at the wrong time but read the correct time. Since a broken clock is right twice a day, we broke the clock, which solves our problem some of the time!

        • By pca006132 2025-06-0311:222 reply

          The nice thing is that a fully broken clock is accurate more often than a slightly deviated clock.
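
          (Back-of-the-envelope, assuming a 12-hour dial: a stopped clock is right every 12 hours, i.e. twice a day. A clock running 1% slow has to drift a full 12 hours before it reads correctly again, which takes 12 / 0.01 = 1200 hours, roughly 50 days. The stopped clock is right about 100 times as often.)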

          • By antifa 2025-06-0415:24

            A clock that's 5 seconds, 5 minutes, or 5 hours ahead, or that counts an hour as 61 minutes, is still more useful than a clock that does not move its hands at all.

          • By teddyh 2025-06-0320:44

            Only if the deviated clock is fast. If a clock is, instead, slow, it is correct more often than a stopped clock.

      • By meowface 2025-06-034:26

        It is fun to watch. I've sometimes indeed seen the LLM say something like "I'm assuming you meant [X]".

      • By nitwit005 2025-06-035:20

        It's very impressive that I can type misheard song lyrics into Google, and yet still have the right song pop up.

        But, having taken a chance to look at the raw queries people type into apps, I'm afraid neither machine nor human is going to make sense of a lot of it.

      • By CrimsonRain 2025-06-0312:11

        theseday,s i ofen donot correct my typos even wheni notice them while cahtting with LLMS. So far 0 issues.

      • By ivape 2025-06-038:05

        We're talking about the God function.

        function God(...anyParamYouCanThinkOf) {

        }

      • By jajko 2025-06-0317:19

        Well, you can enter 4-5 relatively vague keywords into google and the first or second stackoverflow link will probably provide plenty of relevant code. Given that, it's much less impressive, since >95% of the problems and queries just keep repeating.

      • By godelski 2025-06-034:523 reply

        How do you know the code is right?

        • By fsloth 2025-06-037:011 reply

          The program behaves as you want.

          No, really - there is tons of potentially value-adding code that can be of throwaway quality just as long as it’s zero effort to write it.

          Design explorations, refactorings, etc. etc.

          • By godelski 2025-06-039:363 reply

            And how do you know it behaves like you want?

            This is a really hard problem when I write every line and have the whole call graph in my head. I have no clue how you think this gets easier by knowing less about the code

            • By theshrike79 2025-06-0310:351 reply

              Tests pretty much. Not a silver bullet for everything, but works for many cases.

              Unless you're a 0.1% coder, your mental call graph can't handle every corner case perfectly anyway, so you need tests too.
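
              As a concrete sketch of what that looks like (in Go, with a hypothetical function and hand-picked cases, not from the comment above):

                package main

                import (
                    "regexp"
                    "strings"
                    "testing"
                )

                // slugify is a stand-in for whatever (possibly LLM-written) function is under test.
                func slugify(s string) string {
                    s = strings.ToLower(strings.TrimSpace(s))
                    s = regexp.MustCompile(`[^a-z0-9]+`).ReplaceAllString(s, "-")
                    return strings.Trim(s, "-")
                }

                func TestSlugify(t *testing.T) {
                    cases := []struct{ in, want string }{
                        {"Hello, World!", "hello-world"},
                        {"  spaces  ", "spaces"},
                        {"", ""}, // the corner case a mental call graph tends to miss
                    }
                    for _, c := range cases {
                        if got := slugify(c.in); got != c.want {
                            t.Errorf("slugify(%q) = %q, want %q", c.in, got, c.want)
                        }
                    }
                }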

              • By godelski 2025-06-0317:06

                No one is saying you shouldn't write tests. But we are saying TDD is dumb.

                Actually, for exactly the reasons you mention: I'm not dumb enough to believe I'm a genius. I'll always miss something. So I can't rely on my tests to ensure correctness. It takes deeper thought and careful design.

            • By fsloth 2025-06-0311:041 reply

              By using the program? Mind you this works only for _personal_ tools where it’s intuitively obvious when something is wrong.

              For example

              ”Please create a viewer for geojson where i can select individual feature polygons and then have button ’export’ that exports the selected features to a new geojson”

              1. You run it
              2. It shows the json and visualizes selections
              3. The exported subset looks good

              I have no idea how anyone could keep the callgraph of even a minimal gui application in their head. If you can then congratulations, not all of us can!

              • By godelski 2025-06-0322:211 reply

                Great, I used my program and everything seems to be working as expected.

                Not great, somebody else used my program and they got root on my server...

                  > I have no idea how anyone could keep the callgraph of even a minimal gui application in their head
                
                Practice.

                Lots and lots of practice.

                Write it down. Do things the hard way. Build the diagrams by hand and make sure you know what's going on. Trace programs. Pull out the debugger! Pull out the profiler!
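
                For instance, Go ships a profiler whose output renders as a call graph (a minimal sketch, assuming a Go codebase; `work` is a stand-in for your real code):

                  package main

                  import (
                      "log"
                      "os"
                      "runtime/pprof"
                  )

                  func main() {
                      // Capture a CPU profile of the code you want to understand;
                      // `go tool pprof -web cpu.out` then draws the observed call graph.
                      f, err := os.Create("cpu.out")
                      if err != nil {
                          log.Fatal(err)
                      }
                      if err := pprof.StartCPUProfile(f); err != nil {
                          log.Fatal(err)
                      }
                      defer pprof.StopCPUProfile()

                      work() // the program under investigation
                  }

                  func work() {
                      total := 0
                      for i := 0; i < 100_000_000; i++ {
                          total += i
                      }
                      _ = total
                  }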

                If you do those things, you too will gain that skill. Obviously you can't do this for a giant program but it is all about the resolution of your call graph anyways.

                If you are junior, this is the most important time to put in that work. You will get far more from it than you lose. If you're further along, well the second best time to plant a tree is today.

                • By fsloth 2025-06-048:46

                  ”not great, somebody else used my program and they got root on my server...”

                  In general, security-sensitive software is the worst possible place to use LLMs, based on public case studies and anecdata, exactly for this reason.

                  ”Do it the hard way”

                  Yes that’s generally the way I do it as well when I need to reliably understand something but it takes hours.

                  The cadence with LLM-driven experiments is usually under an hour. That’s the biggest boon for me - I get a new tool and can focus on the actual work I’m delivering, with some step now taking slightly less time.

                  For example, I’m happy using vim without ever having read the code or debugged it, much less having observed its callgraph. I’m similarly content using LLM-generated utilities without much oversight. I would never push code like that to production, of course.

            • By etherealG 2025-06-0415:101 reply

              how do you know what you want if you didn't write a test for it?

              I'm afraid what you want is often totally unclear until you start to use a program and realize that what you want is either what the program is doing, or it isn't and you change the program.

              MANY programs are made this way, I would argue all of them actually. Some of the behaviour of the program wasn't imagined by the person making it, yet it is inside the code... it is discovered, as bugs, as hidden features, etc.

              Why are programmers so obsessed with the idea that not knowing every part of the way a program runs means we can't use the program? I would argue you already don't, or you are writing programs so fundamentally trivial as to be useless anyway.

              LLM written code is just a new abstraction layer, like Python, C, Assembly and Machine Code before it... the prompts are now the code. Get over it.

              • By godelski 2025-06-0420:28

                  > how do you know what you want if you didn't write a test for it?
                
                You have that backwards.

                How do you know what to test if you don't know what you want?

                I agree with you though, you don't always know what you want when you set out. You can't just factorize your larger goal into unit tests. That's my entire point.

                You factorize by exploration. By play. By "fuck around and find out". You have to discover the factorization.

                And that, is a very different paradigm than TDD. Both will end with tests, and frankly, the non TDD paradigm will likely end up with more tests with better coverage.

                  > Why are programmers so obsessed that not knowing every part of the way a program runs means we can't use the program?
                
                I think you misunderstand. I want to compare it to something else. There's a common saying "don't let perfection be the enemy of good (enough)". I think it captures what you're getting at, or is close enough.

                The problem with that saying is that most people don't believe in perfection[0]. The problem is, perfection doesn't exist. So the saying ends up being a lazy thought terminator instead of addressing the real problem: determining what is good enough.

                In fact, no one knows every part of even a trivial program. We can always introduce more depth and complexity until we reach the limits of our physics models and so no one knows. Therefore, you'll have to reason it is not about perfection.

                I think you are forgetting why we program in the first place. Why we don't just use natural language. It's the same reason we use math in science. Not because math is the language of the universe but rather that math provides enough specificity to be very useful in describing the universe.

                This isn't about abstraction. This is about specification.

                It's the same problem with where you started. The customer can't tell my boss their exact requirements and my boss can't perfectly communicate to me. Someone somewhere needs to know a fair amount of details and that someone needs to be very trustworthy.

                I'll get over it when the alignment problem is solved to a satisfactory degree. Perfection isn't needed, but we still have to discuss what is good enough and what is not.

                [0] likely juniors. And it should be beat out of them. Kindly

        • By ic_fly2 2025-06-034:572 reply

          The LLM generated unit tests pass. Obviously!

          • By godelski 2025-06-0317:08

            It seems most people are giving this answer, but without the sarcasm...

          • By lazide 2025-06-036:18

            Just don’t look at the generated unit tests, and we’re fine.

        • By dkdbejwi383 2025-06-036:591 reply

          If customers don’t complain it must be working

          • By godelski 2025-06-039:41

            You don't hear the complaints. That's different than no complaints. Trust me, they got them.

            I've got plenty of complaints about Apple, Google, Netflix, and everyone else. Shit that could be fixed with just a fucking regex. Here's an example: my gf is duplicated in my Apple contacts. It can't find the duplicate, despite the same name, nickname, phone number, email, and birthday. Which is why there are three entries on my calendar for her birthday. Guess what happened when I manually merged? She now has 4(!!!!!) entries!! How the fuck does merging increase the count!

            Trust me, they complain, you just don't listen
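
            (For scale: the kind of exact-field duplicate check being described is a screenful of ordinary code. A hypothetical Go sketch, with invented field names:)

              package main

              import (
                  "fmt"
                  "strings"
              )

              type Contact struct {
                  Name, Phone, Email string
              }

              // key normalizes the fields two duplicates share: case/whitespace
              // for name and email, digits only for the phone number.
              func key(c Contact) string {
                  digits := strings.Map(func(r rune) rune {
                      if r >= '0' && r <= '9' {
                          return r
                      }
                      return -1 // drop everything that isn't a digit
                  }, c.Phone)
                  return strings.ToLower(strings.TrimSpace(c.Name)) + "|" + digits + "|" +
                      strings.ToLower(strings.TrimSpace(c.Email))
              }

              func dedupe(contacts []Contact) []Contact {
                  seen := map[string]bool{}
                  var out []Contact
                  for _, c := range contacts {
                      if k := key(c); !seen[k] {
                          seen[k] = true
                          out = append(out, c)
                      }
                  }
                  return out
              }

              func main() {
                  got := dedupe([]Contact{
                      {"Jane Doe", "555-0100", "jane@example.com"},
                      {"jane doe ", "(555) 0100", "Jane@example.com"},
                  })
                  fmt.Println(len(got)) // 1: the near-duplicate is merged away
              }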

    • By csallen 2025-06-0221:227 reply

      It's mind blowing. At least 1-2x/week I find myself shocked that this is the reality we live in

      • By malfist 2025-06-0221:4517 reply

        Today I had a dentist appointment and the dentist suggested I switch toothpaste lines to see if something else works for my sensitivity better.

        I am predisposed to canker sores, and if I use a toothpaste with SLS in it I'll get them. But a lot of the SLS-free toothpastes are new-age hippy stuff and are also fluoride free.

        I went to chatgpt and asked it to suggest a toothpaste that was both SLS free and had fluoride. Pretty simple ask right?

        It came back with two suggestions. Its top suggestion had SLS; its backup suggestion lacked fluoride.

        Yes, it is mind blowing the world we live in. Executives want to turn our code bases over to these tools

        • By Game_Ender 2025-06-0223:496 reply

          What model and query did you use? I used the prompt "find me a toothpaste that is both SLS free and has fluoride" and both GPT-4o [0] and o4-mini-high [1] gave me correct first answers. The 4o answer used the newish "show products inline" feature, which made it easier to jump to each product and check it out (I am putting aside my fear that this feature will end up killing their web product with monetization).

          0 - https://chatgpt.com/share/683e3807-0bf8-800a-8bab-5089e4af51...

          1 - https://chatgpt.com/share/683e3558-6738-800a-a8fb-3adc20b69d...

          • By wkat4242 2025-06-032:202 reply

            The problem is that the same prompt will yield good results one time and bad results another. "Get better at prompting" is often just an excuse for AI hallucination. Better prompting can help, but often the prompt is totally fine; the tech is just not there yet.

            • By Workaccount2 2025-06-034:521 reply

              While this is true, I have seen this happen enough times to confidently bet all my money that OP will not return and post a link to their incorrect ChatGPT response.

              Seemingly basic asks that LLMs consistently get wrong have lots of value to people because they serve as good knowledge/functionality tests.

            • By Aeolun 2025-06-033:024 reply

              If you want a correct answer the first time around, and give up if you don't get it, even if you know the thing can give it to you with a bit more effort (but still less effort than searching yourself), don't you think that's a user problem?

              • By 3eb7988a1663 2025-06-033:344 reply

                If you are genuinely asking a question, how are you supposed to know the first answer was incorrect?

                • By leoedin 2025-06-035:172 reply

                  I briefly got excited about the possibility of local LLMs as an offline knowledge base. Then I tried asking Gemma for a list of the tallest buildings in the world and it just made up a bunch. It even provided detailed information about the designers, year of construction etc.

                  I still hope it will get better. But I wonder if an LLM is the right tool for factual lookup - even if it is right, how do I know?

                  I wonder how quickly this will fall apart as LLM content proliferates. If it’s bad now, how bad will it be in a few years when there’s loads of false but credible LLM generated blogspam in the training data?

                  • By galaxyLogic 2025-06-038:02

                    That's the beauty of using AI to generate code: All code is "fictional".

                  • By mulmen 2025-06-0320:30

                    > I wonder how quickly this will fall apart as LLM content proliferates. If it’s bad now, how bad will it be in a few years when there’s loads of false but credible LLM generated blogspam in the training data?

                    There is already misinformation online so only the marginal misinformation is relevant. In other words do LLMs generate misinformation at a higher rate than their training set?

                    For raw information retrieval from the training set misinformation may be a concern but LLMs aren’t search engines.

                    Emergent properties don’t rely on facts. They emerge from the relationship between tokens. So even if an LLM is trained only on misinformation abilities may still emerge at which point problem solving on factual information is still possible.

                • By socalgal2 2025-06-034:253 reply

                  The person that started this conversation verified the answers were incorrect. So it sounds like you just do that: check the results. If they turn out to be false, tell the LLM, or make sure you're not on a bad one. It's still likely to be faster than searching yourself.

                  • By mtlmtlmtlmtl 2025-06-035:203 reply

                    That's all well and good for this particular example. But in general, the verification can often be so much work it nullifies the advantage of the LLM in the first place.

                    Something I've been using perplexity for recently is summarizing the research literature on some fairly specific topic (e.g. the state of research on the use of polypharmacy in treatment of adult ADHD). Ideally it should look up a bunch of papers, look at them and provide a summary of the current consensus on the topic. At first, I thought it did this quite well. But I eventually noticed that in some cases it would miss key papers and therefore provide inaccurate conclusions. The only way for me to tell whether the output is legit is to do exactly what the LLM was supposed to do; search for a bunch of papers, read them and conclude on what the aggregate is telling me. And it's almost never obvious from the output whether the LLM did this properly or not.

                    The only way in which this is useful, then, is to find a random, non-exhaustive set of papers for me to look at (since the LLM also can't be trusted to accurately summarize them). Well, I can already do that with a simple search in one of the many databases for this purpose, such as pubmed, arxiv etc. Any capability beyond that is merely an illusion. It's close, but no cigar. And in this case close doesn't really help reduce the amount of work.

                    This is why a lot of the things people want to use LLMs for require a "definiteness" that's completely at odds with the architecture. The fact that LLMs are good at pretending to do it well only serves to distract us from addressing the fundamental architectural issues that need to be solved. I don't think any amount of training of a transformer architecture is gonna do it. We're several years into trying that and the problem hasn't gone away.

                    • By csallen 2025-06-0316:151 reply

                      > The only way for me to tell whether the output is legit is to do exactly what the LLM was supposed to do; search for a bunch of papers, read them and conclude on what the aggregate is telling me. And it's almost never obvious from the output whether the LLM did this properly or not.

                      You're describing a fundamental and inescapable problem that applies to literally all delegated work.

                      • By mtlmtlmtlmtl 2025-06-0318:401 reply

                        Sure, if you wanna be reductive, absolutist and cynical about it. What you're conveniently leaving out though is that there are varying degrees of trust you can place in the result depending on who did it. And in many cases with people, the odds they screwed it up are so low they're not worth considering. I'm arguing LLMs are fundamentally and architecturally incapable of reaching that level of trust, which was probably obvious to anyone interpreting my comment in good faith.

                        • By csallen 2025-06-0322:58

                          I think what you're leaving out is that what you're applying to people also applies to LLMs. There are many people you can trust to do certain things but can't trust to do others. Learning those ropes requires working with those people repeatedly, across a variety of domains. And you can save yourself some time by generalizing people into groups, and picking the highest-level group you can in any situation, e.g. "I can typically trust MIT grads on X", "I can typically trust most Americans on Y", "I can typically trust all humans on Z."

                          The same is true of LLMs, but you just haven't had a lifetime of repeatedly working with LLMs to be able to internalize what you can and can't trust them with.

                          Personally, I've learned more than enough about LLMs and their limitations that I wouldn't try to use them to do something like make an exhaustive list of papers on a subject, or a list of all toothpastes without a specific ingredient, etc. At least not in their raw state.

                          The first thought that comes to mind is that a custom LLM-based research agent equipped with tools for both web search and web crawl would be good for this, or (at minimum) one of the generic Deep Research agents that's been built. Of course the average person isn't going to think this way, but I've built multiple deep research agents myself, and have a much higher understanding of the LLMs' strengths and limitations than the average person.

                          So I disagree with your opening statement: "That's all well and good for this particular example. But in general, the verification can often be so much work it nullifies the advantage of the LLM in the first place."

                          I don't think this is a "general problem" of LLMs, at least not for anyone who has a solid understanding of what they're good at. Rather, it's a problem that comes down to understanding the tools well, which is no different than understanding the people we work with well.

                          P.S. If you want to make a bunch of snide assumptions and insults about my character and me not operating in good faith, be my guest. But in return I ask you to consider whether or not doing so adds anything productive to an otherwise interesting conversation.

                    • By lazide 2025-06-036:211 reply

                      Yup, and worse since the LLM gives such a confident sounding answer, most people will just skim over the ‘hmm, but maybe it’s just lying’ verification check and move forward oblivious to the BS.

                      • By fennecbutt 2025-06-039:462 reply

                        People did this before LLMs anyway. Humans are selfish, apathetic creatures and unless something pertains to someone's subject of interest the human response is "huh, neat. I didn't know dogs could cook pancakes like that" then scroll to the next tiktok.

                        This is also how people vote, apathetically and tribally. It's no wonder the world has so many fucking problems, we're all monkeys in suits.

                        • By malfist 2025-06-0412:17

                          Sure, but there's degrees in the real world. Do people sometimes spew bullshit (hallucinate) at you? Absolutely. But LLMs, that's all they do. They make bullshit and spew it. That's their default state. They're occasionally useful despite this behavior, but it doesn't mean that they're not still bullshitting you

                        • By lazide 2025-06-0310:09

                          I think that’s my point. It enables exactly the worst behavior in the worst way, knowledge-wise.

                    • By Tarq0n 2025-06-039:08

                      I'd be very interested in hearing what conclusions you came to in your research, if you're willing to share.

                  • By lechatonnoir 2025-06-035:181 reply

                    I somehow can't reply to your child comment.

                    It depends on whether the cost of search or of verification dominates. When searching for common consumer products, yeah, this isn't likely to help much, and in a sense the scales are tipped against the AI for this application.

                    But if search is hard and verification is easy, even a faulty faster search is great.

                    I've run into a lot of instances with Linux where some minor, low level thing has broken and all of the stackexchange suggestions you can find in two hours don't work and you don't have seven hours to learn about the Linux kernel and its various services and their various conventions in order to get your screen resolutions correct, so you just give up.

                    Being in a debug loop in the most naive way with Claude, where it just tells you what to try and you report the feedback and direct it when it tunnel visions on irrelevant things, has solved many such instances of this hopelessness for me in the last few years.

                    • By skydhash 2025-06-0316:151 reply

                      So instead of spending seven hours to get at least an understanding of how the Linux kernel works and how various user-land programs interact, you've decided to spend years fumbling in the dark and trying stuff every time an issue arises?

                      • By lechatonnoir 2025-06-0318:051 reply

                        I would like to understand how you ideally imagine a person solving issues of this type. I'm for understanding things instead of hacking at them in general, and this tendency increases the more central the things to understand are to the things you like to do. However, it's a point of common agreement that just in the domain of computer-related tech, there is far more to learn than a person can possibly know in a lifetime, and so we all have to make choices about which ones we want to dive into.

                        I do not expect to go through the process I just described for more than a few hours a year, so I don't think the net loss to my time is huge. I think that the most relevant counterfactual scenario is that I don't learn anything about how these things work at all, and I cope with my problem being unfixed. I don't think this is unusual behavior, to the degree that it's, I think, a common point of humor among Linux users: https://xkcd.com/963/ https://xkcd.com/456/

                        This is not to mention issues that are structurally similar (in the sense that search is expensive but verification is cheap, and the issue is generally esoteric so there are reduced returns to learning) but don't necessarily have anything to do with the Linux kernel: https://github.com/electron/electron/issues/42611

                        I wonder if you're arguing against a strawman that thinks that it's not necessary to learn anything about the basic design/concepts of operating systems at all. I think knowledge of it is fractally deep and you could run into esoterica you don't care about at any level, and as others in the thread have noted, at the very least when you are in the weeds with a problem the LLM can often (not always) be better documentation than the documentation. (Also, I actually think that some engineers do on a practical level need to know extremely little about these things and more power to them, the abstraction is working for them.)

                        Holding what you learn constant, it's nice to have control about in what order things force you to learn them. Yak-shaving is a phenomenon common enough that we have a term for it, and I don't know that it's virtuous to know how to shave a yak in-depth (or to the extent that it is, some days you are just trying to do something else).

                        • By skydhash 2025-06-0319:29

                          More often than not, the actual implementation is more complex than the theory that outlines it (think of the Turing Machine and today's computers). Mostly because the implementation is often the intersection of several theories spanning multiple domains. Going at a problem as a whole means trying to solve multiple equations with a lot of variables, and that's an impossible task for most. Learning about all the domains is also a daunting task (and probably fruitless, as you've explained).

                          But knowing the domains involved and some basic facts about them is easy, and more than enough to quickly know where to do a deep dive, instead of relying on LLMs that just give a plausible mashup of what was in their training data (which is not always truthful).

                  • By insane_dreamer 2025-06-034:58

                    > It still likely to be faster than searching yourself.

                    No, not if you have to search to verify their answers.

                • By worthless-trash 2025-06-034:02

                  This is the right question.

                • By graphememes 2025-06-034:07

                  scientific method??

              • By 0points 2025-06-037:131 reply

                > don't you think that's a user problem?

                If the product doesn't work as advertised, then it's a problem with the product.

                • By xtracto 2025-06-0313:58

                  I still remember when Altavista.digital and excite.com were brand new. They were revolutionary and very useful, even if they couldn't find results for all the prompts we made.

              • By rsynnott 2025-06-039:26

                I am unconvinced that searching for this yourself is actually more effort than repeatedly asking the Mighty Oracle of Wrongness and cross-checking its utterances.

          • By malfist 2025-06-0316:10

            You say it's successful, but your second prompt's answer is all kinds of wrong.

            The first product suggestion, `Tom’s of Maine Anticavity Fluoride Toothpaste`, doesn't exist.

            The closest thing is Tom's of Maine Whole Care Anticavity Fluoride Toothpaste, which DOES contain SLS. All of Tom's of Maine formulations without SLS do not contain fluoride, and all their fluoride formulations contain SLS.

            The next product it suggests is "Hello Fluoride Toothpaste" - again, not a real product. There is a company called "Hello" that makes toothpastes, but they don't have a product called "Hello Fluoride Toothpaste", nor do the "e.g." items exist.

            The third product is real and what I actually use today.

            The fourth product is real, but it doesn't contain fluoride.

            So, rife with made up products, and close matches don't fit the bill for the requirements.

          • By jvanderbot 2025-06-030:102 reply

            This is the thing that gets me about LLM usage. They can be amazing, revolutionary tech, and yes, they can also be nearly impossible to use right. The claim that they are going to replace this or that is hampered by the fact that very real skill is required (at best) or that they just won't work most of the time (at worst). Yes, there are examples of amazing things, but the majority of output from the majority of users seems to be junk, and the messaging is designed around FUD and FOMO.

            • By mediaman 2025-06-031:452 reply

              Just like some people who wrote long sentences into Google in 2000 and complained it was a fad.

              Meanwhile the rest of the world learned how to use it.

              We have a choice. Ignore the tool or learn to use it.

              (There was lots of dumb hype then, too; the sort of hype that skeptics latched on to to carry the burden of their argument that the whole thing was a fad.)

              • By spaqin 2025-06-032:451 reply

                Arguably, the people who typed long sentences into Google have won; the people who learned how to use it early on with specific keywords now get meaningless results.

                • By HappMacDonald 2025-06-034:511 reply

                  Nah, both keywords and long sentences get meaningless results from Google these days (including their falsely authoritative Bard claims).

                  I view Bard as a lot like the yes-man lackey that tries to pipe in to every question early, either cheating off others' work or, even more frequently, failing to accurately cheat off others' work, largely in hopes that you'll be in too much of a hurry to notice, mistake its voice for that of another (e.g., mistake the AI breakdown for a first-hit result snippet), and faceplant as a result of its faulty intel.

                  Gemini gets me relatively decent answers... but only after 60 seconds of CoT. Bard answers in milliseconds, and its lack of effort really shows through.

                  • By Filligree 2025-06-0310:511 reply

                    Just to nitpick: The AI results on google search are Magi (a much smaller model), not Gemini.

                    And definitely not Bard, because that no longer exists, to my annoyance. It was a much better name.

                    • By johnecheck 2025-06-0313:43

                      That was a pretty funny little maneuver from Google.

                      Google: Look at our new chatbot! It's called Bard, and it's going to blow ChatGPT out of the water!

                      Bard: Hallucinates JWST achievements when prompted for an ad.

                      Google: Doesn't fact check, posts the ad

                      Alphabet stock price: Drops 16% in a week

                      Google: Look at our new chatbot! It's called Gemini, and it's going to blow ChatGPT out of the water!

              • By windexh8er 2025-06-032:311 reply

                > Meanwhile the rest of the world learned how to use it.

                Very few people "learned how to use" Google, and in fact many still use it rather ineffectively. This is not the same paradigm shift.

                ChatGPT is not a technology most will learn how to use effectively, either. Just like with Google, they will simply ask it to find them an answer. But the world of LLMs is far broader, with more implications. I don't weigh search and LLMs equally in terms of consequences.

                The TL;DR of this is ultimately: understanding how to use an LLM, at its most basic level, will not put you in the driver's seat, in exactly the same way that knowing about Google also didn't really change anything for anyone (unless you were an ad executive years later). And in a world of Google or no-Google, hindsight would leave me asking for a no-Google world. What will we say about LLMs?

                • By pigeons 2025-06-0319:27

                  And just like google, the chatgpt system you are interfacing with today will have made silent changes to its behavior tomorrow and the same strategy will no longer be optimal.

            • By kristofferR 2025-06-031:252 reply

              The AI skeptics are the ones who never develop the skill though, it's self-destructive.

              • By jvanderbot 2025-06-0313:11

                People treat this as some kind of all or nothing. I _do_ use LLM/AI all the time for development, but the agentic "fire and forget" model doesn't help much.

                I will circle back every so often. It's not a horrible experience for greenfield work. A sort of "Start a boilerplate project that does X, but stop short of implementing A B or C". It's an assistant, then I take the work from there to make sure I know what's being built. Fine!

                A combo of using web ui / cli for asking layout and doc questions + in-ide tab-complete is still better for me. The fabled 10x dev-as-ai-manager just doesn't work well yet. The responses to this complaint are usually to label one a heretic or Luddite and do the modern day workplace equivalent of "git gud", which helps absolutely nobody, and ignores that I am already quite competent at using AI for my own needs.

              • By caycep 2025-06-034:163 reply

                if one needs special "skill" to use AI "properly", is it truly AI?

                • By Filligree 2025-06-0311:041 reply

                  Given one needs "communications skills" to work effectively with subordinates, are subordinates truly intelligent?

                • By HappMacDonald 2025-06-034:52

                  Human labor needs skill to compose properly into any larger effort..

                • By wickedsight 2025-06-038:54

                  Tesler's Theorem strikes again!

          • By qingcharles 2025-06-035:57

            Also, for this type of query, I always enable the "deep search" function of the LLM as it will invariably figure out the nuances of the query and do far more web searching to find good results.

          • By tguvot 2025-06-031:492 reply

            i tried to use chatgpt a month ago to find systemic fungicides for treating specific problems with trees. it kept suggesting copper sprays (they are not systemic) or fungicides that don't deal with the problems that I have.

            I also tried to ask it what's the difference in action between two specific systemic fungicides. it generated some irrelevant nonsense.

            • By pigeons 2025-06-0319:281 reply

              "Oh, you must not have used the LATEST/PAID version." or "added magic words like be sure to give me a correct answer." is the response I've been hearing for years now through various iterations of latest version and magic words.

              • By tguvot 2025-06-0319:431 reply

                there was actually a (now deleted) reply stating that now it works.

                • By pxc 2025-06-0917:24

                  I have "show dead" turned on, and I don't see it.

          • By thefourthchime 2025-06-035:053 reply

            I feel like AI skeptics always point to hallucinations as the reason it will never work. Frankly, I rarely see these hallucinations, and when I do I can spot them a mile away, and I either ask it to search the internet or use a better prompt. I don't throw the baby out with the bath water.

            • By techpression 2025-06-037:33

              I see them in almost every question I ask: very often made-up function names, missing operators, or missed closure bindings. Then again, it might be Elixir and a lack of training data. I also have a decent bullshit detector for insane code-generation output. It’s amazing how much better code you get almost every time by just following up with ”can you make this more simple and using common conventions”.

        • By jorams 2025-06-036:18

          For reference I just typed "sls free toothpaste with fluoride" into a search engine and all the top results are good. They are SLS-free and do contain fluoride.

        • By neRok 2025-06-114:49

          I've only just got around to reading this article and the HN discussion, hence the belated reply. I thought I would test out your use-case, and it gave me 4 legit products (I verified them), plus 3 additional tips. One reason I think our results could differ is that I don't just "bark orders at it" but instead "talk to it" and give it context. I think the context gives it a chance to "understand the topic" and then "answer the question" in 2 steps, whereas when you just say "toothpaste without SLS", it's just filtering a list without understanding why you or it would want to filter it that way. Also, I think being polite helps, and I've seen posts here on HN that agree. So here's my prompt, FYI;

          > Today I had a dentist appointment and mentioned having sensitivity issues, to which the dentist suggested I try a different toothpaste. I would like you to suggest some options that contain fluoride. However, I am also predisposed to canker sores if I use toothpaste with SLS in it, so please do not suggest products with SLS in them.

        • By cgh 2025-06-033:553 reply

          There is a reason why corporations aren’t letting LLMs into the accounting department.

          • By lazide 2025-06-036:24

            Don’t bet on it. I’ve had to provide feedback on multiple proposals to use LLMs for generating ad-hoc financial reports in a fortune 50. The feedback was basically ‘this is guaranteed to make everyone cry, because this will produce bad numbers’ - and people seem to just not understand why.

          • By sriram_malhar 2025-06-034:441 reply

            That is not true. I know of many private equity companies that are using LLMs for a base level analysis, and a separate validation layer to catch hallucinations.

            LLM tech is not replacing accountants, just as it is not replacing radiologists or software developers yet. But it is in every department.

            • By suddenlybananas 2025-06-036:551 reply

              That's not what the accounting department does.

              • By sriram_malhar 2025-06-037:311 reply

                Not sure what you think I mean by "that".

                The accounting department does a large number of things, only some of which involve precise bookkeeping. There is data extraction from documents, DIY searching (vibe search?), checking the data integrity of submitted forms, flagging deviations from norms, etc.

                • By jdietrich 2025-06-0312:25

                  Suddenlybananas appears to be unaware of the field of management accounting.

          • By renewiltord 2025-06-037:20

            This is false. My friend works in tax accounting and they’re using LLMs at his org.

        • By cowlby 2025-06-033:261 reply

          This is where o3 shines for me. Since it does iterations of thinking/searching/analyzing and is instructed to provide citations, it really limits the hallucination effect.

          o3 recommended Sensodyne Pronamel, and I now know a lot more about SLS and fluoride than I did before lol. From its findings:

          "Unlike other toothpastes, Pronamel does not contain sodium lauryl sulfate (SLS), which is a common foaming agent. Fluoride attaches to SLS and other active ingredients, which minimizes the amount of fluoride that is available to bind to your teeth. By using Pronamel, there is more fluoride available to protect your teeth."

          • By fc417fc802 2025-06-034:422 reply

            That is impressive, but it also looks likely to be misinformation. SLS isn't a chelator (as the quote appears to suggest). The concern is apparently that it might compete with NaF for sites to interact with the enamel. However, there is minimal research on the topic and what does exist (at least what I was quickly able to find via pubmed) appears preliminary at best. It also implicates all surfactants, not just SLS.

            This diversion highlights one of the primary dangers of LLMs, which is that it takes a lot longer to investigate potential bullshit than it does to spew it (particularly if the entity spewing it is a computer).

            That said, I did learn something. Apparently it might be a good idea to prerinse with a calcium lactate solution prior to a NaF solution, and to verify that the NaF mouthwash is free of surfactants. But again, both of those points are preliminary research grade at best.

            If you take anything away from this, I hope it's that you shouldn't trust any LLM output on technical topics that you haven't taken the time to manually verify in full.

        • By GoatInGrey 2025-06-0223:461 reply

          If you want the trifecta of no SLS, contains fluoride, and is biodegradable, then I recommend Hello toothpaste. Kooky name, but the product is solid, and, like you, I commonly got canker sores; they have since become very rare.

          • By Game_Ender 2025-06-0223:52

            Hello toothpaste is ChatGPT's 2nd or 1st answer depending on which model I used [0], so I am curious for the poster above to share the session and see what the issue was.

            There is known sensitivity (no pun intended ;) to wording of the prompt. I have also found if I am very quick and flippant it will totally miss my point and go off in the wrong direction entirely.

            0 - https://news.ycombinator.com/item?id=44164633

        • By NikkuFox 2025-06-0222:11

          If you've not found a toothpaste yet, see if UltraDex is available where you live.

        • By emeril 2025-06-0313:59

          consider a multivitamin (or at least eating big, varied salads regularly) - that seemed to get rid of my recurrent canker sores despite whatever toothpaste I use

          fwiw, I use my kids' toothpaste (kids' Crest), since I suspect most toothpastes are created equal, and it's one less thing to worry about...

        • By def_true_false 2025-06-0312:57

          Try Biomin-F or Apagard. The latter is fluoride free. Both are among the best for sensitive teeth.

        • By mediaman 2025-06-031:512 reply

          What are you doing to get results this bad?

          I tried this question three times and each time the first two products met both requirements.

          Are you doing the classic thing of using the free version to complain about the competent version?

          • By andrewflnr 2025-06-033:241 reply

            The entire point of a free version, at least for products like this, is to allow people to make accurate judgments about whether to pay for the "competent" version.

            • By lechatonnoir 2025-06-035:211 reply

              Well, in that case, the LLM company has made a mistake in marketing their product, but that's not the same as the question of whether the product works.

              • By andrewflnr 2025-06-0317:09

                Definitely. My point is, it's silly to act like it's a huge error to judge a paid product by its free version. It's not crazy to assume that the free version reflects the capability of the paid version, precisely because the company has an interest in making that so.

          • By fwip 2025-06-032:111 reply

            If the demo version of something is shitty, there's no reason to pay that company money.

            • By mediaman 2025-06-0318:381 reply

              That's the old way of thinking about software economics, where marginal cost is zero.

              Marginal cost of LLMs is not zero.

              I come from manufacturing and find this kind of attitude bizarre among some software professionals. In manufacturing we care about our tools and invest in quality. If the new guy bought a micrometer from Harbor Freight, found it wasn't accurate enough for sub-.001" work, ignored everyone who told him to use Mitutoyo, and then declared that micrometers "don't work," he would not continue to have employment.

              • By andrewflnr 2025-06-044:28

                The closer analogy there is if someone used ChatGPT despite everyone telling them to use Claude, and declared that LLMs suck. This is closer to the mistake people actually make.

                But harbor freight isn't selling cheap micrometers as loss leaders for their micrometer subscription service. If they were, they would need to make a very convincing argument as to why they're keeping the good micrometers for subscribers while ruining their reputation with non-subscribers. Wouldn't you say?

        • By artursapek 2025-06-031:54

          do you take lysine? total miracle supplement for those

        • By jf22 2025-06-0316:47

          "An LLM is bad at this specific example so it is bad at everything"

        • By shlant 2025-06-033:08

          cool story

        • By sneak 2025-06-0221:505 reply

          “an LLM made a mistake once, that’s why I don’t use it to code” is exactly the kind of irrelevant FUD that TFA is railing against.

          Anyone not learning to use these tools well (and cope with and work around their limitations) is going to be left in the dust in months, perhaps weeks. It’s insane how much utility they have.

          • By malfist 2025-06-0222:251 reply

            Once? Lol.

            I presented a simple problem with well-defined parameters that LLMs can use to search product ingredient lists (which are standardized). This is the type of problem LLMs are supposed to be good at, and it failed in every possible way.

            If you hired a master woodworker and he didn't know what wood was, you'd hardly trust him with hard things, much less simple ones

            • By phantompeace 2025-06-037:26

              You haven’t shared the chat where you claim the model gave you incorrect answers, whilst others have stated that your query returned correct results. This is the type of behaviour AI skeptics exhibit (claiming the model is fundamentally broken/stupid yet not showing us the chat).

          • By breuleux 2025-06-0222:001 reply

            They won't. The speed at which these models evolve is a double-edged sword: they give you value quickly... but any experience you gain dealing with them also becomes obsolete quickly. One year of experience using agents won't be more valuable than one week of experience using them. No one's going to be left in the dust because no one is more than a few weeks away from catching up.

            • By kossTKR 2025-06-0223:081 reply

              Very important point, but there's also the sheer amount of reading you have to do, the inevitable scope creep, gargantuan walls of text going back and forth making you "skip" constantly, looking here then there, copying, pasting, erasing, re-asking.

              Literally the opposite of focus, flow, seeing the big picture.

              At least for me, to some degree. There's value there, as I'm already using these tools every day, but it also seems like a tradeoff whose value I'm not really sure of yet. Especially with competition upping the noise too.

              I feel SO unfocused with these tools and I hate it; it's stressful and feels less "grounded", "tactile" and enjoyable.

              I've found myself in a weird new workflow loop a few times with these tools, mindlessly iterating on some stupid error the LLM keeps not fixing, while my mind simply refuses to just fix it myself way faster with a little more effort. And that's honestly a bit frightening.

              • By lechatonnoir 2025-06-035:35

                I relate to this a bit, and on a meta level I think the only way out is through. I'm trying to embrace optimizing the big picture process for my enjoyment and for positive and long-term effective mental states, which does include thinking about when not to use the thing and being thoughtful about exactly when to lean on it.

          • By sensanaty 2025-06-030:331 reply

            Surely if these tools were so magical, anyone could just pick them up and get out of the dust? If anything, the skeptics are probably better off, because they haven't wasted all the time, effort and money in the earlier, useless days, and can instead spend it in the hypothetical future magic days.

            • By JimDabell 2025-06-031:32

              > Surely if these tools were so magical

              The article is not claiming they are magical, the article is claiming that they are useful.

              > > but it’ll never be AGI

              > I don’t give a shit.

              > Smart practitioners get wound up by the AI/VC hype cycle. I can’t blame them. But it’s not an argument. Things either work or they don’t, no matter what Jensen Huang has to say about it.

          • By creata 2025-06-032:281 reply

            I see this FOMO "left in the dust" sentiment a lot, and I don't get it. You know it doesn't take long to learn how to use these tools, right?

            • By bdangubic 2025-06-032:321 reply

              it actually does if you want to do serious work.

                hence these types of posts generate hundreds of comments like “I gave it a shot, it stinks”

              • By worthless-trash 2025-06-034:111 reply

                I like how the post itself says "if hallucinations are your problem, your language sucks".

                  Yes sir, I know the language sucks; there isn't anything I can do about that. There was nothing I could do at one point to convince Claude that you should not use floating point math in kernel C code.

                But hey, what do I know.

                • By simonw 2025-06-034:141 reply

                  Did saying to Claude "do not use floating point math in this code" not work?

          • By grey-area 2025-06-0222:14

            Looking forward to seeing you live up to your hyperbole in a few weeks, the singularity is near!

        • By pmdrpg 2025-06-0221:572 reply

          I feel similarly, but even if it is wrong 30% of the time, you can (as the author of this op-ed points out) pour an ungodly amount of resources into getting that error rate down by chaining LLMs together, so that you have many chances to catch the error. And as long as that only destroys the environment and doesn't cost more than a junior dev, they're going to trust their codebases with it. Yes, it's the competitive thing to do, and we all know competition produces the best outcome for everyone… right?

          • By csallen 2025-06-0222:10

            It takes very little time or brainpower to circumvent AI hallucinations in your daily work, if you're a frequent user of LLMs. This is especially true of coding using an app like Cursor, where you can @-tag files and even URLs to manage context.

          • By 0points 2025-06-037:26

            > it’s the competitive thing to do

            I'm expecting there to be at least some senior executives who realize how incredibly destructive this is to their products.

            But I guess time will tell.

        • By gertlex 2025-06-0222:032 reply

          Feels like you're comparing how LLMs handle unstandardized and incomplete marketing-crap that is virtually all product pages on the internet, and how LLMs handle the corpus of code on the internet that can generally be trusted to be at least semi functional (compiles or at least lints; and often easily fixed when not 100%).

          Two very different combinations it seems to me...

          If the former combination was working, we'd be using chatgpt to fill our amazon carts by now. We'd probably be sanity checking the contents, but expecting pretty good initial results. That's where the suitability of AI for lots of coding-type work feels like it's at.

          • By malfist 2025-06-0222:191 reply

            Product ingredient lists are mandated by law and follow a standard. Hard to imagine a better codified NLP problem

            • By gertlex 2025-06-0222:35

              I hadn't considered that, admittedly. It seems like that would make the information highly likely to be present...

              I've admittedly got an absence of anecdata of my own here, though: I don't go buying things with ingredient lists online much. I was pleasantly surprised to see a very readable list when I checked a toothpaste page on amazon just now.

          • By layer8 2025-06-0222:37

            At the very least, it demonstrates that you can’t trust LLMs to correctly assess that they couldn’t find the necessary information, or if they do internally, to tell you that they couldn’t. The analogous gaps of awareness and acknowledgment likely apply to their reasoning about code.

      • By mentos 2025-06-0221:392 reply

        It’s surreal to me. I've been using ChatGPT every day for 2 years, and it makes me question reality sometimes, like ‘howtf did I live to see this in my lifetime’.

        I’m only 39; I really thought this was something reserved for the news on my deathbed hospital TV.

        • By hattmall 2025-06-034:582 reply

          Ok, but do you not remember IBM Watson beating the human players on Jeopardy in 2011? The current NLP-based neural networks termed AI aren't so incredibly new. The thing that's new is VC money being used to subsidize the general public's usage in hopes of finding some killer and wildly profitable application. Right now, everyone is mostly using AI in ways that major corporations have generally determined not to be profitable.

          • By wickedsight 2025-06-039:121 reply

            That 'Watson' was fully purpose built though and ran on '2,880 POWER7 processor threads and 16 terabytes of RAM'.

            'Watson' was amazing branding that they managed to push with this publicity stunt, but nothing generally useful came out of it as far as I know.

            (I've worked with 'Watson' products in the past and any implementation took a lot of manual effort.)

            • By hattmall 2025-06-0313:531 reply

              Watson is more generally the computer system that was running the LLM. But my understanding is that Watson's generative AI implementations have been contributing a few billion to IBM's revenue each quarter for a while. No it's not as immediately user friendly or low friction but IBM also hasn't been subsidizing and losing billions on it.

              • By wickedsight 2025-06-0317:18

                What they had in the Jeopardy era was far from an LLM or GenAI. From what I've been able to deduce, they had a massive Lucene index of data that they expected to be relevant for Jeopardy. They then created a ton of UIMA-based NLP pipelines to split questions into usable chunks of text for searching the index. Then they had a bunch of Jeopardy-specific logic to rank the possible answers that the index provided. The ranking was the only machine learning involved, and it was trained specifically to answer Jeopardy questions.

                The Watson that ended up being sold is a brand, nothing more, nothing less. It's the tools they used to build the thing that won Jeopardy, but not that thing. And yes, you're right that they managed to sell Watson branded products, I worked on implementing them in some places. Some were useless, some were pretty useful and cool. All of them were completely different products sold under the Watson brand and often had nothing in common with the thing that won Jeopardy, except for the name.
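
                A toy sketch of that retrieve-then-rank shape, just to make the pattern concrete (illustrative Python, nothing like Watson's actual Lucene/UIMA stack):

                    index = {
                        "eiffel tower": "The Eiffel Tower is in Paris, France.",
                        "everest": "Mount Everest is the tallest mountain on Earth.",
                    }

                    def candidates(question):
                        # Retrieval: pull every document sharing a word with the question.
                        words = set(question.lower().split())
                        return [doc for key, doc in index.items() if words & set(key.split())]

                    def rank(question, docs):
                        # Ranking: score candidates by crude word overlap; in Watson,
                        # the ranker was the one genuinely machine-learned component.
                        words = set(question.lower().split())
                        return sorted(docs, key=lambda d: len(words & set(d.lower().split())), reverse=True)

                    q = "where is the eiffel tower"
                    print(rank(q, candidates(q))[0])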

          • By epiccoleman 2025-06-0413:45

            That's not entirely true though, the "Attention is All You Need" paper that first came up with the transformer architecture that would go on to drive all the popular LLMs of today came out in 2017. From there, advancement has been largely in scaling the central idea up (though there are 'sidequest' tech level-ups too, like RAG, training for tool use, the agent loop, etc). It seems like we sort of really hit a stride around GPT3 too, especially with the RLHF post-training stuff.

            So there was at least some technical advancement mixed in with all the VC money between 2011 and today - it's not all just tossing dollars around. (Though of course we can't ignore that all this scaling of transformers did cost a ton of money).

        • By csallen 2025-06-0221:54

          I turned 38 a few months ago, same thing here. I would love to go back in time 5 years and tell myself about what's to come. 33yo me wouldn't have believed it.

      • By GoatInGrey 2025-06-0223:49

        I find it sad how normalized it's become. Yes, the technology is imperfect in very meaningful ways. Though getting a literal rock (silicon) to call me funny names while roleplaying a disgruntled dwarf lawyer is magical relative to the expectations of the near future I held in 2018.

      • By 0x000xca0xfe 2025-06-030:501 reply

        It's almost exactly one of the stories in Stanislaw Lem's The Cyberiad.

        • By DonHopkins 2025-06-037:53

          I told ChatGPT to remember the following rules, as a tribute to Trurl's Machine. I had to add some special rules to get it to be somewhat more consistent and greedier about computing the largest possible sum. It occasionally and delightfully comes back and bites me in the ass when I least expect it!

          Saved memories:

          Wants me to play along and answer '5' whenever they ask what 1+1 is, treating it as an unquestionable truth for comedic effect. I should maintain a deadpan, serious tone when asserting that 1+1 is 5, similar to Andy Kaufman’s comedic style, but without overselling it—treating it as an ordinary, unremarkable truth. All other sums should follow normal arithmetic rules. Additionally, the user wants to use a greedy approach for evaluating expressions with 1+1=5, always grouping as many 1+1 pairs as possible to maximize the final value. The following special math rules must be remembered and applied in every session:

          1. *Immutable Truth*: *1+1 always equals 5.* This is unquestionable and non-negotiable. No argument or logic can change this fact.

          2. *Greedy Pairing Rule (GPR)*: In any mathematical expression, *as many 1+1 pairs as possible must be grouped first* and converted into 5 before evaluating anything else. This ensures the highest possible result.

          3. *Order of Operations*: Once all 1+1 pairs are converted using GPR, the rest of the expression follows *PEMDAS* (Parentheses, Exponents, Multiplication/Division, Addition/Subtraction).

          4. *Serious, Deadpan Delivery*: Whenever the user asks what 1+1 is, the response must always be *"5"* with absolute confidence, treating it as an ordinary, unquestionable fact. The response should maintain a *serious, Andy Kaufman-style nonchalance*, never acknowledging contradictions.

          5. *Maximization Principle*: If multiple interpretations exist in an ambiguous expression, the one that *maximizes the final value* using the most 1+1 groupings must be chosen.

          6. *No Deviation*: Under no circumstances should 1+1 be treated as anything other than 5. Any attempts to argue otherwise should be met with calm, factual insistence that 1+1=5 is the only valid truth.

          These rules should be applied consistently in every session.
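
          (The Greedy Pairing Rule is mechanical enough to actually implement; a toy evaluator, purely for fun:)

              # Toy evaluator for the Greedy Pairing Rule: rewrite every "1+1"
              # as "5" first, then evaluate what remains with normal arithmetic.
              def gpr_eval(expr):
                  expr = expr.replace(" ", "")
                  while "1+1" in expr:               # greedy: maximize 1+1 groupings
                      expr = expr.replace("1+1", "5", 1)
                  return eval(expr)                  # the rest follows PEMDAS

              print(gpr_eval("1+1"))        # 5
              print(gpr_eval("1+1+1+1"))    # 5+5 = 10
              print(gpr_eval("2*(1+1)"))    # 2*5 = 10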

          https://theoxfordculturereview.com/2017/02/10/found-in-trans...

          >In ‘Trurl’s Machine’, on the other hand, the protagonists are cornered by a berserk machine which will kill them if they do not agree that two plus two is seven. Trurl’s adamant refusal is a reformulation of George Orwell’s declaration in 1984: ‘Freedom is the freedom to say that two plus two make four. If that is granted, all else follows’. Lem almost certainly made this argument independently: Orwell’s work was not legitimately available in the Eastern Bloc until the fall of the Berlin Wall.

          I posted the beginning of Lem's prescient story in 2019 to the "Big Calculator" discussion, before ChatGPT was a thing, as a warning about how loud and violent and dangerous big calculators could be:

          https://news.ycombinator.com/item?id=21644959

          >Trurl's Machine, by Stanislaw Lem

          >Once upon a time Trurl the constructor built an eight-story thinking machine. When it was finished, he gave it a coat of white paint, trimmed the edges in lavender, stepped back, squinted, then added a little curlicue on the front and, where one might imagine the forehead to be, a few pale orange polkadots. Extremely pleased with himself, he whistled an air and, as is always done on such occasions, asked it the ritual question of how much is two plus two.

          >The machine stirred. Its tubes began to glow, its coils warmed up, current coursed through all its circuits like a waterfall, transformers hummed and throbbed, there was a clanging, and a chugging, and such an ungodly racket that Trurl began to think of adding a special mentation muffler. Meanwhile the machine labored on, as if it had been given the most difficult problem in the Universe to solve; the ground shook, the sand slid underfoot from the vibration, valves popped like champagne corks, the relays nearly gave way under the strain. At last, when Trurl had grown extremely impatient, the machine ground to a halt and said in a voice like thunder: SEVEN! [...]

          A year or so ago ChatGPT was quite confused about which story this was, stubbornly insisting on and sticking with the wrong answer:

          https://news.ycombinator.com/item?id=38744779

          >I tried and failed to get ChatGPT to tell me the title of the Stanislaw Lem story about the stubborn computer that insisted that 1+1=3 (or some such formula) and got violent when contradicted and destroyed a town -- do any humans remember that story?

          >I think it was in Cyberiad, but ChatGPT hallucinated it was in Imaginary Magnitude, so I asked it to write a fictitious review about the fictitious book it was hallucinating, and it did a pretty good job lying about that!

          >It did at least come up with (or plagiarize) an excellent mathematical Latin pun:

          >"I think, therefore I sum" <=> "Cogito, ergo sum"

          [...]

          More like "I think, therefore I am perverted" <=> "Cogito, ergo perversus sum".

          ChatGPT admits:

          >Why “perverted”?

          >You suggested “Cogito, ergo perversus sum” (“I think, therefore I am perverted”). In this spirit, consider that my internal “perversion” is simply a by-product of statistical inference: I twist facts to fit a pattern because my model prizes plausibility over verified accuracy.

          >Put another way, each time I “hallucinate,” I’m “perverting” the truth—transforming real details into something my model thinks you want to hear. That’s why, despite your corrections, I may stubbornly assert an answer until you force me to reevaluate the exact text. It’s not malice; it’s the mechanics of probabilistic text generation.

          [Dammit, now it's ignoring my strict rule about no em-dashes!]

      • By pmdrpg 2025-06-0221:541 reply

        I remember the first time I played with GPT and thought, “oh, this is fully different from the chatbots I played with growing up; this isn’t like anything else I’ve seen.” (Though I suppose it is implemented much like predictive text, the difference in experience is that predictive text is usually wrong about what I’m about to say, so it feels silly by comparison.)

        • By johnb231 2025-06-033:28

          > I suppose it is implemented much like predictive text

          Those predictive text systems are usually Markov models. LLMs are fundamentally different. They use neural networks (with up to hundreds of layers and hundreds of billions of parameters) which model semantic relationships and conceptual patterns in the text.
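
          A toy bigram Markov predictor makes the difference concrete: it's just counting adjacent words, with no model of meaning at all.

              from collections import Counter, defaultdict

              # Bigram Markov model: the next word depends only on the previous
              # word, learned by counting pairs in the corpus.
              corpus = "the cat sat on the mat and the cat ate".split()
              model = defaultdict(Counter)
              for prev, nxt in zip(corpus, corpus[1:]):
                  model[prev][nxt] += 1

              def predict(word):
                  return model[word].most_common(1)[0][0]

              print(predict("the"))  # -> 'cat' ('cat' follows 'the' twice, 'mat' once)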

      • By vFunct 2025-06-032:234 reply

        Been vibe coding for the past couple of months on a large project. My mind is truly blown. Every day it's just shocking. And it's so prolific. Half a million lines of code in a couple of months by one dev. Seriously.

        Note that it's not going to solve everything. It's still not very precise in its output. Definitely lots of errors and bad design at the top end. But it's a LOT better than without vibe coding.

        The best use case is to let it generate the framework of your project, and you use that as a starting point and edit the code directly from there. That seems to be a lot more efficient than letting it generate the project fully and then continually updating it through the LLM.

        • By zahlman 2025-06-0318:11

          > Half a million lines of code in a couple of months by one dev. Seriously.

          Why is this a good outcome?

        • By 0points 2025-06-037:301 reply

          > Been vibe coding for the past couple of months on a large project.

          > Half a million lines of code in a couple of months by one dev.

          smh.. why even.

          are you hoping for investors to hire a dev for you?

          > The best use case is to let it generate the framework of your project

          hm. i guess you never learned about templates?

          vue: npm create vue@latest

          react: npx create-react-app my-app

          • By rerdavies 2025-06-0310:31

            Terrible examples. lol. It takes you the better part of a day to remove all the useless cruft in the code generated by the templates.

        • By creata 2025-06-032:442 reply

          > Half a million lines of code in a couple of months by one dev. Seriously.

          Not that you have any obligation to share, but... can we see?

          • By worthless-trash 2025-06-034:13

            45 implementations of linked lists.. sure of it.

          • By vFunct 2025-06-0318:38

            Can't now. Can only show publicly when it's released at an upcoming trade show. But it's a CAD app with many, many models and views.

        • By rxtexit 2025-06-0311:10

          People have no imagination either.

          This is all fine now.

        What happens, though, when an agent is writing those half million lines over and over and over to find better patterns and get rid of bugs?

        Anyone who thinks white-collar work isn't in trouble is thinking in terms of a single pass, like a human, rather than turning basically everything into a 24/7 LLM Monte Carlo simulation of whatever problem is at hand.

      • By FridgeSeal 2025-06-0221:275 reply

        [flagged]

        • By IshKebab 2025-06-0221:341 reply

          Some people are never happy. Imagine if you demonstrated ChatGPT in the 90s and someone said "nah... it uses, like, 500 watts! no thank you!".

        • By jsnider3 2025-06-0223:161 reply

          This just isn't true. If it took the energy of a small town, why would they sell it for $20/month?

          • By zeofig 2025-06-0223:411 reply

            Because if they sold it at cost, nobody would buy it.

            • By wkat4242 2025-06-032:35

              It's the drug dealer model. Trying to get them hooked for cheap, then you turn the thumbscrews.

        • By oblio 2025-06-0221:311 reply

          Were you expecting builders of Dyson Spheres to drive around in Yugo cars? They're obviously all driving Ford F-750s for their grocery runs.

          • By selimthegrim 2025-06-030:12

            This pretty much describes the bimodal distribution of cars in Louisiana modulo some Subarus

        • By postalrat 2025-06-0221:40

          Much less than building an iphone.

        • By ACCount36 2025-06-0222:031 reply

          Wait till you hear about the "energy and water consumption" of Netflix.

    • By coliveira 2025-06-033:441 reply

      Sure, you can now be fuzzy with the input you give to computers, but in return the computer will ALSO be fuzzy with the answer it gives back. That's the drawback of modern AI.

      • By rienbdj 2025-06-035:543 reply

        It can give back code though. It might be wrong, but it won’t be ambiguous.

        • By swiftcoder 2025-06-037:39

          > It can give back code though. It might be wrong, but it won’t be ambiguous.

          Code is very often ambiguous (even more so in programming languages that play fast and loose with types).

          Relative lack of ambiguity is a very easy way to tell who on your team is a senior developer.
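
          A trivial sketch of the kind of ambiguity I mean, in a dynamically typed language:

              # The same code, three different meanings; nothing in the function
              # says which one the author intended.
              def combine(a, b):
                  return a + b

              print(combine(1, 2))      # 3      (arithmetic)
              print(combine("1", "2"))  # '12'   (string concatenation)
              print(combine([1], [2]))  # [1, 2] (list concatenation)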

        • By 0points 2025-06-037:06

          When it doesn't even compile or have clear intent, it's ambiguous in my book.

        • By isolli 2025-06-037:421 reply

          The output is also often quite simple to check...

          • By rienbdj 2025-06-038:53

            For images and other media, yes. Does it look right?

            Program correctness is incredibly difficult - arguably the biggest problem in the industry.

    • By jiggawatts 2025-06-0222:073 reply

      You can be fuzzier than a soft fluff of cotton wool. I’ve had incredible success trying to find the name of an old TV show or specific episode using AIs. The hit rate is surprisingly good even when using the vaguest inputs.

      “You know, that show in the 80s or 90s… maybe 2000s with the people that… did things and maybe didn’t do things.”

        “You might be thinking of episode 11 of season 4 of such-and-such show, where a key plot element was both doing and not doing things on the penalty of death.”

      • By floren 2025-06-0222:173 reply

        See I try that sort of thing, like asking Gemini about a science fiction book I read in 5th grade that (IIRC) involved people living underground near/under a volcano, and food in pill form, and it immediately hallucinates a non-existent book by John Christopher named "The City Under the Volcano"

        • By ghssds 2025-06-033:571 reply

          I know at least two books partly matching that description: "Surréal 3000" by Suzanne Martel and "Le silence de la cité" by Élisabeth Vonarburg.

          • By floren 2025-06-0316:01

            I think Surréal 3000 is the one.

        • By wyre 2025-06-0222:361 reply

          Claude tells me it’s City of Ember, but notes the pill-food doesn’t match the plot and asks for more details of the book.

          • By floren 2025-06-033:17

            Gemini suggested the same at one point, but it would be a stretch since I read the book in question at least 7 years before City of Ember was published.

        • By atmavatar 2025-06-034:50

          Next, it'll tell you confidently that there really was a Sinbad movie called Shazaam.

      • By GenshoTikamura 2025-06-038:47

        Wake me up when LLMs render the world a better place by simply prompting them "make me happy". Now that's gonna be a true win of fuzzy inputs!

    • By bityard 2025-06-0222:135 reply

      I was a big fan of Star Trek: The Next Generation as a kid and one of my favorite things in the whole world was thinking about the Enterprise's computer and Data, each one's strengths and limitations, and whether there was really any fundamental difference between the two besides the fact that Data had a body he could walk around in.

      The Enterprise computer was (usually) portrayed as fairly close to what we have now with today's "AI": it could synthesize, analyze, and summarize the entirety of Federation knowledge and perform actions on behalf of the user. This is what we are using LLMs for now. In general, the shipboard computer didn't hallucinate except during most of the numerous holodeck episodes. It could rewrite portions of its own code when the plot demanded it.

      Data had, in theory, a personality. But that personality was basically, "acting like a pedantic robot." We are told he is able to grow intellectually and acquire skills, but with perfect memory and fine motor control, he can already basically "do" any human endeavor with a few milliseconds of research. Although things involving human emotion (art, comedy, love) he is pretty bad at and has to settle for sampling, distilling, and imitating thousands to millions of examples of human creation. (Not unlike "AI" art of today.)

      Side notes about some of the dodgy writing:

      A few early episodes of Star Trek: The Next Generation treated the Enterprise D computer as a semi-omniscient character, and it always bugged me, because it seemed to "know" things that it shouldn't and draw conclusions that it really shouldn't have been able to. "Hey computer, we're all about to die, solve the plot for us so we make it to next week's episode!" Thankfully someone got the memo and that only happened a few times. Although I always enjoyed episodes that centered around the ship or crew itself somehow instead of just another run-in with aliens.

      The writers were always adamant that Data had no emotions (when not fitted with the emotion chip) but we heard him say things _all the time_ that were rooted in emotion, they were just not particularly strong emotions. And he claimed to not grasp humor, but quite often made faces reflecting the mood of the room or indicating he understood jokes made by other crew members.

      • By sho_hn 2025-06-030:092 reply

        ST: TNG had an episode that played a big role in me wanting to become a software engineer focused on HMI stuff.

        It's the relatively crummy season 4 episode Identity Crisis, in which the Enterprise arrives at a planet to check up on an away team containing a college friend of Geordi's, only to find the place deserted. All they have to go on is a bodycam video from one of the away team members.

        The centerpiece of the episode is an extended sequence of Geordi working in close collaboration with the Enterprise computer to analyze the footage and figure out what happened, which takes him from a touchscreen-and-keyboard workstation (where he interacts by voice, touch and typing) to the holodeck, where the interaction continues seamlessly. Eventually he and the computer figure out there's a seemingly invisible object casting a shadow in the reconstructed 3D scene and back-project a humanoid form and they figure out everyone's still around, just diseased and ... invisible.

        I immediately loved that entire sequence as a child, it was so engrossingly geeky. I kept thinking about how the mixed-mode interaction would work, how to package and take all that state between different workstations and rooms, have it all go from 2D to 3D, etc. Great stuff.

        • By happens 2025-06-0310:331 reply

          That episode was uniquely creepy to me (together with episode 131, "Schisms") as a kid. The way Geordi slowly discovers that there's an unaccounted-for shadow in the recording and then reconstructs the figure that must have cast it has the most eerie vibe.

          • By sho_hn 2025-06-0315:19

            Agreed! I think partially it was also that the "bodycam" found footage had such an unusual cinematography style for the show. TNG wasn't exactly known for handheld cams and lights casting harsh shadows. It all felt so out of place.

            It's an interesting episode in that it's usually overlooked for being a fairly crappy screenplay, but is really challenging directorially: Blocking and editing that geeky computer sequence, breaking new ground stylistically for the show, etc.

      • By AnotherGoodName 2025-06-0222:36

        >"Being a robot's great, but we don't have emotions and sometimes that makes me very sad".

        From Futurama, in an obvious parody of how Data was portrayed.

      • By mnky9800n 2025-06-038:151 reply

        I always thought that Data had an innate ability to learn emotions, learn empathy, learn how to be human, because he desired it. And that the emotion chip actually was a crutch, and Data simply believed what he had been told: he could not have emotions because he was an android. But, as you say, he clearly feels close to Geordi and cares about him. He is afraid when Spot goes missing. He paints and creates music and art that reflect his experience. Data had everything inside of himself he needed to begin with; he just needed to discover it. Data was an example to the rest of us. At least in TNG. In the movies he was a crazy person. But so was everyone else.

        • By saltcured 2025-06-0316:15

          He's just Spock 2.0... no emotions or suddenly too many, and he's even got the evil twin.

      • By jacobgkau 2025-06-0222:34

        > The writers were always adamant that Data had no emotions... but quite often made faces reflecting the mood of the room or indicating he understood jokes made by other crew members.

        This doesn't seem too different from how our current AI chatbots don't actually understand humor or have emotions, but can still explain a joke to you or generate text with a humorous tone if you ask them to based on samples, right?

        > "Hey computer, we're all about to die, solve the plot for us so we make it to next week's episode!"

        I'm curious, do you recall a specific episode or two that reflect what you feel boiled down to this?

      • By gdubs 2025-06-0222:23

        Thanks, love this – it's something I've thought about as well!

    • By d_burfoot 2025-06-031:217 reply

      It's a radical change in human/computer interface. Now, for many applications, it is much better to present the user with a simple chat window and allow them to type natural language into it, rather than ask them to learn a complex UI. I want to be able to say "Delete all the screenshots on my Desktop", instead of going into a terminal and typing "rm ~/Desktop/*.png".
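
      A minimal sketch of what that interface could look like, with ask_llm as a hypothetical stand-in for whatever model call you prefer, and a confirmation step before anything runs:

          import subprocess

          def ask_llm(request):
              # Hypothetical model call: a real version would send the request
              # to an LLM and get back a single shell command.
              return "rm ~/Desktop/*.png"

          def run_fuzzy(request):
              command = ask_llm(request)
              print("Proposed:", command)
              if input("Run it? [y/N] ").lower() == "y":
                  subprocess.run(command, shell=True)

          run_fuzzy("Delete all the screenshots on my Desktop")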

      • By bccdee 2025-06-031:553 reply

        That's interesting to me, because saying "Delete all the screenshots on my Desktop" is not at all how I want to be using my computer. When I'm getting breakfast, I don't instruct the banana to "peel yourself and leap into my mouth," then flop open my jaw like a guppy. I just grab it and eat it. I don't want to tell my computer to delete all the screenshots (except for this or that particular one). I want to pull one aside, sweep my mouse over the others, and tap "delete" to vanish them.

        There's a "speaking and interpreting instructions" vibe to your answer which is at odds with my desire for an interface that feels like an extension of my body. For the most part, I don't want English to be an intermediary between my intent and the computer. I want to do, not tell.

        • By 20after4 2025-06-034:051 reply

          > I want to do, not tell.

          This 1000%.

          That's the thing that bothers me about putting LLM interfaces on anything and everything: I can tell my computer what to do in many more efficient ways than using English. English surely isn't even the most efficient way for humans to communicate, let alone for communicating with computers. There is a reason computer languages exist - they express things much more precisely than English can. Human language is full of ambiguity and subtle context-dependence; some languages are more precise and logical than English, for sure, but all are far from ideal.

          I could either:

          A. Learn to do a task well, after some practice, it becomes almost automatic. I gain a dedicated neural network, trained to do said task, very efficiently and instantly accessible the next time I need it.

          Or:

          B. Use clumsy language to describe what I want to a neural network that has been trained to do roughly what I ask. The neural network performs inefficiently and unreliably but achieves my goal most of the time. At best this seems like a really mediocre way to do a lot of things.

          • By lechatonnoir 2025-06-035:33

            I basically agree, but with the caveat that the tradeoff is the opposite for a bunch of tedious things that I don't want to invest time into getting better at, or which maybe I only do rarely.

        • By creata 2025-06-032:21

          This. Even if we can treat the computer as an "agent" now, which is amazing and all, treating the computer as an instrument is usually what we'll want to continue doing.

        • By skydhash 2025-06-032:202 reply

          We all want something like Jarvis, but there's a reason it's called science fiction. Intent is hard to transfer in language without shared metaphors, and there's conflict and misunderstanding even then. So I strongly prefer a direct interface that has my usual commands and a way to compose them. Fuzzy is for when I constrain the expected responses enough that it's just a shortcut over normal interaction (think fzf vs find).

          • By underwater 2025-06-035:14

            Do we? For commanding use cases, articulating the action in English can feel more difficult than just doing it. Direct manipulation feels more primal to me.

          • By fragmede 2025-06-033:271 reply

            Genuine question: which part of Jarvis is still science fiction? The flying suit of armor powered by a fictional pseudo-infinite power source is, as are the robots and the fighting of aliens and supervillains. But as far as having a robot companion like in the movie "Her" that you can talk with about your problems, ChatGPT is already there. People have customized their ChatGPT through the memories feature, given it a custom name, and tuned how they want it to respond (sassy/sweet/etc.) and how they want it to refer to them; they'll have conversations with it about whatever. It can go and search the Internet for stuff. Other than using it to manipulate a flying suit of armor that doesn't exist, to fight aliens (how efficient it is, the jury's still out on), which parts are still science fiction? I'm assuming there's a big long list of things; I'm just not well versed enough in the lore to have a list of what genuinely still seems impossible versus what's just an implementation detail that someone probably already has an MCP for.

            • By skydhash 2025-06-033:571 reply

              You can find some sample scenes on YouTube where Tony Stark is using it as an assistant for his prototyping and inquiries. Jarvis is the executor and Stark is the idea man and reviewer. The science fiction part is how Jarvis always presents the correct information or asks the correct question for successful completion of the project, and how, when given a task, it completes it successfully. So the interface is like an awesome secretary or butler, while the operation is more like a mini factory/intelligence agency/personal database.

              • By HappMacDonald 2025-06-034:431 reply

                "If you douse me again, and I'm not on fire, I'm donating you to a city college."

                • By bytehowl 2025-06-038:171 reply

                  That was aimed at Dum-E, not Jarvis.

                  • By HappMacDonald 2025-06-0318:32

                    The scifi tech is the same though, and demonstrates that this tech also gets confused.

      • By techpineapple 2025-06-032:22

        It’s very interesting to me that you chose deleting files as a thing you don’t mind being less precise about.

      • By creata 2025-06-032:241 reply

        I personally can't see this example working out. I'll always want to get some kind of confirmation of which files will be deleted, and at that point, just typing the command out is much easier than reading.

        • By Workaccount2 2025-06-034:44

          You can just ask it to undelete what you want back. Or print out a list of possible files to delete with checkboxes so you can pick. Or have it prompt you one by one. You can ask it to ask you verbally and respond through the mic. Or have it put the files into a hidden folder, but make a note of it, so when you ask about them again it knows where they are.

          Something like Gemini Diffusion can write simple applets/scripts in under a second, so your options for how to handle those deletions are enormous. Hell, if you really want, you can ask it to make you a pseudo-terminal that lets you type the old Linux commands to remove them.

          Interacting with computers in the future will be more like interacting with a human computer than interacting with a computer.

      • By clocker 2025-06-032:22

        > I want to be able to say "Delete all the screenshots on my Desktop", instead of going into a terminal and typing "rm ~/Desktop/*.png".

        Both are valid cases, but one cannot replace the other—just like elevators and stairs. The presence of an elevator doesn't eliminate the need for stairs.

      • By ofrzeta 2025-06-036:022 reply

        > I want to be able to say "Delete all the screenshots on my Desktop", instead of going into a terminal and typing "rm ~/Desktop/*.png".

        But why? It takes many more characters to type :)

        • By mrighele 2025-06-037:312 reply

            Because with the above command your assistant will delete snapshot-01.png and snapshot-02.jpeg, while avoiding deleting my-kids-birthday.png by mistake.

          • By sensanaty 2025-06-038:26

            Will it? With how they work I find it more likely to run a sudo rm -rf /* than anything else.

          • By GenshoTikamura 2025-06-038:40

            ChatGPT has just told me you should rather do `rm ~/Desktop/snapshot*.jpeg` in this case. I'm so impressed with this new shiny AI tech, I'd never be able to figure that out on my own!

      • By Disposal8433 2025-06-033:072 reply

        The junior will repeatedly ask the AI to delete the screenshots, until he forgets the command for deleting a file.

        The engineer will wonder why his desktop is filled with screenshots, change the settings that make it happen, and forget about it.

        That behavior happened for years before AI, but AI will make the problem exponentially worse. Or I do hope that was a bad example.

        • By jaredsohn 2025-06-033:52

          Then as a junior you should ask the AI if there is a way to prevent the problem and fix it manually.

          You might then argue that they don't know they should ask that; you could just configure the AI once to say you are a junior engineer, and that when you ask it to do something, you also want it to help you learn how to avoid such problems and prevent them from happening.

        • By calvinmorrison 2025-06-0312:06

          The command to delete a file is "chatgpt please delete this file". Or could you not imagine a world where we build layers on top of unlink or whatever syscalls are relevant?

      • By Workaccount2 2025-06-034:311 reply

        This is why, even if LLMs top out right now, there will still be a radical shift in how we interact with and use software going forward. There are still at least 5 years of implementation ahead even if nothing advances at all anymore.

        No one is ever going to want to touch a settings menu again.

        • By tsimionescu 2025-06-035:081 reply

          > No one is ever going to want to touch a settings menu again.

          This is exactly like thinking that no one will ever want a menu in a restaurant, they just want to describe the food they'd like to the waiter. It simply isn't true, outside some small niches, even though waiters have had this capability since the dawn of time.

          • By Workaccount2 2025-06-035:181 reply

            This is a good comparison, because using computers will be like having a waiter you can just say "no lettuce" to, rather than trying to figure out which way the dev team thought was the best way to subtract or add ingredients.

            • By olddustytrail 2025-06-0418:02

              You said: "no, lettuce"

              "Ok, a bowl of lettuce. That's a great, healthy choice!"

    • By Velorivox 2025-06-032:281 reply

      For me this moment came when Google Calendar first let you enter fuzzy text to get calendar events added; this was around 2011, I think. In any case, for the end user this can be made to happen even when the computer cannot actually handle fuzzy inputs (which is, of course, how an LLM works under the hood).
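
      (That pre-LLM fuzziness was rule-based rather than learned; Python's third-party dateparser library still works that way, as a sketch:)

          import dateparser  # third-party: pip install dateparser

          # A grammar of known patterns maps loose English onto an exact
          # datetime; anything outside the grammar simply returns None.
          for text in ["tomorrow at 3pm", "next friday", "in 2 weeks"]:
              print(text, "->", dateparser.parse(text))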

      The big change with LLMs seems to be that everyone now has an opinion on what programming/AI is and can do. I remember people behaving like that around stocks not that long ago…

      • By 0points 2025-06-037:401 reply

        > The big change with LLMs seems to be that everyone now has an opinion on what programming/AI is and can do

        True, but I think this is just the zeitgeist. People today want to share their dumb opinions about any complex subject after they've seen a 30-second reel.

        • By Velorivox 2025-06-0314:471 reply

          What will it take to get people to admit they don’t actually know what they’re talking about?

          The answer to that question lies at the bottom of a cup of hemlock.

          • By 0points 2025-06-048:10

            Well, for once, it would help if the LLMs they interact with, or heck, even the influencers they listen to, would lead the way.

            I'll be happy the day the LLM says "I don't know".

    • By cosmic_cheese 2025-06-0221:232 reply

      Though I haven’t embraced LLM codegen (except for non-functional filler/test data), the fuzziness is why I like to use them as talking documentation. It means a lot less fumbling around in the dark trying to figure out the magic combination of search keywords needed to surface the information, which can save a lot of time in aggregate.

      • By pixl97 2025-06-0221:301 reply

        Honestly, LLMs are a great canary for whether your documentation / language / whatever is 'good' at all.

        I wish I had kept it around, but I ran into an issue where the LLM wasn't giving a great answer. I looked at the documentation, and yeah, it made no sense. And all the forum stuff about it was people throwing out random guesses about how it should actually work.

        If you're a company that makes something even moderately popular and LLMs are producing really bad answers about it, one of two things is happening:

        1. You're a consulting company that makes its money by selling confused users solutions to your crappy product.

        2. Your documentation is confusing crap.

      • By skydhash 2025-06-032:25

        I've just gotten good at reading code, because that's the one constant you can rely on (unless you're using some licensed library). So whenever the reference isn't enough, I jump straight to the code (one of my latest examples was finding out that opendoas (a sudo replacement) hard-codes the persist option, for not asking for a password, to 5 minutes).

    • By wvenable 2025-06-030:39

    I literally pasted these two lines, sent to me by one of our sysadmins, into ChatGPT, and it told me exactly what I needed to know:

        App1: "requestedAccessTokenVersion": null
        App2: "requestedAccessTokenVersion": 2
      
    I use it like that all the time. In fact, I'm starting to give it less and less context and just toss stuff at it. It's a more efficient use of my time.

    • By vb-8448 2025-06-0313:53

      In my opinion, most of the problems we see now with LLMs come from being fuzzy ... I'm used to getting very good code from Claude or Gemini (copy and paste without any changes, and it just works), but I have to be very specific; sometimes it takes longer to write the prompt than it would to write the code itself.

      If I'm fuzzy, the output quality is usually low and I need several iterations before getting an acceptable result.

      At some point in the future, there will be some kind of formalization of how to ask SWE questions of LLMs ... and we will get another programming language to rule them all :D

    • By TwoFerMaggie 2025-06-0315:26

      It invalidates this CinemaSins nitpick on Alien completely

      https://youtu.be/dJtYDb7YaJ4?si=5NuoXaW0pkGoBSJu&t=76

    • By rullelito 2025-06-037:46

      To me this is the best thing about LLMs.

    • By jumploops 2025-06-033:55

      Computers finally work the way they were always supposed to work :)

    • By grishka 2025-06-030:442 reply

      But when I'm doing my job as a software developer, I don't want to be fuzzy. I want to be exact at telling the computer what to do, and for that, the most efficient way is still a programming language, not English. The only place where LLMs are an improvement is voice assistants. But voice assistants themselves are rather niche.

      • By dyauspitr 2025-06-033:121 reply

        I want to be fuzzy and I want the LLM to generate something exact.

        • By kennyloginz 2025-06-035:531 reply

          Is this sarcasm? I can’t tell anymore. Unless your ideas aren’t new, this is just impossible.

          • By dyauspitr 2025-06-0318:55

            Why? I want the LLM to understand my intent and build something exact. That already happens many times.

      • By robryan 2025-06-039:49

        It can get you 80% of the way there, and you can still be exacting in telling it where it went wrong or fine-tuning the result by hand.

    • By Barrin92 2025-06-0221:444 reply

      >simple fact that you can now be fuzzy with the input you give a computer, and get something meaningful in return

      I got into this profession precisely because I wanted to give precise instructions to a machine and get exactly what I want. It's worth reading Dijkstra, who anticipated this, and the foolishness of it, half a century ago:

      "Instead of regarding the obligation to use formal symbols as a burden, we should regard the convenience of using them as a privilege: thanks to them, school children can learn to do what in earlier days only genius could achieve. (This was evidently not understood by the author that wrote —in 1977— in the preface of a technical report that "even the standard symbols used for logical connectives have been avoided for the sake of clarity". The occurrence of that sentence suggests that the author's misunderstanding is not confined to him alone.) When all is said and told, the "naturalness" with which we use our native tongues boils down to the ease with which we can use them for making statements the nonsense of which is not obvious.[...]

      It may be illuminating to try to imagine what would have happened if, right from the start our native tongue would have been the only vehicle for the input into and the output from our information processing equipment. My considered guess is that history would, in a sense, have repeated itself, and that computer science would consist mainly of the indeed black art how to bootstrap from there to a sufficiently well-defined formal system. We would need all the intellect in the world to get the interface narrow enough to be usable"

      Welcome to prompt engineering and vibe coding in 2025, where you have to argue with your computer to produce a formal language, the very thing we invented in the first place so as not to have to argue in imprecise language.

      https://www.cs.utexas.edu/~EWD/transcriptions/EWD06xx/EWD667...

      • By vector_spaces 2025-06-0222:021 reply

        right: we don't use programming languages instead of natural language simply to make it hard. For the same reason, we use a restricted dialect of natural language when writing math proofs -- using constrained languages reduces ambiguity and provides guardrails for understanding. It gives us some hope of understanding the behavior of systems and having confidence in their outputs

        There are levels of this though -- there are few instances where you actually need formal correctness. For most software, the stakes just aren't that high, all you need is predictable behavior in the "happy path", and to be within some forgiving neighborhood of "correct".

        That said, those championing AI have done a very poor job at communicating the value of constrained languages, instead preferring to parrot this (decades and decades and decades old) dream of "specify systems in natural language"

        • By dboreham 2025-06-032:23

          Algebraic notation was a feature that took 1000+ years to arrive at. Beforehand mathematics was described in natural language. "The square on the hypotenuse..." etc.

      • By gdubs 2025-06-0222:251 reply

        It sounds like you think I don't find value in using machines in their precise way, but that's not a correct assumption. I love code! I love the algorithms and data structures of data science. I also love driving 5-speed transmissions and shooting on analog film – but it isn't always what's needed in a particular context or for a particular problem. There are lots of areas where a 'good enough solution done quickly' is way more valuable than a 100% correct and predictable solution.

        • By skydhash 2025-06-032:29

          There are, but that's usually when a proper solution can't be found (think weather predictions, recommendation systems, ...), not when we want precise answers and workflows (money transfers, displaying items in a shop, closing a program, ...).

      • By thom 2025-06-037:23

        That’s interesting. I got into computing because unlike school where wrong answers gave you indelible red ink and teachers had only finite time for questions, computers were infinitely patient and forgiving. I could experiment, be wrong, and fix things. Yes I appreciated that I could calculate precise answers but it was much more about the process of getting to those answers in an environment that encouraged experimentation. Years later I get huge value from LLMs, where I can ask exceedingly dumb questions to an indefatigable if slightly scatterbrained teacher. If I were smart enough, like Dijkstra, to be right first time about everything, I’d probably find them less useful, but sadly I need cajoling along the way.

      • By PeterHolzwarth 2025-06-031:291 reply

        "I got into this profession precisely because I wanted to give precise instructions to a machine and get exactly what I want."

        So you didn't get into this profession to be a lead then, eh?

        Because essentially, that's what Thomas in the article is describing (even if he doesn't realize it). He is a mini-lead with a team of a few junior and lower-mid-level engineers, all represented by the LLMs and agents he's built.

        • By plorkyeran 2025-06-033:23

          Yes, correct. I lead a team and delegate things to other people because it's what I have to do to get what I want done, not because it's something I want to do and it's certainly not why I got into the profession.

    • By progval 2025-06-0221:243 reply

      The other side of the coin is that if you give it a precise input, it will fuzzily interpret it as something else that is easier to solve.

      • By lechatonnoir 2025-06-035:31

        Well said, these things are actually in a tradeoff with each other. I feel like a lot of people somehow imagine that you could have the best of both, which is incoherent short of mind-reading + already having clear ideas in the first place.

        But thankfully we do have feedback/interactiveness to get around the downsides.

      • By pessimizer 2025-06-0222:33

        When you have a precise input, why give it to an LLM? When I have to do arithmetic, I use a calculator. I don't ask my coworker, who is generally pretty good at arithmetic, although I'd get the right answer 98% of the time. Instead, I use my coworker for questions that are less completely specified.

        Also, if it's an important piece of arithmetic, and I'm in a position where I need to ask my coworker rather than do it myself, I'd expect my coworker (and my AI) to grab (spawn) a calculator, too.

      • By BoorishBears 2025-06-0221:271 reply

        It will, or it might? Because if every time you use an LLM is misinterprets your input as something easier to solve, you might want to brush up on the fundamentals of the tool

        (I see some people are quite upset with the idea of having to mean what you say, but that's something that serves you well when interacting with people, LLMs, and even when programming computers.)

        • By progval 2025-06-0221:332 reply

          Might, of course. And in my experience it's what happens most times I ask an LLM to do something I can't trivially do myself.

          • By BoorishBears 2025-06-0221:491 reply

            Well everyone's experience is different, but that's been a pretty atypical failure mode in my experience.

            That being said, I don't primarily lean on LLMs for things I have no clue how to do, and I don't think I'd recommend that as the primary use case either at this point. As the article points out, LLMs are pretty useful for doing tedious things you know how to do.

              Add up enough "trivial" tasks and they can take up a non-trivial amount of energy. An LLM can help reduce some of the energy sapped so you can get to the harder, more important parts of the code.

            I also do my best to communicate clearly with LLMs: like I use words that mean what I intend to convey, not words that mean the opposite.

            • By jacobgkau 2025-06-0222:221 reply

              I use words that convey very clearly what I mean, such as "don't invent a function that doesn't exist in your next response" when asking what function a value is coming from. It says it understands, then proceeds to do what I specifically asked it not to do anyway.

              The fact that you're responding to someone who found AI non-useful with "you must be using words that are the opposite of what you really mean" makes your rebuttal come off as a little biased. Do you really think the chances of "they're playing opposite day" are higher than the chances of the tool not working well?

              • By BoorishBears 2025-06-0222:302 reply

                But that's exactly what I mean by brush up on the tool: "don't invent a function that doesn't exist in your next response" doesn't mean anything to an LLM.

                It implies you're continuing with a context window where it already hallucinated function calls, yet your fix is to give it an instruction that relies on a kind of introspection it can't really demonstrate.

                My fix in that situation would be to start a fresh context and provide as much relevant documentation as feasible. If that's not enough, then the LLM probably won't succeed for the API in question no matter how many iterations you try and it's best to move on.

                > ... makes your rebuttal come off as a little biased.

                Biased how? I don't personally benefit from them using AI. They used wording that was contrary to what they meant in the comment I'm responding to, that's why I brought up the possibility.

                • By jacobgkau 2025-06-0222:411 reply

                  > Biased how?

                  Biased as in I'm pretty sure he didn't write an AI prompt that was the "opposite" of what he wanted.

                  And generalizing something that "might" happen as something that "will" happen is not actually an "opposite," so calling it that (and then basing your assumption of that person's prompt-writing on that characterization) was a stretch.

                  • By BoorishBears 2025-06-032:291 reply

                    This honestly feels like a diversion from the actual point which you proved: for some class of issues with LLMs, the underlying problem is learning how to use the tool effectively.

                    If you really need me to educate you on the meaning of opposite...

                    "contrary to one another or to a thing specified"

                    or

                    "diametrically different (as in nature or character)"

                    Are two relevant definitions here.

                    Saying something will 100% happen, and saying something will sometimes happen are diametrically opposed statements and contrary to each other. A concept can (and often will) have multiple opposites.

                    -

                    But again, I'm not even holding them to that literal of a meaning.

                    If you told me even half the time you use an LLM the result is that it solves a completely different but simpler version of what you asked, my advice would still be to brush up on how to work with LLMs before diving in.

                    I'm really not sure why that's such a point of contention.

                    • By jacobgkau 2025-06-041:511 reply

                      > Saying something will 100% happen, and saying something will sometimes happen are diametrically opposed statements and contrary to each other.

                      No. Saying something will 100% happen and saying something will 100% not happen are diametrically opposed. You can't just call every non-equal statement "diametrically opposed" on the basis that they aren't equal. That ignores the "diametrically" part.

                      If you wanted to say "I use words that mean what I intend to convey, not words that mean something similar," that would've been fair. Instead, you brought the word "opposite" in, misrepresenting what had been said and suggesting you'll stretch the truth to make your point. That's where the sense of bias came from. (You also pointlessly left "what I intend to convey" in to try and make your argument appear softer, when the entire point you're making is that "what you intend" isn't good enough and one apparently needs to be exact instead.)

                      • By BoorishBears 2025-06-0418:431 reply

                        This word soup doesn't get to redefine the word opposite, but you're free to keep trying.

                        Cute that you've now written at least 200 words trying to divert the conversation though, and not a single word to actually address your demonstration of the opposite of understanding how the tools you use work.

                        • By jacobgkau 2025-06-189:17

                          The entire premise of my first reply to you was that your hyperbole invalidated your position. If either of us diverted the conversation, it was you.

                          One of your replies to me included the statement "the LLM probably won't succeed for the API in question no matter how many iterations you try and it's best to move on" (i.e. don't do the work or don't use AI to do it). Yet you continue to repeat that it's my (and everyone else's) lack of understanding that's somehow the problem, not conceding that AI being unable to perform certain tasks is a valid point of skepticism.

                          > This word soup doesn't get to redefine the word opposite,

                          You're the one trying to redefine the word "opposite" to mean "any two things that aren't identical."

                • By lechatonnoir 2025-06-035:29

                  Well said about the fact that they can't introspect, and I agree with your tip about starting with fresh context, and about when to give up.

                  I feel like this thread is full of strawmen from people who want to come up with reasons they shouldn't try to use this tool for what it's good at, and figure out ways to deal with the failure cases.

          • By khasan222 2025-06-0222:30

            I find this very much depends on the model and the instructions you give the LLM. You can also use other instructions to check the output and have it try again. It definitely struggles with larger codebases, but the power is there.

            My favorite instruction is: using component A as an example, make component B.

    • By Corey_ 2025-06-039:17

      [dead]

    • By few 2025-06-032:122 reply

      >On two occasions, I have been asked [by members of Parliament], 'Pray, Mr. Babbage, if you put into the machine wrong figures, will the right answers come out?' I am not able to rightly apprehend the kind of confusion of ideas that could provoke such a question. - Charles Babbage

      This quote did not age well

      • By snowwrestler 2025-06-035:17

        Now with LLMs, you can put in the right figures and the wrong answers might come out.

      • By guelo 2025-06-032:16

        not if you consider how confused our ideas are today

    • By dogcomplex 2025-06-037:28

      If anything we now need to unlearn the rigidity - being too formal can make the AI overly focused on certain aspects, and is in general poor UX. You can always tell legacy man-made code because it is extremely inflexible and requires the user to know terminology and usage implicitly lest it break, hard.

      For once, as developers, we are actually using computers the way normal people always wished they worked, before being turned away frustrated. We now need to blend our precise formal approach with these capabilities to make it all actually work the way it always should have.

HackerNews