It’s a bit strange how anecdotes have become acceptable fuel for 1000-comment technical debates.
I’ve always liked the quote that sufficiently advanced tech looks like magic, but it’s a mistake to assume that things that look like magic also share the other properties of magic. They don’t.
Software engineering spans several distinct skills: forming logical plans, encoding them in machine-executable form (coding), making them readable and extensible by other humans (to scale engineering), and constantly navigating tradeoffs like performance, maintainability, and org constraints as requirements evolve.
LLMs are very good at some of these, especially instruction following within well-known methodologies. That’s real progress, and it will be productized sooner rather than later, with concrete use cases, ROI, and clearly defined end users.
Yet I’d love to see less discussion driven by anecdotes and more about productizing these tools: where they work, usage methodologies, missing tooling, KPIs for specific use cases. And don’t get me started on current evaluation frameworks; they become increasingly irrelevant once models are good enough at instruction following.
We should get back to the basic definition of the engineering job. An engineer understands requirements, translates them into logical flows that can be automated, communicates tradeoffs across the organization, and makes tradeoff calls on maintainability, extensibility, readability, and security. Most importantly, they’re accountable for the outcome, because many tradeoffs only reveal their cost once they hit reality.
None of this is covered by code generation, nor by juniors submitting random PRs. Those are symptoms of juniors (and not only juniors) missing fundamentals. When we forget what the job actually is, we create misalignment with junior engineers and end up with weird ideas like "spec-driven development".
If anything, coding agents are a wake-up call that clarifies what the engineering profession is really about.
Skills are a pretty awkward abstraction. They emerged to patch a real problem: generic models require fine-tuning via context, which quickly leads to bloated context files and context dilution (i.e., more hallucinations).
But skills don’t really solve the problem. Turning that workaround into a standard feels strange. Standardizing a patch isn’t something I’d expect from Anthropic; it’s unclear what their endgame is here.
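To make the mechanism concrete, here’s a minimal sketch of the pattern as I understand it. All names and types are illustrative, not Anthropic’s actual API: only a one-line description of each skill stays in context, and full instructions are loaded on demand.

```typescript
// Hypothetical sketch of the skills pattern (not Anthropic's API):
// keep a one-line description of each skill in context, and pull in
// full instructions only when a task seems to need them.

interface Skill {
  name: string;
  description: string;              // always in context (cheap)
  loadBody: () => Promise<string>;  // full instructions, loaded on demand
}

async function buildContext(skills: Skill[], task: string): Promise<string> {
  // Phase 1: advertise every skill with a single line, keeping context small.
  const index = skills.map((s) => `- ${s.name}: ${s.description}`).join("\n");

  // Phase 2: load full bodies only for skills the task appears to need.
  // (A real system would let the model decide; a substring match keeps
  // this sketch self-contained.)
  const relevant = skills.filter((s) => task.toLowerCase().includes(s.name));
  const bodies = await Promise.all(relevant.map((s) => s.loadBody()));

  return [`Available skills:\n${index}`, ...bodies].join("\n\n");
}
```

Note the patch-like quality: the relevance selection just moves the context problem into the loader, and the index itself keeps growing.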
LLMs get over-analyzed. They’re predictive text models trained to match patterns in their data: statistical algorithms, not brains, not systems with “psychology” in any human sense.
Agents, however, are products. They should have clear UX boundaries: show what context they’re using, communicate uncertainty, validate outputs where possible, and expose performance so users can understand when and why they fail.
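As a sketch of what those boundaries could look like in practice (hypothetical types, not any real product’s API), provenance, uncertainty, and validation become part of the output contract rather than an afterthought:

```typescript
// Hypothetical agent output contract (illustrative names only):
// the answer carries its own provenance, uncertainty, and validation status.

interface AgentAnswer {
  answer: string;
  contextUsed: string[];                  // sources that actually informed the answer
  confidence: "high" | "medium" | "low";  // communicated, not hidden
  validated: boolean;                     // did tests / schema checks / linters run?
  validationDetails?: string;
}

function render(a: AgentAnswer): string {
  // Surface provenance and uncertainty instead of burying them.
  return [
    a.answer,
    `Sources: ${a.contextUsed.length ? a.contextUsed.join(", ") : "none recorded"}`,
    `Confidence: ${a.confidence}`,
    a.validated ? `Validated: ${a.validationDetails ?? "yes"}` : "Not validated",
  ].join("\n");
}
```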
IMO the real issue is that raw, general-purpose models were released directly to consumers. That normalized under-specified consumer products and created the expectation that users would interpret model behavior, define their own success criteria, and manually handle edge cases, sometimes with severe real-world consequences.
I’m sure the market will fix itself with time, but I hope more people learn when not to use these half-baked AGI “products”.