Ask HN: Do you have any evidence that agentic coding works?

2026-01-20 12:45 · 461 points · 455 comments

I've been trying to get agentic coding to work, but the dissonance between what I'm seeing online and what I'm able to achieve is doing my head in.

Is there real evidence, beyond hype, that agentic coding produces net-positive results? If any of you have actually got it to work, could you share (in detail) how you did it?

By "getting it to work" I mean: * creating more value than technical debt, and * producing code that’s structurally sound enough for someone responsible for the architecture to sign off on.

Lately I’ve seen a push toward minimal or nonexistent code review, with the claim that we should move from “validating architecture” to “validating behavior.” In practice, this seems to mean: don’t look at the code; if tests and CI pass, ship it. I can’t see how this holds up long-term. My expectation is that you end up with "spaghetti" code that works on the happy path but accumulates subtle, hard-to-debug failures over time.

When I tried using Codex on my existing codebases, with or without guardrails, half of my time went into fixing the subtle mistakes it made or the duplication it introduced.

Last weekend I tried building an iOS app for pet feeding reminders from scratch. I instructed Codex to research and propose an architectural blueprint for SwiftUI first. Then, I worked with it to write a spec describing what should be implemented and how.

The first implementation pass was surprisingly good, although it had a number of bugs. Things went downhill fast, however. I spent the rest of my weekend getting Codex to make things work, fix bugs without introducing new ones, and research best practices instead of making stuff up. Although I made it record new guidelines and guardrails as I found them, things didn't improve. In the end I just gave up.

I personally can't accept shipping unreviewed code. It feels wrong. The product has to work, but the code must also be high-quality.


Comments

  • By xsh6942 2026-01-21 9:55 · 6 replies

    It really depends on what you mean by "it works". A retrospective of the last 6 months:

    I've had great success coding infra (Terraform). It at least 10x'd the generation of code that is easily verifiable but tedious to write. Results were audited to death, as the client was highly regulated.

    Professional feature dev is hit and miss for sure, although it's getting better and better. We're nowhere near full agentic coding. However, by reinvesting the speed gains from not writing boilerplate into devex, tests and security, I ship much better quality software: maintainable and a joy to work with.

    I suddenly have the homelab of my dreams, all the ideas previously in the "too long to execute" category now get vibe coded while watching TV or doing other stuff.

    As an old jaded engineer, everything code was getting a bit boring and repetitive (so many rest APIs). I guess you get the most value out of it when you know exactly what you want.

    Most importantly though, and I've heard this from a few other seniors: I've found joy in making cool fun things with tech again. I like that new way of creating stuff at the speed of thought, and I guess for me that counts as "it works"

    • By raphaelj 2026-01-21 11:07

      Same experience here.

      On some tasks like build scripts, infra and CI stuff, I am getting a significant speedup. Maybe I am 2x faster on these tasks, when measured from start to PR.

      I am working on an HPC project[1] that requires more careful architectural thinking. Letting the LLM do the whole task most often fails, or produces low-quality code (even with top models like Opus 4.5).

      What works well, though, is "assisted" coding. I usually write the interface code (e.g. headers in C++) with some help from the agent, and then let the LLM do the actual implementation of these functions/methods. Then I do final adjustments. Writing a good AGENTS.md helps a lot. I might be 30% faster on these tasks.

      It seems to match what I see from the PRs I am reviewing: we are getting these slightly more often than before.

      ---

      [1] https://github.com/finos/opengris-scaler

    • By BrandoElFollito 2026-01-21 10:18 · 1 reply

      > I guess you get the most value out of it when you know exactly what you want.

      Oh yes. I have been an amateur developer for 35 years, and when I vibe code I let the basic, generic stuff happen and then tell the AI to refactor the way I want. It usually works.

      I had the same "too boring to code" feeling, and AI was a revelation. It takes away the typing but, when used correctly, leaves room for the creative part. I love this.

      • By spopejoy 2026-01-21 15:45

        The OP question was about agentic utility specifically. I've also gotten great side-project utility from AI codegen, without having to marry my project to CC or give up on looking at the code, by simply prompting whatever LLM when I need something.

        Nothing wrong with CC, but I keep hearing the same kind of app being built -- home automation, side-project CRUD.

        What I'm deeply skeptical of is the ability of agentic workflows to integrate with a team maintaining and shipping a critical offering. If you're using LLMs for one-off PRs, great, but then agentic seems like a band-aid for memory etc.

        Meanwhile, if you're full CC/agentic, it seems like a team would get out of sync.

    • By theshrike79 2026-01-21 11:32 · 1 reply

      > I suddenly have the homelab of my dreams, all the ideas previously in the "too long to execute" category now get vibe coded while watching TV or doing other stuff.

      This is the true game changer.

      I have a large-ish NAS that's not very well organised (I'm trying; it's a consolidated mess of different sources from two decades - at least it's all in the same place now).

      It was faster to ask Claude to write me a search database backend + frontend than to click through the directories, waiting for the slow SMB shares to update, to find that one file I knew was in there.

      Now I have a Go backend that crawls my NAS every night, indexing files into an FTS5 SQLite database with minimal metadata (size + mimetype + mtime/ctime), and a simple web frontend I can use to query the database.
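A minimal sketch of that kind of FTS5 index (shown in Python for brevity rather than Go; the table and column names here are illustrative, not the actual schema):

```python
import sqlite3

# Full-text search over the path; the minimal metadata (size + mimetype
# + mtime) is stored but not tokenized, thanks to UNINDEXED.
con = sqlite3.connect(":memory:")
con.execute("""
    CREATE VIRTUAL TABLE files USING fts5(
        path, mimetype, size UNINDEXED, mtime UNINDEXED
    )
""")

def index_file(path, mimetype, size, mtime):
    con.execute("INSERT INTO files VALUES (?, ?, ?, ?)",
                (path, mimetype, size, mtime))

def search(query):
    # MATCH does the full-text lookup; rank orders results by relevance.
    return con.execute(
        "SELECT path FROM files WHERE files MATCH ? ORDER BY rank",
        (query,)).fetchall()

index_file("/nas/photos/2019/holiday.jpg", "image/jpeg", 123456, 1577836800)
index_file("/nas/docs/tax-2019.pdf", "application/pdf", 9876, 1577836800)
```

A web frontend and a CLI tool can share the schema because both just issue the same MATCH query against the same database file.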

      ...actually I kinda want a cli search tool that uses the same schema. Brb.

      Done.

      AI might be a bubble etc., but I'll still have that search tool (and two dozen other utilities) in 5 years, when a Claude monthly subscription costs 2000€ and includes a right to harvest your organs on non-payment.

      • By martinosis 2026-01-23 17:41 · 1 reply

        This is exactly where LLMs shine. But when you get to a larger project, everything falls apart for me: most of the time the application gets way too complex because the LLM tries to guess what you want. That's OK for a small project but quite bad for larger ones.

        • By theshrike79 2026-01-23 17:58

          Depends on so many things, like the definition of “large”, what you’re asking the LLM to do, and how the project is set up for LLM use.

          It doesn’t need to guess if it has the tools and documentation available.

    • By donw 2026-01-21 10:09

      Same here. You have to slice things small enough for the agent to execute effectively, but beyond that, it’s magic.

    • By hahahahhaah 2026-01-26 2:53

      Terraform is a great use case:

      * Unrefactorable and highly boilerplatey

      * Probably too big a job, and too low impact, to rewrite as IaC

      * AI can do all that tedious plumbing well

      * Since the result is a deployment, not executable code, it suffices to check that the correct resources are created.

    • By andy_ppp 2026-01-21 10:17 · 1 reply

      I honestly find AI quite poor at writing good, well-thought-through tests, potentially because:

      1. writing testable code is part of writing good tests

      2. testing is done poorly across the training data, because humans are also bad at writing tests

      3. tests should be focused on business logic and describing the application, rather than arbitrarily testing things in an uncanny valley of AI slop

      • By theshrike79 2026-01-21 11:40 · 1 reply

        When vibe coding/engineering, I don't think of tests the same way as when testing human-written code.

        I use unit tests to "lock down" current behavior so an agent rummaging around feature F doesn't break features A and B and will get immediate feedback if that happens.

        I'm not trying to match every edge case; I focus more on end-to-end tests where input and output are locked golden files. "If this comes in, this exact thing must come out the other end" type of thing.
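The golden-file pattern above can be sketched like this (a Python sketch; `transform` is a hypothetical stand-in for whatever pipeline the agent is allowed to modify):

```python
import tempfile
from pathlib import Path

def transform(text: str) -> str:
    # Hypothetical stand-in for the feature under test.
    return text.strip().upper()

def check_golden(input_path: Path, golden_path: Path) -> None:
    # "If this comes in, this exact thing must come out the other end."
    actual = transform(input_path.read_text())
    expected = golden_path.read_text()
    assert actual == expected, f"output drifted from {golden_path.name}"

# Lock down one input/output pair (a real suite would commit these files).
with tempfile.TemporaryDirectory() as d:
    inp, golden = Path(d, "case1.in"), Path(d, "case1.golden")
    inp.write_text("  hello agent  ")
    golden.write_text("HELLO AGENT")
    check_golden(inp, golden)  # passes; a behavior change would raise
```

An agent that changes the observable behavior of `transform` trips the assertion immediately, which is exactly the feedback loop described above.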

        The AI can figure out what went wrong if the tests fail.

        • By andy_ppp 2026-01-21 14:50

          Yeah, I need to start accepting that, to some degree, the world has changed. In the past, when I wanted to understand a system, I'd read the tests; with AI I can just ask Cursor to explain what the code is doing, and it's fairly good at explaining the functionality to me.

          I'm not sure I feel truly comfortable yet with huge blocks of code that aren't cleanly understood by humans, but it's happening whether I like it or not.

  • By resonious 2026-01-21 6:43 · 6 replies

    I think one fatal flaw is letting the agent build the app from scratch. I've had huge success with agents, but only on existing apps that were architected by humans and have established conventions and guardrails. Agents are really bad at architecture, but quite good at following suit.

    Other things that seem to contribute to success with agents are:

    - Static type systems (not tacked-on like Typescript)

    - A test suite where the tests cover large swaths of code (i.e. not just unit testing individual functions; you want e2e-style tests, but not the flaky browser kind)

    With all the above boxes ticked, I can get away with doing only "sampled" reviews, i.e. I don't review every single change, but I do review some of them. And if I find anything weird that I missed in a previous change, I tell it to fix it and give the fix a full review. For architectural changes, I plan the change myself, start working on it, then tell the agent to finish.

    • By BatteryMountain 2026-01-21 9:11

      C# works great for agents, thanks to established patterns, a strict compiler and strong typing, the "Treat Warnings as Errors" compiler flag, and an .editorconfig with many rules and enforcement of them. You have to tell it to use async where possible, to do proper error handling and logging, to put XML comments above complex methods, and so on. It works really well once you've got it figured out. It also helps to give it separate but focused tasks, so I have a todo.txt file it can read to keep track of tasks. Basically, you have to be strict with it. I can't imagine how people trust outputs for Python/JavaScript, where there's no strong typing or compiler involved; maybe some linting rules can save you. Maybe TypeScript with strict mode can work, but then you have to be a purist about it and watch it like a hawk, which will drain you fast. C# + Claude Code works really well.
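The compiler-side strictness described above boils down to a few csproj properties (real MSBuild property names; the detailed style rules would live in .editorconfig):

```xml
<!-- csproj fragment: make the compiler do the policing -->
<PropertyGroup>
  <Nullable>enable</Nullable>
  <TreatWarningsAsErrors>true</TreatWarningsAsErrors>
  <!-- surface .editorconfig style rules as build diagnostics -->
  <EnforceCodeStyleInBuild>true</EnforceCodeStyleInBuild>
</PropertyGroup>
```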

    • By tcgv 2026-01-21 12:56

      Upvote.

      That's my experience too. Agentic coding works really well on existing codebases that are well structured and organized. If your codebase is mostly spaghetti (no clear boundaries, no clear architecture), then agents won't be of much help. They'll also suffer working in those codebases and produce mediocre results.

      Regarding building apps and systems from scratch with agents, I also find it more challenging. You can make it work, but you'll have to provide much more "spec" to the agent to get a good result (and "good" here is subjective). Agents excel at tasks with a narrower scope and clear objectives.

      The best use case for coding agents is tasks that you'd be comfortable coding yourself, where you can write clear instructions about what you expect, and you can review the result (and even make minor adjustments if necessary before shipping it). This is where I see clear efficiency gains.

    • By nl 2026-01-21 8:40 · 5 replies

      TypeScript is a great type system for agents to use. It's expressive, and the compiler is much faster than Rust's, so turnaround is much quicker.

      I'm slowly accepting that Python's optional typing is a mistake with AI agents, and with human coders too. It's too easy for a type to be wrong, and if someone doesn't have type checking turned on, that mistake propagates.

      • By maleldil 2026-01-21 13:42

        > I'm slowly accepting that Python's optional typing is mistake with AI agents

        Don't make it optional, then. Use pyright or mypy in strict mode. Make it part of your lint task, have the agent run lint often, forbid it from using `type: ignore`, and review every `Any` and `cast` usage.

        If you're using CI, make a type error cause the job to fail.

        It's not the same as using a language with a proper type system (e.g. Rust), but it's a big step in the right direction.
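Concretely, that setup is only a few lines of pyproject.toml (these are real mypy option names; tune per project):

```toml
[tool.mypy]
strict = true                  # bundles disallow_untyped_defs, warn_return_any, etc.
disallow_any_explicit = true   # makes every explicit Any a reviewable event
warn_unused_ignores = true     # flags stale `type: ignore` comments
```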

      • By davidfstr 2026-01-21 9:07

        You should not be using Python types without a type checker in use to enforce them.

        With a type checker on, types are fantastic for catching missed cases early.

      • By K0IN 2026-01-21 9:04

        Same for TypeScript: by default you still have `any`. The best case (for humans and LLMs) is a strict linter that gives you feedback on what is wrong. But then you or the AI has to know to set it up (I've seen this a couple of times with inexperienced devs): write a strict linter config and use it. Someone without much coding knowledge may not know such a config exists, and so never asks for one.
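On the compiler side, that strictness is a tsconfig fragment like this (all real compiler options; the linter config comes on top):

```jsonc
{
  "compilerOptions": {
    "strict": true,                   // turns on noImplicitAny among others
    "noUncheckedIndexedAccess": true, // not part of strict; catches unchecked indexing
    "noImplicitOverride": true
  }
}
```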

      • By resonious 2026-01-21 11:30

        Whenever I have an agent use TypeScript, it always casts things to `any` and circumvents the types wherever convenient. And sometimes it doesn't even compile the code - it just runs it through Bun or similar.

        I know I can configure tools and claude.md to fix this stuff but it's a drag when I could just use a language that doesn't have these problems to begin with.

      • By embedding-shape 2026-01-21 11:13

        > I'm slowly accepting that Python's optional typing is mistake with AI agents, especially with human coders too. It's too easy for a type to be wrong and if someone doesn't have typechecking turned on that mistake propagates.

        How would you end up using types but not have any type checking? What's the point of the types?

    • By theshrike79 2026-01-21 11:43

      I've found Go to be the most efficient language with LLMs

      The language is "small": it has very few keywords and hasn't changed much in a decade. It also has a built-in testing system with well-known patterns for using it properly.

      Along with robust linters I can be pretty confident LLMs can't mess up too badly.

      They do tend to overcomplicate structures a bit, and require a fresh context plus "see if you can simplify this" or "make those three implement a generic interface" type prompts to tear down some of the repetition and complexity - but again, that's pretty easy with a simple language.

    • By barnabee 2026-01-21 17:26

      I’m currently experimenting (alongside working as usual) with a reasonably non-trivial Rust project that will be designed, “project managed”[0], built, and tested by LLM agents (mostly Claude, via OpenCode), based on me providing high-level requirements and then prompting it to complete things, as well as course-correcting (rule: I don’t edit the code, specifications, or tasks directly).

      It’s too early to tell how it will work out, but things are going better than I expected. It’s probably 20% built after a couple of days, during which I’ve mostly done other work, and it’s working for quite long periods without input from me.

      When I do have to provide input, the prompt is often just “Continue working according to the project standards and rules”.

      I have no idea if it’ll meet the requirements. I didn’t expect it to get this far, but a month or two ago I didn’t think the chances were high enough to even make it worth trying.

      [0] I asked it to create additional documentation for project standards and rules to refer to only when needed (referenced from AGENTS.md). This included the git workflow, maintaining a set of specifications, and an overall ROADMAP.md, as well as TASKS.md (detailed next steps from the roadmap) and STATUS.md (the status of each task).

    • By malloryerik 2026-01-21 12:54 · 1 reply

      I've found good results with Clojure and Elixir despite them being dynamic and niche.

      • By spopejoy 2026-01-21 15:52 · 1 reply

        Not really production level or agentic, but I've been impressed with LLMs for Haskell.

        I think that while these langs are "niche" they still have quality web resources and codebases available for training.

        I worry about new languages though. I guess maybe model training with synthetic data will become a requirement?

        • By dysoco 2026-01-21 17:54 · 1 reply

          > I worry about new languages though. I guess maybe model training with synthetic data will become a requirement?

          I read a (rather pessimistic) comment here yesterday claiming that the current generation of languages is most likely going to be the last, since the already existing corpus of code for training is going to trump any other possible feature the new language might introduce, and most of the code will be LLM generated anyways.

          • By malloryerik 2026-01-22 9:57 · 1 reply

            I've wondered to myself here and there if new languages wouldn't be specifically written for LLM agentic coding, and what that might look like.

            • By spopejoy 2026-01-29 2:26

              I had the thought of an AI-specific bytecode a while ago, but since then it's seemed a little silly -- the only langs that work well with agentic coding are the major ones with big open-source corpuses and SO/reddit discussions to train on.

              I also saw something about a bytecode for prompts, which again seems to miss the point -- natural language is the win here.

              What is kind of mysterious about the whole thing is that LLMs aren't compilers, yet they grok code really well. It's always been a mystery to me that tools weren't smarter, and then with LLMs the tooling became smarter than the compiler, and yet... if it actually were a compiler, we could choose to instruct it with code and get deterministic results. Something about the chaos is the very value they provide.

  • By defatigable 2026-01-21 1:58 · 8 replies

    I use Augment with Claude Opus 4.5 every day at my job. I barely ever write code by hand anymore. I don't blindly accept the code that it writes; I iterate with it. We review code at my work. I have absolutely found a lot of benefit from my tools.

    I've implemented several medium-scale projects that I anticipate would have taken 1-2 weeks manually, and took a day or so using agentic tools.

    A few very concrete advantages I've found:

    * I can spin up several agents in parallel and cycle between them. Reviewing the output of one while the others crank away.

    * It's greatly improved my ability in languages I'm not expert in. For example, I wrote a Chrome extension which I've maintained for a decade or so. I'm quite weak in Javascript. I pointed Antigravity at it and gave it a very open-ended prompt (basically, "improve this extension"), and in about five minutes it vastly improved the quality of the extension (better UI, performance, removed dependencies). The improvements may have been easy for someone expert in JS, but I'm not.

    Here's the approach I follow that works pretty well:

    1. Tell the agent your spec, as clearly as possible. Tell the agent to analyze the code and make a plan based on your spec. Tell the agent to not make any changes without consulting you.

    2. Iterate on the plan with the agent until you think it's a good idea.

    3. Have the agent implement your plan step by step. Tell the agent to pause and get your input between each step.

    4. Between each step, look at what the agent did and tell it to make any corrections or modifications to the plan you notice. (I find that it helps to remind them what the overall plan is because sometimes they forget...).

    5. Once the code is completed (or even between each step), I like to run a code-cleanup subagent that maintains the logic but improves style (factors out magic constants, helper functions, etc.)

    This works quite well for me. Since these are text-based interfaces, I find that clarity of prose makes a big difference. Being very careful and explicit about the spec you provide to the agent is crucial.

    • By marcus_holmes 2026-01-21 8:05 · 1 reply

      This. I use it for coding in a Rails app when I'm not a Ruby expert. I can read the code, but writing it is painful, and so having the LLM write the code is beneficial. It's definitely faster than if I was writing the code, and probably produces better code than I would write.

      I've been a professional software developer for >30 years, and this is the biggest revolution I've seen in the industry. It is going to change everything we do. There will be winners and losers, and we will make a lot of mistakes, as usual, but I'm optimistic about the outcome.

      • By defatigable 2026-01-21 8:16 · 1 reply

        Agreed. In the domains where I'm an expert, it's a nice productivity boost. In the domains where I'm not, it's transformative.

        As a complete aside from the question of productivity, these coding tools have reawakened a love of programming in me. I've been coding for long enough that the nitty-gritty of everyday programming just feels like a slog: deciphering compiler errors, fixing type checking issues, factoring out helper functions, whatever. With these tools, I get to think about code at a much higher level. I create designs and high-level ideas, and the AI does all the annoying detail work.

        I'm sure there are other people for whom those tasks feel like an interesting and satisfying puzzle, but for me it's been very liberating to escape from them.

        • By ZitchDog 2026-01-21 15:38 · 1 reply

          > In the domains where I'm an expert, it's a nice productivity boost. In the domains where I'm not, it's transformative.

          Is it possible that the code you are writing isn't good, but you don't know it because you're not an expert?

          • By defatigable 2026-01-21 19:19

            No, I'm quite confident that I'm very strong in these languages. Certainly not world-class but I write very good code and I know well-written code when I see it.

            If you'd like some evidence, I literally just flipped a feature flag to change how we use queues to orchestrate workflows. The bulk of this new feature was introduced in a 1300-line PR, touching at least four different services, written in Golang and Python. It was very much AI agent driven using the flow I described. Enabling the feature worked the first time without a hiccup.

            (To forestall the inevitable quibble, I am aware that very large PRs are against best practice and it's preferable to use smaller, stacked PRs. In this case for clarity purposes and atomicity of rollbacks I judged it preferable to use a single large PR.)

    • By jesse__ 2026-01-21 6:53 · 2 replies

      > I've implemented several medium-scale projects that I anticipate would have taken 1-2 weeks manually

      A 1-week project is a medium-scale project?! That's tiny, dude. A medium project for me is like 3 months of 12h days.

      • By defatigable 2026-01-21 7:10 · 1 reply

        You are welcome to use whatever definition of "small/medium/large" you like. Like you, 1-2 weeks is also far from the largest project I've worked on. I don't think that's particularly relevant to the point of my post.

        The point that I'm trying to emphasize is that I've had success with it on projects of some scale, where you are implementing (e.g.) multiple related PRs in different services. I'm not just using it on very tightly scoped tasks like "implement this function".

        • By jesse__ 2026-01-21 17:20 · 1 reply

          I mean, if it's working for you, great.

          The observation I was trying to make is that at the scope of one week, there's very little you actually get done, and it's likely mostly mechanical work. Given that, I suppose I'm unsurprised LLMs are proving useful. Seems like that's the type of thing they're excelling at.

          • By defatigable 2026-01-21 18:22

            That's not my experience. I agree that a project of any real size takes quite a bit longer than a week. But it's composed of lots of, well, week or two long subprojects. And if the AI coding tool is condensing week long projects into a day, that's a huge benefit.

            Concretely speaking (well, as concretely as I feel like being without piercing pseudonymity), at my last job I worked on a multi-year rewrite of one of our core services. Within that rewrite were a ton of much smaller projects that were a few weeks to a month long - refactor this algorithm, improve the load balancing, add a new sharding strategy, etc. An AI tool would definitely not have sped up the whole process. It's not going to, say, speed up figuring out and handling intra-team dependencies or figuring out product design. But speeding up those smaller coding subprojects would have been a huge benefit.

            I'm not making any strong claims in my post. I don't have the experience of AI projects allowing me to one shot large projects. But OP asked if anyone has concrete experience with AI coding tools speeding up development, and the answer is yes, I do.

      • By drewstiff 2026-01-21 9:05

        Well a medium project for me takes 3 years, so obviously I am the best out of everyone /s

    • By monkeydust 2026-01-21 8:48 · 1 reply

      Re: 1 and 2, i.e. creating a spec that is the source of truth (spec-driven development): in our experience, that is key to getting anything production-grade.

      • By defatigable 2026-01-21 14:37

        Yes. This was the key thing I learned that let me set the agents loose on larger tasks. Before I started iterating on specs with them, I mostly had them doing very small scale, refactor-this-function style tasks.

        The other advice I've read, which I haven't yet internalized as much, is to take an "adversarial" approach with the LLMs: give them a rigid framework they have to code against. So, e.g., generate tests the code has to pass, or sample output the code has to match perfectly. My agents do write tests as part of their work, and I use them to verify correctness, but I haven't updated my flow to emphasize that the agents should start with those, and iterate on them, before working on the main implementation.

    • By laserlight 2026-01-21 12:41 · 1 reply

      I wouldn't consider the proposed workflow agentic. When you review each step and give feedback after each step, it's simply development with LLMs.

      • By defatigable 2026-01-21 14:31 · 1 reply

        Interesting. What would make the workflow "agentic" in your mind? The AI implementing the task fully autonomously, never getting any human feedback?

        To me "agentic" in this context essentially that the LLM has the ability to operate autonomously, so execute tools on my behalf, etc. So for example my coding agents will often run unit tests, run code generation tools, etc. I've even used my agents to fix issues with git pre-commit hooks, in which case they've operated in a loop, repeatedly trying to check in code and fixing errors they see in the output.

        So in that sense they are theoretically capable of one-shot implementing any task I set them to; their quality is just not good enough yet to trust them with that. But maybe you mean something different?

        • By laserlight 2026-01-21 16:31 · 1 reply

          IMHO, an agentic workflow is the autonomous execution of a detailed plan. Back-and-forth between the LLM and the developer is fine in the planning stage. Then the agent is supposed to overcome any difficulties or devise solutions to unplanned situations. Otherwise, Cursor was already able to develop in a tight loop of writing and running tests, followed by fixing bugs, before “agentic” became a buzzword.

          Perhaps “agentic” initially referred to this simple loop, but the milestone was achieved so quickly that the meaning shifted. Regardless, I could be wrong.

          • By defatigable 2026-01-21 18:59

            Yeah, I have no idea what the consensus definition of the term is, and I suppose I can't say for sure what OP meant. I haven't used Cursor. My understanding was that it exercises IDE functions but does not execute arbitrary shell commands, maybe I'm wrong. I've specifically had good experiences with the tools being able to run arbitrary commands (like the git debugging example I mentioned).

            In my experience reading discussions like this, people seem to be saying that they don't believe that Claude Code and similar tools provide much of a productivity boost on relatively open ended domains (i.e. the AI is driving the writing of the code, not just assisting you in writing your own code faster). And that's certainly not my experience.

            I agree with you that success with the initial milestone ("agent operates in a self-contained loop and can execute arbitrary commands") was achieved pretty quickly. But in my experience a lot of people don't believe this. :-)

    • By mountainriver 2026-01-23 5:08

      Same, Opus 4.5 is nothing short of amazing. I’m really shocked to see so many posts claiming it doesn’t work.

      We write whole full scale Rust SaaS apps with few regressions.

    I do novel machine learning research in about 1/10 of the time it would have taken me.

    A big thing is telling it to log excessively, so it can see the execution.

    • By tkgally 2026-01-21 3:08 · 1 reply

      Great advice.

      > Tell the agent your spec, as clearly as possible.

      I have recently added a step before that when beginning a project with Claude Code: invoke the AskUserQuestionTool and have it ask me questions about what I want to do and what approaches I prefer. It helps to clarify my thinking, and the specs it then produces are much better than if I had written them myself.

      I should note, though, that I am a pure vibe coder. I don't understand any programming language well enough to identify problems in code by looking at it. When I want to check whether working code produced by Claude might still contain bugs, I have Gemini and Codex check it as well. They always find problems, which I then ask Claude to fix.

      None of what I produce this way is mission-critical or for commercial use. My current hobby project, still in progress, is a Japanese-English dictionary:

      https://github.com/tkgally/je-dict-1

      https://www.tkgje.jp/

      • By defatigable 2026-01-21 7:05 · 1 reply

        Great idea! That's actually the very next improvement I was planning on making to my coding flow: building a sub agent that is purely designed to study the codebase and create a structured implementation plan. Every large project I work on has the same basic initial steps (study the codebase, discuss the plan with me, etc) so it makes sense to formalize this in an agent I specialize for the purpose.

        • By marcus_holmes 2026-01-21 8:08 · 1 reply

          Is it just me, or does every post starting with "Great Idea!" or "Great point!" or "You're so right!" or similar just sound like an LLM is posting?

          Or is this a new human linguistic tic that is being caused by prolonged LLM usage?

          Or is it just me?

          • By defatigable 2026-01-21 8:20 · 1 reply

            :-) I feel you. Perhaps I should have ended my post with "Would you like me to construct a good prompt for your planning agent?" to really drive us into the uncanny valley?

            (My writing style is very dry and to the point, you may have noticed. I looked at my post and thought, "Huh, I should try and emotionally engage with this poster, we seem like we're having a shared experience." And so I figured, heck, I'll throw in an enthusiastic interjection. When I was in college, my friends told me I had "bonsai emotions" and I suppose that still comes through in my writing style...)

            • By marcus_holmes 2026-01-22 0:13

              Excellent reply :) And yes, maybe that's it, that the LLM emotion feels forced so any forced emotion now feels like an LLM wrote it.

    • By TechDebtDevin 2026-01-21 4:39

      [dead]

    • By solaris2007 2026-01-21 5:40 · 3 replies

      [flagged]

      • By djmips 2026-01-21 8:06

        "Be kind. Don't be snarky. Converse curiously; don't cross-examine. Edit out swipes"

      • By molteanu 2026-01-21 6:31 · 2 replies

        That's a very good point.

        The OP is "quite weak at JavaScript," but their AI "vastly improved the quality of the extension." Like, my dude, how can you tell? Does the code look polished, does it look smart, do the tests pass, or what?! How can you come forward and be the judge of something you're not an expert in?

        I mean, at this point, I'm beginning to be skeptical about half the content posted online. Anybody can come up with any damn story and make it credible. Just the other day I found out about Reddit engagement bots, and I've seen some in the wild myself.

        I'm waiting for the internet bubble to burst already so we can all go back to our normal lives, which we left 20 or so years ago.

        • By defatigable 2026-01-21 6:54

          How can I tell? Yes, the code looks quite a bit more polished. I'm not expert enough in JS to, e.g., know the cleanest method to inspect and modify the DOM, but I can look at code that does and tell if the approach it's using is sensible or not. Surely you've had the experience of a domain where you can evaluate the quality of the end product, even if you can't create a high quality product on your own?

          Concretely, in this case I'd used jQuery listeners to watch for DOM updates. Antigravity rewrote the code to avoid the jQuery dependency entirely, using native MutationObservers instead. The code is sensible, and it's noticeably more performant than the approach I crafted by hand. Antigravity also let me easily add a number of new features to my extension that I would have found tricky to implement by hand, and the UI looks quite a bit nicer than before I used AI tools to update it. Would these enhancements have been hard for an expert in Chrome extensions? Probably not. But I'm not that expert, and AI coding tools allowed me to do them.
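          For readers who haven't used the API, the pattern described above looks roughly like this (an illustrative sketch, not the extension's actual code; `watchContainer` and `onNewItems` are invented names):

```javascript
// Pure helper: pull the element nodes out of a batch of mutation records.
// Kept separate from the DOM wiring so it is easy to unit-test.
function addedElements(records) {
  const out = [];
  for (const record of records) {
    for (const node of record.addedNodes || []) {
      if (node.nodeType === 1) out.push(node); // 1 = ELEMENT_NODE
    }
  }
  return out;
}

// DOM wiring: the callback fires once per batch of DOM changes rather
// than once per event, which is part of why this tends to outperform
// jQuery handlers or polling.
function watchContainer(container, onNewItems) {
  const observer = new MutationObserver((records) => {
    const added = addedElements(records);
    if (added.length > 0) onNewItems(added);
  });
  observer.observe(container, { childList: true, subtree: true });
  return observer; // call observer.disconnect() to stop watching
}
```

          In a real extension you'd call `watchContainer(document.body, handleItems)` once at startup; with `childList` and `subtree` set, the observer reports nodes added anywhere under the container, with no jQuery dependency at all.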

          That was not actually the main thrust of my post; it's just a nice side benefit I've experienced. In the main domain where I use coding tools, at work, I program in languages where I'm quite a bit more proficient (Golang/Python). There, the quality of the code the AI tools generate is not better than what I write by hand, and the initial revisions are generally worse. But they're produced quite a bit faster than I can write them, and if I iterate with the tools I can get to implementations that are as good as my own in a lot less time.

          I understand the bias towards skepticism. I have no particular dog in this fight, it doesn't bother me if you don't use these tools. But OP asked for peoples' experiences so I thought I'd share.

        • By achierius 2026-01-21 6:39 · 2 replies

          JavaScript isn't the only programming language around. I'm not the strongest with JS either, but I can figure it out as necessary -- knowing C/C++/Java/whatever means you can still grok "this looks better than that" in most cases.

          • By defatigable 2026-01-21 7:03

            Yep. I have plenty of experience in languages that use C-style syntax, enough to easily understand code written in other languages that occur nearby in the syntactical family tree. I'm not steeped in JS enough to know the weird gotchas of the type system, or know the standard library well, etc. But I can read the code fine.

            If I'd asked an AI coding tool to write something up for me in Haskell, I would have no idea if it had done a good job.

          • By esailija 2026-01-21 9:03 · 1 reply

            I don't think so. Imagine it were the other way around: someone saying they knew JS but were weak at C/C++/Java.

            • By defatigable 2026-01-21 14:48

              This doesn't sound right to me. If someone who was an expert in JS looked at a relatively simple C++ program, I think they could tell reasonably well whether the quality of the code was good or not. They wouldn't be able to, e.g., detect bugs from default-value initialization, memory leaks, etc. But as long as the code didn't do any crazy templating stuff, they'd be able to analyze it at a rough "this algorithm seems sensible" level.

              Analogously, I'm quite proficient at C++, and I can easily look at a small JS program and tell if it's sensible. But if you give me even a simple React app, I wouldn't be able to understand it without a lot of effort (I've had this experience...)

              I agree with your broad point: C/C++/Java are certainly much more complex than JS, and I would expect someone expert in them to have a much easier time picking up JS than the reverse. But given the very high overlap in syntax among the four, I think anyone who's proficient in one can grok the basics of the others.

      • By defatigable 2026-01-21 6:38 · 1 reply

        I've never had a job where JavaScript was the primary language (so far it's been C++/Java/Golang). The JS Chrome extension is a fun side project. Using Augment in a work context, I primarily use it for Golang and Python code, languages where I'm pretty proficient but where AI tools still give me a decent efficiency boost.

        I understand the emotional satisfaction of letting loose an easy snarky comment, of course, but you missed the mark I'm afraid.

        • By solaris2007 2026-01-21 10:43 · 1 reply

          [flagged]

          • By christophilus 2026-01-21 11:42 · 1 reply

            > If you are any good with those four languages, you are leagues ahead of anyone who does Javascript full time.

            That is a priggish statement, and comes across as ignorant.

            I've been paid to program in many different languages over the years. TypeScript is what I choose for most tasks these days. I haven't noticed any real difference between my past C#, C++, C, Java, Ruby, etc. programming peers and my current JavaScript ones.

            • By solaris2007 2026-01-21 12:22

              > That is a priggish statement

              A cursory glance at the definition of "prig" shows that what I wrote there is categorically not that. You should at least try to look up that word, and if you look it up and still don't get it, then what you have is a reading comprehension issue.

              > Typescript is what I choose for most tasks these days.

              So you're smart on this point, at least. Cantrill said it really well: TypeScript brought "fresh water" to JavaScript.

              > haven’t noticed any real difference between my past C#, C++, C, Java, Ruby, etc programming peers and my current JavaScript ones.

              You might still be on their level. I see that you didn't mention Rust, or at least Golang. Given the totality of your responses, you're certainly not writing any safe C (not ever).

HackerNews