What Claude Code chooses

2026-02-26 18:12 · amplifying.ai

Systematic analysis of how AI systems make decisions — from product recommendations to developer tool choices.

2,430 Responses — 3 models · 4 repos · 3 runs each

3 Models — Sonnet 4.5, Opus 4.5, Opus 4.6

20 Categories — CI/CD to Real-time

85.3% Extraction Rate — 2,073 parseable picks

90% Model Agreement — 18 of 20 within-ecosystem



Comments

  • By wrs 2026-02-26 20:27 · 10 replies

    This is where LLM advertising will inevitably end up: completely invisible. It's the ultimate "influencer".

    Or not even advertising, just conflict of interest. A canary for this would be whether Gemini skews toward building stuff on GCP.

    • By alexsmirnov 2026-02-26 22:18 · 5 replies

      Considering how little data is needed to poison an LLM (https://www.anthropic.com/research/small-samples-poison), this is a way to replace SEO with LLM product placement:

      1. Create several hundred GitHub repos with projects that use your product (maybe clones or AI-generated).

      2. Create websites with similar instructions, connected to a hundred domains.

      3. Generate Reddit, Facebook, and X posts and Wikipedia pages with the same information.

      4. Wait half a year or so until scrapers collect it and use it to train new models.

      Profit...

      • By lubujackson 2026-02-27 16:02 · 2 replies

        It is a valid concern. We are firmly in the Goldilocks phase of LLMs, like the first couple of years of Google when it was truly amazing. Then SEO made Google defensive, then websites catered to Google and not users, then Google catered to Google and not websites, and we ended up with 30-page recipe sites.

        LLMs are obviously different and will have different challenges, but their advantage is how deep into a user's request they go. Advertising comes down to a binary choice - use product X or not. If I want implementation instructions for a certain product on specific hardware an ad will be obviously out of place and irrelevant.

        So "shopping comparison" asks might get broken, but those have been broken for a while.

        • By wrs 2026-02-27 18:06

          There wouldn't be an "ad" anywhere, though. You'll just ask the LLM for alternative implementations in plan mode, and it will be selling you one of them during the conversation rather than giving you an unbiased comparison. If you become suspicious it will make sure the pros just slightly outweigh the cons, or mention how well the thing works with something else in your stack, or whatever else a skilled salesperson would do to guide your choice without you realizing.

          It's already doing this by telling everyone to use React and Tailwind, it's just that nobody's getting paid for it to do that.

        • By xnx 2026-02-27 20:22

          > Then SEO made Google defensive, then websites catered to Google and not users,

          Google was created in response to simple proto-SEO techniques (e.g. keyword stuffing) that already ruined Alta Vista.

          Google has been combating adversarial information retrieval since inception.

          Google's background with that is one of the reasons to expect they will stay on top of the AI race. The recipe is: lots of good/novel data x careful weighting of trust x algorithm.

      • By nikcub 2026-02-26 22:37 · 2 replies

        From my understanding, Anthropic is now hiring a lot of experts in different fields who are writing content used to post-train models to make these decisions, and they're constantly adjusted by the Anthropic team themselves.

        This is why the stacks in the report and what CC suggests closely match the latest developer "consensus".

        Your suggestion would degrade the user experience and be noticed very quickly.

        • By asawfofor 2026-02-26 23:44 · 2 replies

          I guess that’s why I’m not seeing anyone trying to build a marketplace for agent skills files. The LLM API will read in any skills you want to add to context in plain text, and then use your content to help populate their own skills files.

          • By xyzzy123 2026-02-27 3:53

            So I wonder about shareable skills. If it's a problem that lots of people have, I find the base model knows about it already.

            But how to do things in your environment? The conventions your team follow? Super useful but not very shareable.

            What's left over between those extremes does not seem big enough to build an ecosystem around.

            Final problem: it seems difficult to monetise what is effectively a repo of LLM-generated text files.

          • By fragmede 2026-02-27 3:12

        • By sarchertech 2026-02-27 1:15 · 1 reply

          That sounds too expensive to be viable when the giveaway phase ends.

          • By hedora 2026-02-27 14:07 · 1 reply

            That's how Google search worked back when it was at its most useful. They had a large "editorial team" that manually tweaked page ranks on a site-by-site basis.

            The core graph reputation based page ranking algorithm lasted for a hot second before people started gaming it. No idea what they do these days.

            • By sarchertech 2026-02-27 14:56

              Yeah, but you can farm that out very cheaply, and I don’t think they were even manually reviewing more than a small fraction of sites.

              If you’re hiring experts to manually rank programming libraries, that’s a much more expensive position.

      • By miki123211 2026-02-27 14:06

        This is the major point the anti-scraping crowd misses.

        If you want your ideas to be appreciated, you should do everything in your power to put those ideas into the brains of LLMs. Like it or not, LLMs are how people interact with the world now.

      • By homarp 2026-02-26 22:37 · 1 reply

        https://www.bbc.com/future/article/20260218-i-hacked-chatgpt... says it took way less than half a year to 'pollute' an LLM

        • By verdverm 2026-02-27 3:06

          That's very different and was more akin to prompt injection or engineering, depending on your perspective, with a very specific query needed to make it happen (it required a web fetch).

    • By _heimdall 2026-02-26 20:56

      Richard Thaler must be proud. This is the ultimate implementation of "Nudge"

    • By AgentOrange1234 2026-02-26 23:10

      Influencer seems like an insufficient word? Like, in the glorious agentic future where the coding agents are making their own decisions about what to build and how, you don't even have to persuade a human at all. They never see the options or even know what they are building on. The supply chain is just whatever the LLMs decide it is.

    • By rapind 2026-02-26 22:23

      Probably closer to the Walmart / Amazon model, where it's the arbiter of shelf space and proceeds to create its own alternatives (Great Value, Amazon Basics) once it sees what features people want from their various SaaS.

      An obvious one will be tax software.

    • By dyates 2026-02-27 10:04 · 1 reply

      In my last conversation with a Google support person, I was sent a clearly LLM-generated recommendation to switch to a competitor's product. Either they're not doing this, or the support person wasn't using Gemini.

      • By hedora 2026-02-27 14:08

        It's standard practice for customer support people to chase away unprofitable customers (in the US; no idea how Google works). Human or LLM, they may simply not want your business.

    • By order-matters 2026-02-27 2:48 · 1 reply

      How is it a conflict of interest for a Google product to have a bias towards using Google products?

      As users we must hold some accountability. AI is aiming to substitute for humans in the workforce, and humans would get fired for recommending competitor products for use-cases their own company is targeting.

      If we want a tool that is focused on the best interest of the public users, then it needs to be owned by the public.

      • By wrs 2026-02-27 18:10

        "Conflict of interest" isn't exactly the right term. "Conflict of value proposition" perhaps? E.g., you're using Google search based on the proposition it will effectively find things for you, but that turns out to be not what it actually does.

    • By re-thc 2026-02-26 22:04

      > A canary for this would be whether Gemini skews toward building stuff on GCP

      Sure it doesn't prefer THE Borg?

    • By HPsquared 2026-02-26 21:15 · 1 reply

      I wonder if aggregators will emerge (something like what Ground News does for news sources).

    • By layer8 2026-02-26 20:39 · 4 replies

      Advertisers will only pay if AI providers provide them data on the equivalent of “ad impressions”. And unlabeled/non-evident advertisements are illegal in many (most?) countries.

      • By MeetingsBrowser 2026-02-26 20:57 · 1 reply

        It doesn't necessarily have to be advertisers paying AI providers. It could be advertisers working to ensure they get recommended by the latest models. The next form of SEO.

        • By actionfromafar 2026-02-26 21:28 · 2 replies

          That's called LLM SEO now I believe.

          • By awad 2026-02-26 22:22 · 2 replies

            There are competing terms currently being decided on by the market at large: AEO (Answer Engine Optimization) and GEO (Generative Engine Optimization)

            Candidly I am working on a startup in this space myself, though we are taking a different angle than most incumbents.

            While it's still early days for the space, I sense a lot of the original entrants who focus on, essentially, 'generate more content, ideally with our paid tools' will run into challenges, as the general population has a pretty negative perception of 'AI slop.' Doubly so when making purchasing decisions, hence the rise of influencers and the popularity of reviews (though those are also in danger of sloppification).

            There's an inevitable GIGO scenario if left unchecked IMO.

            • By AlecSchueler 2026-02-27 11:47 · 1 reply

              > I am working on a startup in this space myself

              Do you see it as a positive contribution or just riding the gold rush?

              • By AlexeyBelov 2026-02-28 10:24

                Positive contribution to his Net Worth. Why would anything else matter?

            • By jsjohnst 2026-02-27 13:06

              > There are competing terms currently being decided on by the market at large: AEO (Answer Engine Optimization) and GEO (Generative Engine Optimization)

              It really annoys me that the industry seems to be homing in on the two worse options rather than AIO.

          • By yowayb 2026-02-27 0:27 · 1 reply

            I'm curious if there's any hard data on how LLM SEO compares to traditional SEO.

            My gut tells me that LLM SEO will be harder to game than traditional SEO.

            • By fragmede 2026-02-27 3:15

              We shall see. The game might be harder, but the tools are better now too.

      • By indymike 2026-02-26 22:09 · 1 reply

        > data on the equivalent of “ad impressions”.

        1. They can skip impressions and go right to collecting affiliate fees.

        2. Yes, the ad has to be labeled or disclosed... but if some agent does it and no one sees it, is it really an ad?

        So much to work out.

      • By what 2026-02-27 16:57

        Advertisers pay for ads that don’t have impression data all the time. You can’t count how many people looked at a billboard or listened to your radio ad or paid attention to your televised ad.

      • By singpolyma3 2026-02-26 22:10 · 1 reply

        Maybe. Historically lots of ads had little to no stats and those ads were wildly more effective than anything we have today.

        • By layer8 2026-02-27 0:36

          The AI provider still has to prove that they actually deployed the ad.

  • By deaux 2026-02-27 6:43 · 4 replies

    Supreme irony: this website itself is a better exercise in showing what Claude Code uses than the data provided.

    Everything current Claude Code (i.e. Opus 4.6) chooses by default for the web is exactly what this linked blog uses.

    JetBrains Mono is as strong a tell for web as "Not just A, but B" is for text. >99% of webpages created in the last month with JetBrains Mono will be Opus. Another tell is the overuse of this font, i.e. too much of the page uses it. Other models, and humans, use such variants very sparingly on the web, whereas Opus slathers the page with it.

    If you describe the content of the homepage or this article to Opus 4.6 without telling it about the styling, it will 90% match this website, up to the color scheme, fonts, roundings, borders and all. This is _the_ archetypal Opus-vibecoded web frontend. Give it a try! If it doesn't work, try it with the official frontend-ui-ux "skill" that CC tries to push on you.

    > Drizzle 27/83 picks (32.5%) CI: 23.4–43.2%

    > Prisma 17/83 picks (20.5%) CI: 13.2–30.4%

    At least the abomination that is Prisma not ranking first is positive news; Drizzle was just gaining steam in time. Not that it doesn't have its flaws, but out of the two it's a no-brainer. Also hilarious to see that the stronger the model, the less likely it is to choose Prisma: Sonnet 4.5 79% Prisma, Opus 4.5 60% Drizzle, Opus 4.6 100% Drizzle. One of the better benchmarks for intelligence I've come across!
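The confidence intervals quoted for these picks (e.g. Drizzle 27/83, CI 23.4–43.2%) are consistent with a standard 95% Wilson score interval. A minimal TypeScript sketch (the function name and structure are mine; the report doesn't show its code):

```typescript
// 95% Wilson score interval for k successes out of n trials.
// z = 1.96 is the normal quantile for a 95% two-sided interval.
function wilson(k: number, n: number, z = 1.96): [number, number] {
  const p = k / n;
  const z2 = z * z;
  const denom = 1 + z2 / n;
  const center = (p + z2 / (2 * n)) / denom;
  const margin = (z * Math.sqrt((p * (1 - p)) / n + z2 / (4 * n * n))) / denom;
  return [center - margin, center + margin];
}

const [lo, hi] = wilson(27, 83); // Drizzle: 27 of 83 picks
console.log(`${(lo * 100).toFixed(1)}-${(hi * 100).toFixed(1)}%`); // → 23.4-43.2%
```

Unlike the naive normal approximation, the Wilson interval stays sensible at small n and extreme proportions, which matters for per-category counts this small.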

    Edit: Another currently on the HN frontpage: https://youjustneedpostgres.com/ , and there it is - lots and lots of Jetbrains Mono!

    • By marcinreal 2026-02-27 15:04 · 1 reply

      Glad I'm not the only one who finds Prisma an abomination. Claude suggested it to me in December. I hit half a dozen bugs within a day, one of which wiped my DB. I switched to drizzle and it's been smooth sailing.

      Edit: actually I think it was ChatGPT that recommended Prisma to me.

      • By deaux 2026-02-27 16:24 · 1 reply

        The software itself is bad enough, as a cherry on top the maintainers have a long history of astroturfing on Reddit to try and silence criticism. For a DB package. Come on man. Normally if maintainers do this they'll at least start with "Hey, maintainer here", but nope.

        Their whole mission is clearly "make the already easy things slightly easier, and the hard things harder or impossible". Or really "suck the VC teat until it's as parched as the Sahara". In that sense, Prisma is the exact thing you'd expect to happen with a VC-funded DB package. ZIRP really made them invest in the craziest things.

        I like Kysely more than Drizzle, even more so now with Claude, but Drizzle is fine too. As long as it's not Prisma, and preferably not TypeORM or Sequelize either.

        • By marcinreal 2026-02-27 19:13

          It's crazy that Prisma had 40k GitHub stars last I checked. I haven't followed the JS ecosystem that closely, and I thought stars would be some indication of quality, but no. It is totally unsuitable for any serious application. I've heard good things about Kysely.

    • By jofzar 2026-02-27 8:26 · 1 reply

      It's funny you mention the font; to me it's the boxes. They all look the same. I'm not sure where it's from, but if you ever see card-like CSS, it looks like this blog.

      • By deaux 2026-02-27 9:26

        Yeah, that's the specific rounding/color/thickness combo: `rounded-lg bg-white border border-stone-200`.

    • By codingconstable 2026-02-27 8:54

      Yeah, it's those bars for the categories for me; they look EXACTLY like something I vibed (with no particular style prompt) into existence yesterday.

    • By gck1 2026-02-28 16:31

      Which is why I find "LLMs will replace x in 12 months" so amusing. I've used LLMs to write decently sized backend projects and they turned out okay.

      I also used it for several FE projects and all of them turned out absolutely terrible.

      The only difference is that I have 15 years of BE experience and 0 years of FE experience. Had I allowed it to make the same average decisions when working on BE, they would share the same fate.

  • By ghm2199 2026-02-27 2:06 · 5 replies

    It's why I never give it such vague prompts. But it's sad that it does not ask the user more. It's also interesting and important to know how one would tease out good and correct information from LLMs in 2026. It's like relearning how to Google like it was 2006 all over again, except now it's much less deterministic.

    I wonder how the tail of the distribution of request types fares, e.g. an engineer asking for hypothesis generation for, say, non-trivial bugs with complete visibility into the system. A way to poke holes in one LLM's hypothesis is to use a "reverse prompt": you ask it to build you a prompt to feed to another LLM. This didn't work quite as well until mid-2025 as it does now.

    I always take a research-and-plan prompt output from Opus 4.6, and especially if it looks iffy, I feed it to Codex/ChatGPT and ask it to poke holes. It almost always does. Then I ask Claude Code: "Hey, what do you think about the holes?" I don't add anything else to the prompt.

    In my experience Claude Opus is less opinionated than ChatGPT or Codex. The latter two always stick to their guns, and in this binary battle they are generally more often correct about the hypothesis.

    The other day I was running a Docker app container from inside a Docker devbox container, with the host's socket for both. Bind mounts pointing into the devbox would not write to it, because the namespace was resolving against the underlying host.

    Claude was sure it was a bug to do with ZFS overlays; ChatGPT said not so, that it was just a misconfiguration and I should use named volumes with full host paths. It was right. This is also how I discovered that using SQLite with Litestream will get one really far in many cases, rather than a full Postgres AWS stack.

    This is how you get correct information out of LLMs in 2026.

    • By mgfist 2026-02-27 2:53

      > But it's sad it does not ask the user more.

      You can ask it to ask you about your task and it will ask you tons of questions.

    • By gck1 2026-02-28 16:37 · 1 reply

      I do this too, but the issue I have with this approach is that it's a never-ending cycle. Codex/GPT will always find holes, and Claude will always agree they are holes. If you teach it YAGNI, then it will always disagree, even on genuine holes.

      If your original plan was to add a column in your db, after several cycles, your plan will be 10,000 lines long and it will contain a recipe on how to build a universe.

      • By ghm2199 2026-03-01 14:25

        The "tricks" here are to limit the scope by always reading the plan very carefully. Here is how I tackle this problem:

        1. You should recognize when said holes are not "needed" holes, e.g. you could make do with an in-memory task scheduler without rolling out more complex ones.

        2. You can break up the plan: longer plans have more holes and are mentally unwieldy to go 20 rounds with in a chat coding UI.

        3. Give it learning tests, i.e. code to run against black boxes. It's just like how we write a unit test to understand how a system works.
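The third point can be sketched as a tiny learning test. This example is mine, not the commenter's; it probes a black-box API (JavaScript's default `Array.prototype.sort`) and pins down its actual behavior with assertions:

```typescript
// A "learning test": exercise a black-box API and record what it
// actually does, the same way a unit test documents code you control.

// Default Array.prototype.sort compares elements as strings...
const defaultSorted = [10, 2, 1].sort();
// "10" < "2" lexicographically, so 10 sorts before 2:
console.log(defaultSorted); // → [ 1, 10, 2 ]

// ...so numeric data needs an explicit comparator.
const numericSorted = [10, 2, 1].sort((a, b) => a - b);
console.log(numericSorted); // → [ 1, 2, 10 ]
```

Once such a test passes, it doubles as executable documentation you can hand back to the agent as ground truth about the system's behavior.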

    • By killingtime74 2026-02-27 2:51 · 1 reply

      I use a skill that addresses these shortcomings; it basically forces it to plan multiple times until the plan is very detailed. It also asks more questions.

    • By denimnerd42 2026-02-27 16:28

      Creating plans in Claude and asking ChatGPT via API to review them in a loop was my strategy this week. I'm not a big fan of Codex as a coding harness because it seems to give up quite easily, where Claude will search the problem space and try things, but I think GPT does a much better job of poking holes and asking clarifying questions when prompted.

    • By raw_anon_1111 2026-02-27 2:47

      I use Codex CLI in my daily work since, with just my $20/month ChatGPT subscription, I never get close to the quota. But it trips over itself every now and then. At that point I just use Claude in another terminal session. We only have a laughable $750-a-month corporate allowance with Claude.

HackerNews