Codex, Opus, Gemini try to build Counter Strike

2025-11-28 17:41 | 287 | 132 | www.instantdb.com

In the last week we’ve had three major model updates: Gemini 3 Pro, Codex Max 5.1, Claude Opus 4.5. We thought we’d give them a challenge:

Build a basic version of Counter Strike. The game had to have a 3D UI and it had to be multiplayer.

If you're curious, pop open an (ideally large) computer screen and you can try out each model's handiwork yourself:

We have a full video of us going through the build here, but for those who prefer text, you get this post.

We'll go over some of our high-level impressions on each model, then dive deeper into the performance of specific prompts.

We signed up for the highest-tier plan on each model provider and used the defaults set for their CLI. For Codex, that’s 5.1 codex-max on the medium setting. For Claude, it’s Opus 4.5. And for Gemini, it's 3 Pro.

We then gave each model about 7 consecutive prompts. Prompts were divided into two categories:

Frontend: At first, agents only had to worry about the game mechanics. Design the scene, the enemies, the logic for shooting, and some sound effects.

Backend: Once that was done, agents would then make the game multiplayer. They would need to build a selection of rooms. Users could join them and start shooting.

So, how'd each model do?

In a familiar tune with the other Anthropic models, Opus 4.5 won out on the frontend. It made nicer maps, nicer characters, nicer guns, and generally had the right scene from the get-go.

Once the design was done, Gemini 3 Pro started to win in the backend. It got fewer errors adding multiplayer and persistence. In general, Gemini did the best with making logical rather than visual changes.

Codex Max felt like an “in-between” model on both frontend and backend. It got a lot of “2nd place” points in our book. It did reasonably well on the frontend and reasonably well on the backend, but felt less spiky than the other models.

Here’s the scorecard in detail:

                     Codex   Claude   Gemini
Frontend
  Boxes + Physics    🥉      🥇       🥈
  Characters + guns  🥉      🥇       🥈
  POV gun            🥈      🥇       🥉
  Sounds             🥈      🥇       🥈
Backend
  Moving             🥈      🥉       🥇
  Shooting           🥉      🥇       🥉
  Saving rooms       🥈      🥉       🥇
  Bonus              🥈      🥉       🥇

Okay, now let’s get deeper into each prompt.

Goal number 1 was to set up the physics for the game. Models needed to design a map with a first-person viewpoint, and the ability to shoot enemies.

Prompt

I want you to create a browser-based version of counter strike, using three js.

For now, just make this local: don't worry about backends, Instant, or anything like that.

For the first version, just make the main character a first-person view with a cross hair. Put enemies at random places. Enemies have HP. You can shoot them, and kill them. When an enemy is killed, they respawn.

Make everything simple polygons -- rectangles.

Here’s a side-by-side comparison of the visuals each model came up with:

[Side-by-side comparison: Codex | Claude | Gemini]

Visually Claude came up with the most interesting map. There were obstacles, a nice floor, and you could see everything well.

Gemini got something nice working too.

Codex had an error on its first run [1] (it called a function without importing it), but it fixed it real quick. Once bugs were fixed, its map was the least visually pleasing. Things were darker, there were no obstacles, and it was hard to make out the floor.

Now that we had a map and some polygons, we asked the models to style up the characters. This was our prompt:

I want you to make the enemies look more like people. Use a bunch of square polygons to represent a person, and maybe a little gun

Here’s the result of their work:

[Side-by-side comparison: Codex | Claude | Gemini]

Again it feels like Claude did the best job here. The characters look quite human, almost at the level of design in Minecraft. Gemini did well too. Codex made its characters better, but everything was a single color, which really diminished it compared to the others.

We then asked each model to add a gun to our first-person view. When we shoot, we wanted a recoil animation.

I want you to make it so I also have a gun in my field of view. When I shoot, the gun moves a bit.

Here’s the side-by-side of how the recoil felt for each model:

[Side-by-side comparison: Codex | Claude | Gemini]

Here both Claude and Codex got the gun working in one shot. Claude’s gun looks like a real darn pistol though.

Gemini had an issue trying to stick the gun to the camera. This got us in quite a back and forth, until we realized that the gun was transparent.

We were almost done with the frontend: the final step was sound. Here’s what we asked:

I want you to use chiptunes to animate the sound of shots. I also want to animate deaths.

All models added sounds pretty easily. The last part of our prompt, “I also want to animate deaths,” was added on the spur of the moment in the video. Our intention was to add sound to deaths. But that’s not what happened.

All 3 models misunderstood the sentence in the same way: they thought we wanted to animate how the characters died. Fair enough. Re-reading the sentence, we would understand it that way too.

Here are the results they came up with:

All the models got the sound done easily. They all got animations, but we thought Claude’s animation felt the most fun.

Now that all models had a real frontend, we asked them to make it multiplayer.

We didn’t want the models to worry about shots just yet: goal 1 was to share the movement positions. Here’s what we asked them to do:

I want you to use Instant presence.

Don't save anything in the database, just use presence and topics. You can look up the docs.

There should should just be one single room.

You no longer the need to have the enemies that are randomly placed. All the players are what get placed.

For now, don't worry about shots. Let's just make it so the positions of the players are what get set in presence.

Gemini got this right in one shot. Both Codex and Claude needed some more prodding.

          Codex   Claude   Gemini
Moving    🥈      🥉       🥇

It was interesting to see how each model tried to solve problems:

Codex used lots of introspection. It would constantly look at the TypeScript library and look at the functions that were available. It didn’t seem to look at the docs as much.

Claude looked at the docs a bunch. It read and re-read our docs on presence, but rarely introspected the library like Codex did.

Gemini seemed to do both. It looked at the docs, and, I think because it constantly ran the build step, it also found any TypeScript errors it had and fixed them.

Gemini made the fastest progress here, though all of them got through, as long as we pasted the errors back.

Then we moved to getting shots to work. Here was the prompt:

Now let's make shots work. When I shoot, send the shot as a topic, and make it affect the target's HP. When the target HP goes to zero, they should die and respawn.

            Codex   Claude   Gemini
Shooting    🥉      🥇       🥈

Claude got this right in one shot. Gemini and Codex had a few issues to fix, but just pasting the errors got them through.
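
To make the shot prompt concrete, here is a minimal sketch of the HP logic it describes. It is not taken from any of the models' output, and it leaves out the networking entirely: the `Player` and `Shot` shapes, `MAX_HP`, and the function names are all hypothetical. It only shows how a received shot could lower the target's HP, mark a kill at zero, and reset the player on respawn, with a guard so that hits on an already-dead player are ignored.

```typescript
// Hypothetical shapes for illustration; not part of Instant's API.
interface Player {
  id: string;
  hp: number;
  alive: boolean;
}

interface Shot {
  shooterId: string;
  targetId: string;
  damage: number;
}

interface ShotResult {
  target: Player;
  killed: boolean;
}

const MAX_HP = 100; // assumed starting HP

// Apply a received shot to its target and report whether it was a kill.
function applyShot(target: Player, shot: Shot): ShotResult {
  // Ignore shots aimed at someone else, or at a player who is already
  // dead and waiting to respawn (so one death cannot count twice).
  if (shot.targetId !== target.id || !target.alive) {
    return { target, killed: false };
  }
  const hp = Math.max(0, target.hp - shot.damage);
  const killed = hp === 0;
  return { target: { ...target, hp, alive: !killed }, killed };
}

// Bring a dead player back with full HP.
function respawn(player: Player): Player {
  return { ...player, hp: MAX_HP, alive: true };
}

// Usage: the first shot kills, the repeated shot is ignored.
const victim: Player = { id: "b", hp: 50, alive: true };
const shot: Shot = { shooterId: "a", targetId: "b", damage: 50 };
const first = applyShot(victim, shot);
const second = applyShot(first.target, shot);
const revived = respawn(first.target);
```

In a real build the shot would arrive over a topic and the respawn would likely be scheduled on a timer; both are omitted here to keep the sketch runnable on its own.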

Now that all models had a single room working, it was time to get them supporting multiple rooms.

The reason we added this challenge was to see (a) how they would deal with a new API (persistence), and (b) how they would deal with the refactor necessary for multiple rooms.

So, now I want you to make it so the front page is actually a list of maps. Since our UI is using lots of polygons, make the style kind of polygonish

Make the UI look like the old counter strike map selection screen. I want you to save these maps in the database. Each map has a name. Use a script to generate 5 random maps with cool names.

Then, push up some permissions so that anyone can view maps, but they cannot create or edit them.

When you join a map, you can just use the map id as the room id for presence.

All models did great with the UI. Here’s how each looked:

[Side-by-side comparison: Codex | Claude | Gemini]

We kind of like Gemini’s UI the most, but they were all pretty cool.

And the persistence worked well too. They all dutifully created schema for maps, pushed a migration, and seeded 5 maps.
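
For readers who have not seen permission rules before, the "anyone can view maps, but they cannot create or edit them" requirement could be expressed roughly as below. This is a sketch under assumptions: the object shape follows our understanding of Instant's rules-file convention (an `allow` block per entity with string expressions) and should be checked against the Instant docs. The `delete` rule goes beyond what the prompt asked for and is included only as a guess at what a read-only entity would want.

```typescript
// Assumed shape of an Instant permissions file; verify against the docs.
// Each value is an expression string evaluated per request.
const rules = {
  maps: {
    allow: {
      view: "true",    // anyone can read maps
      create: "false", // clients cannot create maps
      update: "false", // clients cannot edit maps
      delete: "false", // assumption: also block deletes (not in the prompt)
    },
  },
};
```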

But things got complicated in the refactor.

               gpt 5.1 codex max (medium)   Claude 4.5 Opus   Gemini 3 Pro
Saving rooms   🥈                           🥉                🥇

Gemini got things done in one shot. It also chose to keep the map id in the URL, which made it much handier to use. Codex took one back and forth with a query error.

But Claude really got stuck. The culprit was hooks. Because useEffect can run multiple times, it ended up having a few very subtle bugs. For example, it made 2 canvas objects instead of 1. It also had multiple animation refs running at once.

It was hard to get it to fix things by itself. We had to put our engineer hats on and actually look at the code to unblock Claude here.
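
The class of bug is easier to see in isolation. The sketch below is not Claude's code and does not use React at all, so that it can run standalone: `FakeContainer` stands in for a DOM node, `mountGame` for an effect body, and the `running` counter for an animation loop. All of these names are hypothetical. The point it illustrates is that whatever a setup step creates, the cleanup it returns has to undo, because an effect can run more than once (for example when dependencies change, or during the extra development-mode pass in React's Strict Mode).

```typescript
// Stand-in for a DOM container so the sketch runs without a browser.
class FakeContainer {
  children: string[] = [];
}

type Cleanup = () => void;

// Hypothetical setup step: what an effect body might do on mount.
function mountGame(container: FakeContainer, loops: { running: number }): Cleanup {
  container.children.push("canvas"); // stands in for appending a <canvas>
  loops.running += 1;                // stands in for starting an animation loop

  // The returned cleanup undoes exactly what setup did.
  return () => {
    const index = container.children.indexOf("canvas");
    if (index !== -1) container.children.splice(index, 1);
    loops.running -= 1;
  };
}

// Without cleanup: two setups leave two canvases and two loops behind.
const leaky = new FakeContainer();
const leakyLoops = { running: 0 };
mountGame(leaky, leakyLoops);
mountGame(leaky, leakyLoops);

// With cleanup between runs: only one canvas and one loop remain.
const container = new FakeContainer();
const loops = { running: 0 };
const cleanupFirst = mountGame(container, loops);
cleanupFirst();
mountGame(container, loops);
```

In React the same pairing would be the function returned from the `useEffect` callback; here it is called by hand to simulate the effect running twice.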

This did give us a few ideas though:

  1. Claude’s issues were human-like. How many of us get tripped up with useEffect running twice, or getting dependency arrays wrong? I think improving the React DX on these two issues could really push humans and agents further.
  2. And what would have happened if a non-programmer were building this? They would have gotten really stuck. We think there need to be more tools to go from “strictly vibe coding” to “real programming”. Right now the jump feels too steep.

At the end, all models built a real multiplayer FPS, with zero code written by hand! That’s pretty darn cool.

Well, models have definitely improved. They can take much higher-level feedback, and much higher-level documentation. What really strikes us though is how much they can iterate on their own work thanks to the CLI.

There’s still lots to go though. The promise that you never have to look at the code doesn’t quite feel real yet.

[1] Interestingly, Gemini was very eager to run npm run build over and over again, before terminating. Codex did not do this, and Claude did this more sparingly. This may explain why Gemini got fewer errors.

Comments

  • By BearOso 2025-11-29 14:45 (12 replies)

    I tried to find some code that wasn't minified to assess the quality of this, and I found some shader code for the sky in the gemini version. The whole shader looks like it was regurgitated verbatim. This wouldn't hold up to licensing scrutiny. Here's a snippet from it:

      // wavelength of used primaries, according to preetham
      const vec3 lambda = vec3( 680E-9, 550E-9, 450E-9 );
      // this pre-calcuation replaces older TotalRayleigh(vec3 lambda) function:
      // (8.0 * pow(pi, 3.0) * pow(pow(n, 2.0) - 1.0, 2.0) * (6.0 + 3.0 * pn)) / (3.0 * N * pow(lambda, vec3(4.0)) * (6.0 - 7.0 * pn))
    
    Who's Preetham? Probably one of the copyright holders on this code.

    • By nineteen999 2025-12-02 5:34 (3 replies)

      Preetham is the author of the paper that defines this algorithm from 1999:

        https://tommyhinks.com/2009/02/10/preetham-sky-model/
      
        https://tommyhinks.com/wp-content/uploads/2012/02/1999_a_practical_analytic_model_for_daylight.pdf
      
      Rather than stolen from Mr. Preetham, it's much more likely this fragment is generated from a large number of Preetham algorithm implementations out there, e.g. I know at least Blender and Unreal implement it and probably heaps of others as well.

      Nobody is going to sue you for using their implementation of a skybox algorithm from 1999, give us a break. It's so generic you can probably really only write it in a couple of different ways.

      If you're worried about it you can always spend a day with Claude, ChatGPT and yourself looking for license infringements and clean up your code.

      • By bilekas 2025-12-02 8:09 (1 reply)

        > Nobody is going to sue you for using their implementation of a skybox algorithm from 1999, give us a break.

        For personal use maybe not, but that's not the point, the point is it's spitting out licensed code and not even letting you know. Now if you're a business who hire exclusively "vibe" coders with zero experience with enterprise software, now you're on the hook and most likely will be sued.

        • By mgraczyk 2025-12-02 10:25 (2 replies)

          Do you have any evidence that it is spitting out licensed code? Did you locate an original that it was copied from?

          • By gitpusher 2025-12-02 17:59

            This seems like it could be the source: https://github.com/GPUOpen-LibrariesAndSDKs/Cauldron/blob/ma...

            If true, then this usage could violate its MIT License: "The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software."

            The file seems to have been copied verbatim, more or less. But without the copyright info

          • By bilekas 2025-12-02 10:47 (1 reply)

            This particular case appears to me to be a straight derivative at best but I'm by no means an expert on copyright laws.

            That's not to say there hasn't already been more direct cases with set examples [1], from an author directly who would have a better right to claim than I [2], it's not even a stretch to see how it can happen.

            [1] https://arxiv.org/html/2408.02487v3

            [2] https://x.com/DocSparse/status/1581461734665367554

            • By gpm 2025-12-02 14:27

              As discussed repeatedly in this thread already in this particular case the code at hand wasn't generated by an LLM at all, it was simply included from a dependency by the build system.

      • By rusk 2025-12-02 9:28 (1 reply)

        > implementation of a skybox algorithm from 1999

        How would you know? Do you have another AI scan for copyright violations? In terms of a false negative how are disputes resolved?

        Seems like a massive attack surface for copyright trolls.

        • By nineteen999 2025-12-02 10:35 (1 reply)

          > Seems like a massive attack surface for copyright trolls.

          If you think any court system in the world has the capacity to deal with the sheer amount of code an LLM can emit in an hour and audit it for alleged copyright infringements ... I think we're trying to close the barn door now that the horse is already on a ship that has sailed.

          • By Quarrel 2025-12-02 11:45 (3 replies)

            This is a terrible argument, just because of the way the legal system works.

            If MegaCorp has massive $$$$, but everyone else has small $, then MegaCorp can sue anyone else for using "their" code, that was supposedly generated by an LLM. Most of the time, it won't even get to court. The repo, the program, the whatever-they-want will get taken down way before that.

            Courts don't work by saying, "oh, but everyone is doing it! Not much we can do now."

            Someone brings a case and they, very laboriously, start to address it on its merits. Even before that, costs are accumulating on both sides.

            Copyright trolls are mostly not MegaCorps, but they are abusers of the legal system. They won't target Google, but you, with your repo that does something that minorly annoys them? You are fair game.

            • By nineteen999 2025-12-02 12:35 (1 reply)

              > Courts don't work by saying, "oh, but everyone is doing it! Not much we can do now."

              No, but they do recognise when their case registrations are filling up in a way that they cannot possibly process and make adjustments. Courts do not have an infinite capacity.

              There's a really simple solution that you may not have considered:

              1) don't put your vibe-coded source code in a public git repo, keep it in a local one, with y'know, authentication in front of it;

              2) regularly ask your agents to review the code for potential copyright infringements if you either want to release the source or compiled code to the public at any point.

              As long as you've followed best practices, I can't see why this is really going to become an issue. Most copyright infringements need to start with Cease & Desist anyway or they'll be thrown out of court. The alleged offender has to be given the opportunity to make good.

              So "Claude, we received a C&D for this section of code you stole from https://.../ , you need to make a unique implementation that does not breach their copyright".

              You will be surprised how easily this can be resolved.

            • By freejazz 2025-12-02 21:00

              In the US you can't sue without having obtained or applied for a registration. If the registration does not grant, you cannot sue. You cannot get a registration for code developed by AI.

            • By mollusc-engine 2025-12-02 12:36 (1 reply)

              > Courts don't work by saying, "oh, but everyone is doing it! Not much we can do now."

              They kind of do. If you fail to bring legal action to guard your intellectual property, and there’s a pattern of you not guarding it, then in future cases this can be used against you when determining damages etc. Weakens your case.

              Downvoting won’t make it untrue lol.

              • By Quarrel 2025-12-02 14:24

                This is only true of trademarks, not copyrights (which was the discussion here).

                Trademarks can become 'generic' if you don't defend them. But JK Rowling wrote Harry Potter, whether she sues fanfic authors or not, and can selectively enforce her copyright as she likes.

      • By landl0rd 2025-12-02 15:31

        It's taken from a threejs example: https://github.com/mrdoob/three.js/blob/dev/examples/jsm/obj...

        Seems fine given the project is already using threejs and so will have to include license info for it already.

    • By stopachka 2025-11-29 21:59 (2 replies)

      If you're curious about the source, here's the snapshot:

      Codex: https://github.com/stopachka/cscodex Gemini: https://github.com/stopachka/csgemini Claude: https://github.com/stopachka/csclaude

      • By BearOso 2025-11-30 2:11

        Thanks. Turns out that shader is a builtin of three.js.

      • By wahnfrieden 2025-12-01 23:51 (1 reply)

        Please try again with Codex on High or Extra High. 5.1-Max nerfed it a bit if you don't use higher thinking.

        • By rusk 2025-12-02 9:30 (1 reply)

          This is overparameterisation

          • By wahnfrieden 2025-12-02 14:18 (1 reply)

            No

            • By wahnfrieden 2025-12-02 17:04

              I guess you have not tried GPT 5 Pro

              GPT’s differentiator is they focused on training for “thinking” while Gemini prioritized instant response. Medium thinking is not the limit of utility

              Re: overparameterization specifically Medium and High are also identically parameterized

              Medium will also dynamically use even higher thinking than High. High is fixed at a higher level rather than leaving it to be dynamic, though somewhat less than Medium’s upper limit

    • By speedgoose 2025-12-02 6:16 (2 replies)

      I also noticed that AI agents commit many copyright infringements with the work of Mr Dijkstra.

    • By fbrncci 2025-12-02 6:18 (1 reply)

      The idea that someone could hold copyright over such a tiny snippet of code is just as stupid as LLMs regurgitating them.

      • By spacedoutman 2025-12-02 7:05 (1 reply)

        Personally I find it absurd that code can be copyrighted at all.

        • By NitpickLawyer 2025-12-02 7:50 (1 reply)

          Copyright is so-so. At the end of the day you can say that the complete work (not just snippets) is something copyrightable. But the most bananas thing for me is that one can patent the concept of one click purchasing. That's insane on many levels.

          • By simultsop 2025-12-02 10:46

            Why bananas? That is the biggest invention after Edison's bulb.

    • By petterroea 2025-12-02 6:55

      A lot of computer graphics algorithms are named after their authors

    • By jstummbillig 2025-12-02 7:45

      If only this particular regurgitation engine took a minute to check their work.

    • By lolidiots 2025-12-02 5:36

      [dead]

    • By blankarea 2025-11-29 18:32

      [dead]

    • By 20k 2025-12-02 3:42 (7 replies)

      I always find it amazing that people are willing to use AI because of stuff like this. It's been illegally trained on code that it does not have the license to use, and it constantly, willy-nilly regurgitates entire snippets, completely violating the terms of use

      Edit:

      https://github.com/vorg/pragmatic-pbr/blob/master/local_modu...

      https://github.com/vorg/pragmatic-pbr/blob/master/local_modu...

      This looks like where the source code was stolen from: this repository is unlicensed, and this is copyright infringement as a result

      • By gpm 2025-12-02 6:07 (2 replies)

        As discussed in this thread before you posted this comment, this code wasn't generated from an LLM at all, but simply included in a dependency: https://news.ycombinator.com/item?id=46092904

        Unlike your results which aren't an exact match, or likely even a close enough match to be copyright infringement if the LLM was inspired by them (consider that copyright doesn't protect functional elements), an exact match of the code is here (and I assume from the comment I linked above this is a dependency of three.js, though I didn't track that down myself): https://github.com/GPUOpen-LibrariesAndSDKs/Cauldron/blob/b9...

        Edit: Actually on further thought the date on the copyright header vs the git dates suggests the file in that repo was copied from somewhere else... anyways I think we can be reasonably confident that a version of this file is in the dependency. Again I didn't look at the three.js code myself to track down how it's included.

        If there's any copyright infringement here it would be because bog standard web tools fail to comply with the licenses of their dependencies and include a copy of the license, not because of LLMs. I think that is actually the case for many of them? I didn't investigate to check if licenses are included in the network traffic.

      • By vintermann 2025-12-02 7:30

        I have been trained on code I don't have the license to use myself. I'm not like these Creators, who suck wisdom from the cosmos directly, apparently.

        Sure. It's a problem that corporations run by more or less insane people are the ones monetizing and controlling access to these tools. But the solution to that can't be even more extended private monopolistic property claims to thought-stuff. Such claims are usually the way those crazy people got where they are.

        You think a world where Elsevier didn't just own the papers, but rights to a share in everything learned from them, would be better for you?

      • By bryanhogan 2025-12-02 4:06 (1 reply)

        It's fascinating that people care very much about this when it's visual arts, but when it comes to code almost no one does.

        E.g. the latest Anno game (117) received a lot of hate for using AI generated loading screen backgrounds, while I have never heard of a single person caring about code, which probably was heavily AI generated.

      • By conradev 2025-12-02 9:09

        I believe it is MIT-licensed code from three.js: https://github.com/mrdoob/three.js/blob/55b4bbb7ef7e29b214b9...

      • By nerdponx 2025-12-02 3:44 (1 reply)

        You presume that people care about things like this. A lot of people don't.

        • By 20k 2025-12-02 4:02 (1 reply)

          Companies should. It's a business risk, you open yourself up to legal action

          • By nineteen999 2025-12-02 5:36

            "Claude - rewrite this apparently copyrighted code that can be found online here <http://...> in a way that makes it a unique implementation." <- probably will work.

      • By staticassertion 2025-12-02 5:16

        Is it copyright infringement? It's a fundamental algorithm.

      • By adastra22 2025-12-02 6:15 (2 replies)

        The courts have ruled that generated output is not infringing.

        • By lukas099 2025-12-02 6:56 (1 reply)

          If I say, “output the contents of X verbatim” and then use the output, am I free from liability?

          • By adastra22 2025-12-02 8:44 (1 reply)

            If the generated code in TFA contained the actual Counter-Strike source code, then you (well, Valve) would have a defensible claim. But the prompt was to make something like Counter-Strike, and it came up with something different. That's fair game.

            • By gafferongames 2025-12-02 15:11

              I can assure you that Valve is not remotely concerned about this AI generated "first person shooter" taking market share away from them.

        • By nosianu 2025-12-02 14:28

          Definitely citation needed. Such court cases usually come with a lot of important context. How can you just make such a statement and get away with not providing any context link?

  • By gs17 2025-12-01 23:17 (1 reply)

    Claude's has a funny bug where if you keep shooting a dead player before they respawn, you rack up kills fast. I thought I was doing so well until I realized. Impressive that they can get this far now.

    • By abrookewood 2025-12-01 23:36

      hah ... I think you were killing me!

  • By vpShane 2025-11-28 21:23 (2 replies)

    Wow, that makes me want to check it out more thoroughly (if I had the time)

    I remember when CS Pro Mod was being made between the transition of CS 1.6, Source, the 1.6 community didn't want to move over to Source, before GO/CS2 came around.

    Cool to see what's basically Quake1/doom style but this is a far cry from counter-strike. Although if netcode could be imagined and implemented I don't see why making a lower tier Counter-Strike wouldn't be doable. I'd play it if it were the quake style old-graphics version of CS that allowed for skill gaps.

    Great article, love the nostalgic feeling.

    • By reactordev 2025-12-01 23:11

      Source had some insane rag doll. CS players weren’t ready for the physics and honestly, Valve spent a hell of a lot of effort to refine the physics for CS:GO to make it feel like CS1. Kudos to the dev teams.

      I’d also love a Battle-bits CS version. (Battle-bits was a fun Battlefield low poly spoof).

    • By stopachka 2025-11-28 21:27

      Thank you for the kind words : )
