Show HN: One Human + One Agent = One Browser From Scratch in 20K LOC

2026-01-27 13:13 324153 emsh.cat

2026-01-27

Just for the fun of it, I thought I'd embark on a week-long quest to generate millions of tokens and millions of lines of source code to create one basic browser that can render HTML and CSS (no JS tho), and hopefully I could use this to receive even more VC investments.

But then I remembered that I have something even better: a human brain! It is usually better than any machine at coordinating and thinking through things, so let's see if we can hack something together, one human brain and one LLM agent brain!

[Video: Demonstration of one-agent-one-browser running with a bunch of different websites on Linux/X11]

The above might look like a simple .webm video, but it's actually a highly sophisticated and advanced browser that was super hard to build, encoded as pixels in a video file! Wowzers.

Day 1 - Starting out

For extra fun when building this, I set these requirements for myself and the agent:

I have three days to build it
Not a single 3rd party Rust library/dependency allowed
Allowed to use anything (commonly) provided out of the box on the OS it runs on
Should run on Windows, macOS and common Linux distributions
Should be able to render some websites, most importantly, my own blog and Hacker News, should be easy right?
The codebase must always compile and build
The codebase should be readable by a human, although code quality isn't the top concern

So with these things in mind, I set out on the journey to build a browser "from scratch". I started with something really basic: being able to just render "Hello World". Then came rendering some nested tags. I added the ability to take screenshots so the agent could use them, added specifications for HTML/CSS (which I think the agent never used :| ), and tried to nail down the requirements for the agent to use. I also started doing "regression" or "E2E" tests with the screenshotting feature, so we could compare against some baseline images and so on. Added the ability to click on links just for the fun of it.
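
To make the screenshot-based regression idea concrete, here is a minimal sketch of comparing a fresh render against a baseline image. It is not the code from the repository: the names `Comparison` and `compare_rgba`, the RGBA8 buffer layout and the per-channel `tolerance` are all assumptions made for illustration.

```rust
/// Result of comparing a fresh screenshot against a stored baseline.
/// (Hypothetical type, not taken from the project.)
#[derive(Debug, PartialEq)]
pub enum Comparison {
    /// Buffer sizes do not match the given dimensions, so no pixel diff is possible.
    SizeMismatch,
    /// Same dimensions; `differing` pixels had a channel off by more than the tolerance.
    Compared { differing: usize, total: usize },
}

/// Compare two RGBA8 buffers (4 bytes per pixel, row-major).
/// A pixel counts as different if any channel differs by more than `tolerance`.
pub fn compare_rgba(
    baseline: &[u8],
    actual: &[u8],
    width: usize,
    height: usize,
    tolerance: u8,
) -> Comparison {
    let expected_len = width * height * 4;
    if baseline.len() != expected_len || actual.len() != expected_len {
        return Comparison::SizeMismatch;
    }
    let differing = baseline
        .chunks_exact(4)
        .zip(actual.chunks_exact(4))
        .filter(|(a, b)| a.iter().zip(b.iter()).any(|(x, y)| x.abs_diff(*y) > tolerance))
        .count();
    Comparison::Compared { differing, total: width * height }
}
```

A test harness could then fail whenever `differing` goes above some small threshold. A non-zero tolerance is one way to keep tiny anti-aliasing differences between machines from breaking the tests.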

After about a day together with Codex, I had something that could, via X11 and cURL, fetch and render websites when run, and the Cargo.lock was empty. It was about 7500 lines long in total at that point, split across files that were all under 1000 lines long (which was a stated requirement, so not a surprise).
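
A requirement like "every file under 1000 lines" is easy to check mechanically. The sketch below is one hypothetical way to do it with only the standard library; `files_over_limit` is an invented name, and the post does not say how (or whether) the limit was enforced beyond being stated to the agent.

```rust
/// Given `(path, contents)` pairs, return the files whose line count
/// reaches or exceeds `limit`, together with that line count.
/// (Hypothetical helper; the file contents are passed in to keep it pure.)
pub fn files_over_limit<'a>(files: &[(&'a str, &str)], limit: usize) -> Vec<(&'a str, usize)> {
    files
        .iter()
        .map(|(path, contents)| (*path, contents.lines().count()))
        .filter(|(_, lines)| *lines >= limit)
        .collect()
}
```

With `limit = 1000`, a 994-line file passes and a 1000-line file is reported, which matches "under 1000 lines".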

Day 2 - Moving On

Second day I got annoyed by the tests spawning windows while I was doing other stuff, so I added a --headless flag too. Did some fixes for resizing the window, various compatibility fixes, fixed some performance issues and improved the font/text rendering a bunch. The workflow was basically to pick a website, share a screenshot of the website without JavaScript, and ask Codex to replicate it following our instructions. Most of the time the agent was doing work by itself, with me checking in when it notified me it was done.
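
With no third-party crates allowed, a flag like --headless has to be parsed by hand. Here is a rough sketch of what that can look like using only the standard library. The `Options` struct, the `parse_args` function and the single positional URL argument are assumptions for illustration, not the project's actual CLI.

```rust
/// Parsed command-line options (hypothetical shape).
#[derive(Debug, PartialEq)]
pub struct Options {
    /// Render without opening a window, e.g. for tests.
    pub headless: bool,
    /// Optional URL to load on startup.
    pub url: Option<String>,
}

/// Parse arguments (without the program name) into `Options`.
pub fn parse_args<I>(args: I) -> Result<Options, String>
where
    I: IntoIterator<Item = String>,
{
    let mut options = Options { headless: false, url: None };
    for arg in args {
        if arg == "--headless" {
            options.headless = true;
        } else if arg.starts_with("--") {
            return Err(format!("unknown flag: {arg}"));
        } else if options.url.is_none() {
            options.url = Some(arg);
        } else {
            return Err(format!("unexpected extra argument: {arg}"));
        }
    }
    Ok(options)
}
```

In a real binary this would be called as `parse_args(std::env::args().skip(1))`, and the platform layer would pick a windowed or headless backend based on `options.headless`.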

Day 3 - Polish & Cross-platform (+ day 4)

Third day we made large changes and added support for a bunch of new features. More regression tests, fixing performance issues, fixing crashes and whatnot. Also added scrolling because this is a mother fucking browser, it has to be able to scroll. Added some debug logs too because that'll look cool in the demonstration video above, and also added support for the back button because it was annoying to start from scratch if I clicked the wrong link while testing.
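
Both scrolling and the back button boil down to a small amount of state. The sketch below shows one plausible shape for each: a scroll offset clamped to the page height, and a stack of previously visited URLs. `scroll_by` and `History` are made-up names, and the real implementation may look quite different.

```rust
/// Apply a scroll delta and clamp the result so the viewport never
/// moves above the top or below the bottom of the content.
pub fn scroll_by(offset: f32, delta: f32, content_height: f32, viewport_height: f32) -> f32 {
    // If the content fits inside the viewport there is nothing to scroll.
    let max_offset = (content_height - viewport_height).max(0.0);
    (offset + delta).clamp(0.0, max_offset)
}

/// Minimal navigation history with back support (hypothetical type).
#[derive(Debug, Default)]
pub struct History {
    back_stack: Vec<String>,
    current: Option<String>,
}

impl History {
    /// Visit a new URL, remembering the page we came from.
    pub fn navigate(&mut self, url: &str) {
        if let Some(previous) = self.current.replace(url.to_string()) {
            self.back_stack.push(previous);
        }
    }

    /// Go back one page; returns the new current URL,
    /// or `None` (leaving the current page untouched) if there is nowhere to go.
    pub fn go_back(&mut self) -> Option<&str> {
        let previous = self.back_stack.pop()?;
        self.current = Some(previous);
        self.current.as_deref()
    }

    /// The URL currently being shown, if any.
    pub fn current(&self) -> Option<&str> {
        self.current.as_deref()
    }
}
```

Clicking a link would call `navigate`, and the back button would call `go_back` and reload whatever URL it returns.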

At the end of the third day we also added initial support for macOS, and managed to get a window to open and the tests to pass. Seems to work OK :) Once we had that working, we also added Windows support, basically the same process, just another platform after all.

Then the fourth day (whaaaat?) was basically polish, fixing CI for all three platforms, making it pass and finally cutting a release based on what got built in CI. Still all within 72 hours (3 days * 24 hours, which is obviously how you count days).

The results after ~3 days (~70 hours)

And here it is, in all its glory, made in ~20K lines of code and under 72 hours of total elapsed time from first commit to last:

[Screenshot of one-agent-one-browser running on X11]

You could try compiling it yourself (zero Rust dependencies, so it's really fast :) ), or you can find binaries built on CI here:
https://github.com/embedding-shapes/one-agent-one-browser/releases

You can clone the repository, build it and try it out for yourself. It's not great, I wouldn't even say it's good, but it works, and demonstrates that one person with one agent can build a browser from scratch.

This is what the "lines of code" count ended up being after all was said and done, including support for three OSes:

$ git rev-parse HEAD
e2556016a5aa504ecafd5577c1366854ffd0e280

$ cloc src --by-file
      72 text files.
      72 unique files.
       0 files ignored.

github.com/AlDanial/cloc v 2.06  T=0.06 s (1172.5 files/s, 373824.0 lines/s)
-----------------------------------------------------------------------------------
File                                            blank        comment           code
-----------------------------------------------------------------------------------
src/layout/flex.rs                                 96              0            994
src/layout/inline.rs                               85              0            933
src/layout/mod.rs                                  82              0            910
src/browser.rs                                     78              0            867
src/platform/macos/painter.rs                      96              0            765
src/platform/x11/cairo.rs                          77              0            713
src/platform/windows/painter.rs                    88              0            689
src/bin/render-test.rs                             87              0            666
src/style/builder.rs                               83              0            663
src/platform/windows/d2d.rs                        53              0            595
src/platform/windows/windowed.rs                   72              0            591
src/style/declarations.rs                          18              0            547
src/image.rs                                       81              0            533
src/platform/macos/windowed.rs                     80              2            519
src/net/winhttp.rs                                 61              2            500
src/platform/x11/mod.rs                            56              2            487
src/css.rs                                        103            346            423
src/html.rs                                        58              0            413
src/platform/x11/painter.rs                        48              0            407
src/platform/x11/scale.rs                          57              3            346
src/layout/table.rs                                39              1            340
src/platform/x11/xft.rs                            35              0            338
src/style/parse.rs                                 34              0            311
src/win/wic.rs                                     39              8            305
src/style/mod.rs                                   26              0            292
src/style/computer.rs                              35              0            279
src/platform/x11/xlib.rs                           32              0            278
src/layout/floats.rs                               31              0            265
src/resources.rs                                   36              0            238
src/css_media.rs                                   36              1            232
src/debug.rs                                       32              0            227
src/platform/windows/dwrite.rs                     20              0            222
src/render.rs                                      18              0            196
src/style/custom_properties.rs                     34              0            186
src/platform/windows/scale.rs                      28              0            184
src/url.rs                                         32              0            173
src/layout/helpers.rs                              12              0            172
src/net/curl.rs                                    31              0            171
src/platform/macos/svg.rs                          35              0            171
src/browser/url_loader.rs                          17              0            166
src/platform/windows/gdi.rs                        17              0            165
src/platform/windows/scaled.rs                     16              0            159
src/platform/macos/scaled.rs                       16              0            158
src/layout/svg_xml.rs                               9              0            152
src/win/com.rs                                     26              0            152
src/png.rs                                         27              0            146
src/layout/replaced.rs                             15              0            131
src/net/pool.rs                                    18              0            129
src/platform/macos/scale.rs                        17              0            124
src/style/selectors.rs                             18              0            123
src/style/length.rs                                17              0            121
src/cli.rs                                         15              0            112
src/platform/windows/headless.rs                   20              0            112
src/platform/macos/headless.rs                     19              0            109
src/bin/fetch-resource.rs                          14              0            101
src/geom.rs                                        10              0            101
src/browser/render_helpers.rs                      11              0            100
src/dom.rs                                         11              0            100
src/style/background.rs                            15              0            100
src/layout/tests.rs                                 7              0             85
src/platform/windows/d3d11.rs                      14              0             83
src/win/stream.rs                                  10              0             63
src/platform/windows/svg.rs                        13              0             54
src/main.rs                                         4              0             33
src/platform/mod.rs                                 6              0             28
src/app.rs                                          5              0             25
src/lib.rs                                          1              0             20
src/platform/windows/mod.rs                         2              0             19
src/net/mod.rs                                      4              0             16
src/platform/macos/mod.rs                           2              0             14
src/platform/windows/wstr.rs                        0              0              5
src/win/mod.rs                                      0              0              3
-----------------------------------------------------------------------------------
SUM:                                             2440            365          20150
-----------------------------------------------------------------------------------

Takeaways

One human using one agent seems far more effective than one human using thousands of agents
One agent can work on a single codebase for hours, making real progress on ambitious projects
This could probably scale to multiple humans too, each equipped with their own agent, imagine what we could achieve!
Sometimes slower is faster and also better
The human who drives the agent might matter more than how the agents work and are set up, the jury is still out on this one

If one person with one agent can produce equal or better results than "hundreds of agents for weeks", then the question "Can we scale autonomous coding by throwing more agents at a problem?" probably has a more pessimistic answer than some expected.

Versions

2026-01-27 7916ba2 I'm apparently very bad at spelling, luckily :set spell exists


2026-01-27 b819707 Touchups + headers


2026-01-27 bc26dcb Add blogpost + video about "one agent one browser"


Comments

By simonw 2026-01-27 15:48, 4 replies

This is a notably better demonstration of a coding agent generated browser than Cursor's FastRender - it's a fraction of the size (20,000 lines of Rust compared to ~1.6m), uses way fewer dependencies (just system libraries for rendering images and text) and the code is actually quite readable - here's the flexbox implementation, for example: https://github.com/embedding-shapes/one-agent-one-browser/bl...

Here's my own screenshot of it rendering my blog - https://bsky.app/profile/simonwillison.net/post/3mdg2oo6bms2... - it handles the layout and CSS gradients really well, renders the SVG feed icon but fails to render a PNG image.

I thought "build a browser that renders HTML+CSS" was the perfect task for demonstrating a massively parallel agent setup because it couldn't be productively achieved in a few thousand lines of code by a single coding agent. Turns out I was wrong!

By g947o 2026-01-27 22:12, 1 reply

I think most people would agree that this is far superior to Cursor's "browser" from an engineering perspective -- it doesn't do much but does it well, as you pointed out.

What it tells me is that "effectively using agents" can be much more important than just throwing tokens at a problem and seeing what comes out. I myself have completely deleted several small vibe-coded projects without even going over the code, because what often happens is that, two days after the code is generated, I realize that I was solving the wrong problem or using the wrong approach.

A coding agent doesn't care. It most likely just does whatever you ask it to do with no pushback. While in some cases it's worth using them to validate an idea, often you dig a deeper hole for yourself if you go down a wrong path in the first place.

By embedding-shape 2026-01-27 22:16, 1 reply

Yeah, I agree with all of what you wrote; how these are used seems (to me) to be more important than how they're built. If you don't know software engineering, a software engineering agent isn't suddenly gonna make you a software engineer, but someone who already knows the craft can be very effective with one.

Amplifiers, rather than replacements. I think the community at large still thinks LLMs and agents are gonna be "replacing" knowledge, which I think is far from the truth.

By menaerus 2026-01-28 11:59, 2 replies

I built a moderately complex and very good looking website in ~2 hours with the coding agent. The next step would be to write a backend+storage, and given how well the agent performs in these types of tasks, I assume I will be able to do that in a matter of hours too. I have never touched any of the technology involved in web development so, in my case, I can say that I no longer need the full-stack dev that I would definitely have needed under normal circumstances. And the cost is ridiculous - a few hours invested + a $20 subscription.

I agree, however, that having no prior software engineering skills would make this much more difficult.

By embedding-shape 2026-01-28 12:09, 1 reply

Yeah, I don't doubt you, it's really effective at knocking out "simple" projects, I've had success vibe-coding for days, but eventually unless you have some reins on the architecture/design, it falls down over its own slop, and it's very noticeable as the agent spends more and more time trying to work in the changes, but is unable to.

So the first day or two, each change takes 20-30 minutes. Next day it takes 30-40 minutes per change, next day up to an hour and so on, as the requirements start to interact with each other, together with the ball of spaghetti they've composed and are now trying to change without breaking other parts.

Contrast that with when you really own the code and design: then you can keep going for weeks, and all changes take 20-30 minutes, as on day one. But that also means I'm paying attention to what's going on, so no vibe-coding but pair programming with LLMs, and it also requires you to understand the domain, what you're actually aiming for, and the basics of design/architecture.

By menaerus 2026-01-28 14:08, 1 reply

The point was not in simplicity but rather in if AI is replacing some people's jobs. I say that it certainly is, as given by the example, but I also acknowledge that the technology is still not at the point where human engineers are no more required in the loop.

I built other things too which would not be considered trivial or "simple", or as you say they're architecturally complex, and they involve very domain specific knowledge about programming languages, compilers, ASTs, databases, high-performance optimizations, etc. And not for a long time, or shall I say never, have I felt this productive tbh. If I were to set up a company around this, which I believe I could, in the pre-LLM era I'd quite literally have had to hire 3-5 experienced engineers with sufficient domain expertise to build this together with me - and I mean not in every possible potential but the concrete work I've done in ~2 weeks.

By Imustaskforhelp 2026-01-2817:24

> The point was not in simplicity but rather in if AI is replacing some people's jobs. I say that it certainly is, as given by the example, but I also acknowledge that the technology is still not at the point where human engineers are no more required in the loop.

I feel like you have missed emsh's point which is that AI agents significantly become muddled up if your project's complex.

I feel the same way personally. If I don't know how the pieces of AI code interact with each other, I get more frustrated the longer the project continues, precisely because of what they mention: changes first take less time and then take longer and longer, with errors that the agent missed, etc.

I personally vibe code projects too but I will admit that there is this error.

I have this feeling that anything really complex will fall heels first if complexity really grows a lot or you don't unclog the slop.

This is also why we are seeing "AI slop janitors", humans whose task is to unsloppify the slop.

Personally I have this intuition that AI will create really good small products, there is no denying that, but those were already un-monetizable or if they were, then even in the past, they were really easy to replicate, this probably just lowered the friction.

Now if your project is something commercial and large, I don't know how much AI slop people can trust. At some point if people depend on your project which is having these issues, because people can tell whether the project's AI generated or not, then that would have its issues too.

And I am speaking from experience after building something like whmcs in golang with AI. At first, I am surprised and I feel as if it's good enough for my own personal use case (gvisor) and maybe some really small providers. But when I want it to say hook to proxmox, have the tmate server be connected with an api to allow re-opening easier, have the idea of live migration from one box to another etc., create drivers for the custom firecrackers-ssh idea that I implemented once again using AI.

One can realize how quickly complexity adds up in projects and how, as emsh points out, it becomes exponentially harder to use AI.

By queenkjuul 2026-01-2818:041 reply

Nobody ever needed a full stack dev to build a website

By menaerus 2026-01-296:261 reply

WDYM? Website is a frontend, server handling is a backend. How is that not a fullstack?

By apothegm 2026-02-015:25

Purely server rendered HTML can be a website. Static HTML pages with a server doing no more than S3 does can be a website. Websites existed long before SPAs were a twinkle in anyone’s eye.

By vidarh 2026-01-2716:532 reply

I think the human + agent thing absolutely will make a huge difference. I see regularly that Claude can go totally off piste and eventually claw itself back with a proper agent setup but it will take a lot of time if I don't spot it and get it back on track.

I have one project Claude is working on right now where I'm testing a setup to attempt to take myself more out of the loop, because that is the hard part. It's "easy" to get an agent to multiply your output. It's hard to make that scale with your willingness to spend on tokens rather than with your ability to read and review and direct.

I've ended up with roughly this (it's nothing particularly special):

- Runs an evaluator that evaluates the current state and assigns scores across multiple metrics.

- If a given score is above a given threshold, expand the test suite automatically.

- If the score is below a given threshold, spawn a "research agent" that investigates why the scores don't meet expectations.

- The research agent delivers a report, that is passed to an implementation agent.

- The main agent re-runs the scoring, and if it doesn't show an improvement on one or more of the metrics, the commit is discarded, and notes made of what was tried, and why it failed.
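A rough Python sketch of one pass through the steps above, not the actual setup. Every name here (`run_cycle`, `LoopState`, the callback parameters, the `0.8` threshold) is hypothetical, and it simplifies "a given score is above a given threshold" to "all scores are at or above one threshold".

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

Scores = Dict[str, float]


@dataclass
class LoopState:
    notes: List[str] = field(default_factory=list)  # what was tried and failed
    kept: int = 0        # commits that improved a metric
    discarded: int = 0   # commits that were thrown away


def improved(before: Scores, after: Scores) -> bool:
    """True if at least one metric went up between the two scoring runs."""
    return any(after[name] > before[name] for name in before)


def run_cycle(
    evaluate: Callable[[], Scores],
    expand_tests: Callable[[], None],
    research: Callable[[Scores], str],
    implement: Callable[[str], None],
    discard_commit: Callable[[], None],
    state: LoopState,
    threshold: float = 0.8,  # assumed value, purely illustrative
) -> str:
    before = evaluate()
    # Assumption: "above threshold" means every metric clears the same bar.
    if all(score >= threshold for score in before.values()):
        expand_tests()
        return "expanded"

    report = research(before)   # research agent explains the low scores
    implement(report)           # implementation agent acts on the report
    after = evaluate()          # main agent re-runs the scoring

    if improved(before, after):
        state.kept += 1
        return "kept"

    discard_commit()
    state.discarded += 1
    state.notes.append(f"tried: {report}; no metric improved")
    return "discarded"
```

The callbacks stand in for whatever actually launches the agents; keeping the keep/discard decision in plain code rather than in a prompt is one way to stop an agent from talking its way past a failed score.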

It takes a bit of trial and error to get it right (e.g. "it's the test suite that is wrong" came up early, and the main agent was almost talked into revising the test suite to remove the "problematic" tests) but a division sort of like this lets Claude do more sensible stuff for me. Throwing away commits feels drastic - an option is to let it run a little cycle of commit -> evaluate -> redo a few times before the final judgement, maybe - but so far it feels like it'll scale better. Less crap makes it into the project.

And I think this will work better than to treat these agents as if they are developers whose output costs 100x as much.

Code so cheap it is disposable should change the workflows.

So while I agree this is a better demonstration of a good way to build a browser, it's a less interesting demonstration as well. Now that we've seen people show that something like FastRender is possible, expect people to experiment with similarly ambitious projects but with more thought put into scoring/evaluation, including on code size and dependencies.

By embedding-shape 2026-01-2717:041 reply

> I think the human + agent thing absolutely will make a huge difference.

Just the day(s) before, I was thinking about this too, and I think what will make the biggest difference is humans who possess "Good Taste". I wrote a bunch about it here: https://emsh.cat/good-taste/

I think the ending is most apt, and where I think we're going wrong right now:

> I feel like we're building the wrong things. The whole vibe right now is "replace the human part" instead of "make better tools for the human part". I don't want a machine that replaces my taste, I want tools that help me use my taste better; see the cut faster, compare directions, compare architectural choices, find where I've missed things, catch when we're going into generics, and help me make sharper intentional choices.

By vidarh 2026-01-2717:251 reply

For some projects, "better tools for the human part" is sufficient and awesome.

But for other projects, being able to scale with little or no human involvement suddenly turns some things that were borderline profitable or not possible to make profitable at all with current salaries vs. token costs into viable businesses.

Where it works, it's a paradigm shift - for both good and bad.

So it depends what you're trying to solve for. I have projects in both categories.

By embedding-shape 2026-01-2717:281 reply

Personally I think the part where you try to eliminate humans from involvement, is gonna lead to too much trouble, being too inflexible and the results will be bad. It's what I've seen so far, haven't seen anything pointing to it being feasible, but I'd be happy to be corrected.

By vidarh 2026-01-2717:36

It really depends on the type of tasks. There are many tasks LLMs do for me entirely autonomously already, because they do it well enough that it's no longer worth my time.

By queenkjuul 2026-01-2818:48

I'm confused, what did FastRender show is possible? That's cursor's agent-built browser right?

The one that people couldn't compile, and was largely a failed attempt to stitch together existing libraries?

By Imustaskforhelp 2026-01-2720:48

To me I really like how embedding-shape took things into his own hands and actually built it. It really proved a point at such a scale where I don't think any recent example can point to.

It's great to see hackernews be such a core part of it haha.

> I thought "build a browser that renders HTML+CSS" was the perfect task for demonstrating a massively parallel agent setup because it couldn't be productively achieved in a few thousand lines of code by a single coding agent. Turns out I was wrong!

I do wonder if tech people from future/present are gonna witness this as a Goliath vs David story. 20K LOC from 1 human + 1 agent beats a $5 million, 1.6 million LOC browser, changing how even the massive AI users/pioneers at the time thought about the use of AI.

Looks like I have watched some documentaries recently but why do I feel like a documentary about this whole thing can be created in future.

But also, More and more I am feeling like AI is an absolute black box, nobody knows how to do things but we are all kind of doing experiments with it and seeing what sticks (like how we now have definitive proof that 1 human 1 agent > many agents no human in the loop)

And this is when we are 1 month in 2026, who knows what other experiments and proofs happen this year to find more about this black box, and about its usefulness or not.

Simon, it would be interesting if you could read the 2026 predictions thread on HN each month or quarterly to see how many people were wrong or right about AI as we figure out more things perhaps.

By embedding-shape 2026-01-2713:443 reply

I set some rules for myself: three days of total time, no 3rd party Rust crates, allowed to use commonly available OS libraries, has to support X11/Windows/macOS and can render some websites.

After three days, I have it working with around 20K LOC, whereas ~14K is the browser engine itself + X11, then 6K is just Windows+macOS support.

Source code + CI built binaries are available here if you wanna try it out: https://github.com/embedding-shapes/one-agent-one-browser

By bhadass 2026-01-2721:161 reply

very impressive!

it's amazing how far we've come in 20 years. i was a (very minor) contributor to khtml/konqueror (before apple got involved w/ webkit) in the early 2000s, and back then it was such a labor intensive process to even create a halfway working engine. like, months of work just to get basic rendering somewhat correct on a very small portion of the web (which was obv much smaller)

in addition to agentic coding, i think for this specific task having css-spec/html-spec/web-platform-tests as machine readable test suites helps a LOT. the agent can actually validate against real specs.

back in the day, despite having gecko as an open source reference, in practice the "standards" were whatever IE was doing. so you'd spend weeks implementing something only to discover every site was coded for IE's quirks lmao. for all of their other faults, google/apple and other contributors helped bring in discipline to that.

By embedding-shape 2026-01-2721:251 reply

> i think for this specific task having css-spec/html-spec/web-platform-tests as machine readable test suites helps a LOT

You know, I placed the specs in the repository with that goal (even sneaked in a repo that needs compiling before being usable), but as far as I can see, the agent never actually peeked into that directory nor read anything from them in the end.

It'll be easier to see once I've made all the agent sessions public, and I might be wrong (I didn't observe the agent at all times), but it seems the agent never used them.

By bhadass 2026-01-2721:45

oh interesting, so it just... didn't use them? lol. i guess the model's training data already has enough web knowledge baked in that it could wing it. curious if explicitly prompting it to reference the specs would change the output quality or time to solution.

very excited to see the agentic sessions when you release them.. that kind of transparency is super valuable for the community. i can see "build a browser from scratch" becoming a popular challenge as people explore the limits of agentic coding and try to figure out best practices for workflows/prompting. like the new "build a ray tracer" or say nanoGPT but for agents.

By chatmasta 2026-01-2720:301 reply

Did you use Claude code? How many tokens did you burn? What’d it cost? What model did you use?

By embedding-shape 2026-01-2720:405 reply

Codex, no idea about tokens, I'll upload the session data probably tomorrow so you could see exactly what was done. I pay ~200 EUR/month for the ChatGPT Pro plan, prorating days I guess it'll be ~19 EUR for three days. Model used for everything was gpt-5.2 with reasoning effort set to xhigh.
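The proration above, worked out as a quick Python sketch. The function name is made up, and the 31-day month is an assumption (January); a 30-day month gives a flat 20 EUR.

```python
def prorated_cost(monthly_eur: float, days_used: int, days_in_month: int = 31) -> float:
    """Share of a flat monthly fee attributable to the days actually used."""
    return monthly_eur * days_used / days_in_month


cost = prorated_cost(200, 3)
print(f"{cost:.2f} EUR")  # prints "19.35 EUR"
```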

By forgotpwd16 2026-01-2721:13

>I'll upload the session data probably tomorrow so you could see exactly what was done.

That'll be dope. The tokens used (input,output,total) are actually saved within codex's jsonl files.
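A minimal sketch of how one might total those numbers from a JSONL session file. The key names `input_tokens` and `output_tokens` and the nesting are assumptions, not the documented layout of codex's files, so check a real session file and adjust before trusting the result.

```python
import json
from typing import Any, Dict, Iterable

# Hypothetical key names; the real session files may use different ones.
TOKEN_KEYS = ("input_tokens", "output_tokens")


def _walk(node: Any, totals: Dict[str, int]) -> None:
    """Recursively add up integer values stored under TOKEN_KEYS."""
    if isinstance(node, dict):
        for key, value in node.items():
            if key in TOKEN_KEYS and isinstance(value, int):
                totals[key] += value
            else:
                _walk(value, totals)
    elif isinstance(node, list):
        for item in node:
            _walk(item, totals)


def sum_tokens(lines: Iterable[str]) -> Dict[str, int]:
    """Sum token counts over JSONL lines (one JSON object per line)."""
    totals = {key: 0 for key in TOKEN_KEYS}
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        _walk(json.loads(line), totals)
    totals["total"] = sum(totals[key] for key in TOKEN_KEYS)
    return totals
```

Usage would be something like `with open(path) as f: print(sum_tokens(f))`. If the file records running totals rather than per-request counts, summing like this would overcount, so take the last record instead.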

By storystarling 2026-01-2722:341 reply

That 19 EUR figure is basically subscription arbitrage. If you ran that volume through the API with xhigh reasoning the cost would be significantly higher. It doesn't seem scalable for non-interactive agents unless you can stay on the flat-rate consumer plan.

By embedding-shape 2026-01-2723:571 reply

Yeah, no way I'd do this if I paid per token. Next experiment will probably be local-only together with GPT-OSS-120b which according to my own benchmarks seems to still be the strongest local model I can run myself. It'll be even cheaper then (as long as we don't count the money it took to acquire the hardware).

By mercutio2 2026-01-282:481 reply

What toolchain are you going to use with the local model? I agree that’s a strong model, but it’s so slow for me with large contexts I’ve stopped using it for coding.

By embedding-shape 2026-01-288:362 reply

I have my own agent harness, and the inference backend is vLLM.

By mercutio2 2026-01-2822:16

Can you tell me more about your agent harness? If it’s open source, I’d love to take it for a spin.

I would happily use local models if I could get them to perform, but they’re super slow if I bump their context window high, and I haven’t seen good orchestrators that keep context limited enough.

By storystarling 2026-01-2810:321 reply

Curious how you handle sharding and KV cache pressure for a 120b model. I guess you are doing tensor parallelism across consumer cards, or is it a unified memory setup?

By embedding-shape 2026-01-2810:491 reply

I don't, fits on my card with the full context, I think the native MXFP4 weights take ~70GB of VRAM (out of 96GB available, RTX Pro 6000), so I still have room to spare to run GPT-OSS-20B alongside for smaller tasks too, and Wayland+Gnome :)
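A back-of-the-envelope check of those numbers in Python. It assumes a nominal 120e9 parameters all stored as MXFP4 (4-bit elements plus one 8-bit shared scale per block of 32), which is a simplification since not every layer is necessarily quantized; the function name is made up.

```python
def mxfp4_weight_gb(params: float, block_size: int = 32, scale_bits: int = 8) -> float:
    """Approximate weight size in GB if every parameter were MXFP4."""
    bits_per_weight = 4 + scale_bits / block_size  # 4.25 bits with the defaults
    return params * bits_per_weight / 8 / 1e9


weights_gb = mxfp4_weight_gb(120e9)  # nominal parameter count, an assumption
headroom_gb = 96 - 70                # figures quoted in the comment above

print(f"{weights_gb:.2f} GB")  # prints "63.75 GB"
print(f"{headroom_gb} GB")     # prints "26 GB"
```

The gap between this estimate and the ~70GB quoted would plausibly be unquantized layers, KV cache and runtime overhead.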

By storystarling 2026-01-2812:241 reply

I thought the RTX 6000 Ada was 48GB? If you have 96GB available that implies a dual setup, so you must be relying on tensor parallelism to shard the model weights across the pair.

By embedding-shape 2026-01-2812:40

RTX Pro 6000 - 96GB VRAM - Single card

By ASalazarMX 2026-01-2818:52

> I'll upload the session data probably tomorrow so you could see exactly what was done.

I've been very skeptical of the real usefulness of code assistants, much in part from my own experience. They work great for brand new code bases, but struggle with maintenance. Seeing your final result, I'm eager to see the process, especially the iteration.

By soiltype 2026-01-2721:10

Thank you in advance for that! I barely use AI to generate code so I feel pretty lost looking at projects like this.

By oneneptune 2026-01-2816:44

Thanks in advance, I can't wait to see your prompts and how you architected this...

By jacquesm 2026-01-2716:36

Those are excellent constraints.

By QuadmasterXLII 2026-01-2720:001 reply

The rendering was pretty chaotic when I tried it- not that far off from just the text in the html tags, in some size, color, and placement on the screen. This sounds like unfairness, but there is some motte-and-bailey where if you claim to be a browser, I get to evaluate on stuff like links being consistently blue and underlined (as is, they are sometimes blue and sometimes underlined, without a clear pattern- if they were never formatted differently from standard text, I would just buy this as a feature not implemented yet). It may be that some of the rendering is not supported on windows- the back button certainly isn't. I guess if I want to make my criticism actually legitimate I should make a "one human and no agent browser" post that just regexes out stuff that looks like content and formats it at random. The binary I downloaded definitely overperforms at the hacker news homepage and simonw's blog.

By embedding-shape 2026-01-2721:311 reply

It's a really basic browser. It's made less as an independent thing, and more as a reply to https://cursor.com/blog/scaling-agents, so as long as it does more or less the same as theirs, but is less LOC, it does what I set out for it to do :)

> I get to evaluate on stuff like links being consistently blue and underlined

Yeah, this browser doesn't have a "default stylesheet" like a regular browser. Probably should have added that, but was mostly just curious about rendering the websites from the web, rather than using what browsers think the web should look like.
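To illustrate what a default stylesheet changes, here is a tiny Python sketch of the idea. It is not code from this browser: `UA_DEFAULTS`, `computed_style` and the values in it are made up for the example, and it ignores selectors, specificity and inheritance entirely.

```python
from typing import Dict

Style = Dict[str, str]

# Stand-in for a user-agent stylesheet; illustrative values only.
UA_DEFAULTS: Dict[str, Style] = {
    "a": {"color": "blue", "text-decoration": "underline"},
    "b": {"font-weight": "bold"},
}


def computed_style(tag: str, author: Style, use_ua_defaults: bool = True) -> Style:
    """Start from the browser defaults for the tag, then let the page override."""
    style: Style = {}
    if use_ua_defaults:
        style.update(UA_DEFAULTS.get(tag, {}))
    style.update(author)  # author declarations win over browser defaults
    return style


# With defaults, an unstyled link is blue and underlined.
print(computed_style("a", {}))
# Without defaults, the same link gets no styling at all.
print(computed_style("a", {}, use_ua_defaults=False))
```

With no defaults layer, a link only looks like a link when the page's own CSS says so.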

> It may be that some of the rendering is not supported on windows- the back button certainly isn't.

Hmm, on Windows 11 the back button should definitely work, tried that just last night. Are you perhaps on Windows 10? I have not tried that myself, should work but might be why.

By QuadmasterXLII 2026-01-2721:591 reply

It is both extraordinarily impressive in an absolute sense, and fairly disappointing specifically comparing my result on a random smattering of other no-js websites, to the expectation I had from the simonw screenshot (which to be clear is not an expectation you had control over, as you are not simonw). I'm familiar with this pattern from all the rest of my trying frontier ML results!

Yep, I ran it on an old windows 10 VM I had puttering about.

I think it must have a default link styling somewhere, as some links are the classic blue that as far as I know I intentionally styled to be black- but this could be css spaghetti in tufte.css finally coming to haunt me.

By embedding-shape 2026-01-2722:06

> I'm familiar with this pattern from all the rest of my trying frontier ML results!

Well, that's how this browser came to be, because I felt something similar with how Cursor presented their results :) So I guess we're in the same club, somehow.

And yeah, lots of websites render poorly, for obvious reasons, if it's better or worse than Cursor's I guess will be up to the public, I'm sure if I actually treated it as a professional project I could probably get it to work quite nicely rather than the abomination it currently is.