
I’m Chris McCord, the creator of Elixir’s Phoenix framework. For the past several months, I’ve been working on a skunkworks project at Fly.io, and it’s time to show it off.
I wanted LLM agents to work just as well with Elixir as they do with Python and JavaScript. Last December, in order to figure out what that was going to take, I started a little weekend project to find out how difficult it would be to build a coding agent in Elixir.
A few weeks later, I had it spitting out working Phoenix applications and driving a full in-browser IDE. I knew this wasn’t going to stay a weekend project.
If you follow me on Twitter, you’ve probably seen me teasing this work as it picked up steam. We’re at a point where we’re pretty serious about this thing, and so it’s time to make a formal introduction.
World, meet Phoenix.new, a batteries-included fully-online coding agent tailored to Elixir and Phoenix. I think it’s going to be the fastest way to build collaborative, real-time applications.
Let’s see it in action:
First, even though it runs entirely in your browser, Phoenix.new gives both you and your agent a root shell, in an ephemeral virtual machine (a Fly Machine) that gives our agent loop free rein to install things and run programs — without any risk of messing up your local machine. You don’t think about any of this; you just open up the VSCode interface, push the shell button, and there you are, on the isolated machine you share with the Phoenix.new agent.
Second, it’s an agent system I built specifically for Phoenix. Phoenix is about real-time collaborative applications, and Phoenix.new knows what that means. To that end, Phoenix.new includes, in both its UI and its agent tools, a full browser. The Phoenix.new agent uses that browser “headlessly” to check its own front-end changes and interact with the app. Because it’s a full browser, instead of trying to iterate on screenshots, the agent sees real page content and JavaScript state – with or without a human present.
Agents build software the way you did when you first got started, the way you still do today when you prototype things. They don’t carefully design Docker container layers and they don’t really do release cycles. An agent wants to pop a shell and get its fingernails dirty.
A fully isolated virtual machine means Phoenix.new’s fingernails can get arbitrarily dirty. If it wants to add a package to mix.exs, it can do that and then run mix phx.server or mix test and check the output. Sure. Every agent can do that. But if it wants to add an APT package to the base operating system, it can do that too, and make sure it worked. It owns the whole environment.
This offloads a huge amount of tedious, repetitive work.
At his Startup School talk last week, Andrej Karpathy related his experience of building a restaurant menu visualizer, which takes camera pictures of text menus and transforms all the menu items into pictures. The code, which he vibe-coded with an LLM agent, was the easy part; he had it working in an afternoon. But getting the app online took him a whole week.
With Phoenix.new, I’m taking dead aim at this problem. The apps we produce live in the cloud from the minute they launch. They have private, shareable URLs (we detect anything the agent generates with a bound port and give it a preview URL underneath phx.run, with integrated port-forwarding), they integrate with GitHub, and they inherit all the infrastructure guardrails of Fly.io: hardware virtualization, WireGuard, and isolated networks.
GitHub’s gh CLI is installed by default, so the agent knows how to clone any repo or browse issues, and you can even authorize it for internal repositories to get it working with your team’s existing projects and dependencies.
Full control of the environment also closes the loop between the agent and deployment. When Phoenix.new boots an app, it watches the logs, and tests the application. When an action triggers an error, Phoenix.new notices and gets to work.
Phoenix.new can interact with web applications the way users do: with a real browser.
The Phoenix.new environment includes a headless Chrome browser that our agent knows how to drive. Prompt it to add a front-end feature to your application, and it won’t just sketch the code out and make sure it compiles and lints. It’ll pull the app up itself and poke at the UI, simultaneously looking at the page content, JavaScript state, and server-side logs.
Phoenix is all about “live” real-time interactivity, and gives us seamless live reload. The user interface for Phoenix.new itself includes a live preview of the app being worked on, so you can kick back and watch it build front-end features incrementally. Any other .phx.run tabs you have open also update as it goes. It’s wild.
Phoenix.new can already build real, full-stack applications with WebSockets, Phoenix’s Presence features, and real databases. I’m seeing it succeed at business and collaborative applications right now.
But there’s no fixed bound on the tasks you can reasonably ask it to accomplish. If you can do it with a shell and a browser, I want Phoenix.new to do it too. And it can do these tasks with or without you present.
For example: set a $DATABASE_URL and tell the agent about it. The agent knows enough to go explore it with psql, and it’ll propose apps based on the schemas it finds. It can model Ecto schemas off the database. And if MySQL is your thing, the agent will just apt install a MySQL client and go to town.
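To make that concrete, here is the kind of Ecto schema the agent might model off a table it discovers via psql. This is purely illustrative; the table and column names are invented, not anything Phoenix.new actually emits:

```elixir
defmodule MyApp.Accounts.User do
  use Ecto.Schema

  # Mirrors a hypothetical `users` table found by introspecting the
  # database at $DATABASE_URL. Assumes the usual inserted_at/updated_at
  # timestamp columns exist.
  schema "users" do
    field :email, :string
    field :name, :string
    field :admin, :boolean, default: false

    timestamps()
  end
end
```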
Frontier model LLMs have vast world knowledge. They generalize extremely well. At ElixirConf EU, I did a live demo vibe-coding Tetris on stage. Phoenix.new nailed it, first try, first prompt. It’s not like there’s gobs of Phoenix LiveView Tetris examples floating around the Internet! But lots of people have published Tetris code, and lots of people have written LiveView stuff, and 2025 LLMs can connect those dots.
At this point you might be wondering – can I just ask it to build a Rails app? Or an Expo React Native app? Or Svelte? Or Go?
Yes, you can.
Our system prompt is tuned for Phoenix today, but all languages you care about are already installed. We’re still figuring out where to take this, but adding new languages and frameworks definitely ranks highly in my plans.
We’re at a massive step-change in developer workflows.
Agents can do real work, today, with or without a human present. Buckle up: the future of development, at least in the common case, probably looks less like cracking open a shell and finding a file to edit, and more like popping into a CI environment with agents working away around the clock.
Local development isn’t going away. But there’s going to be a shift in where the majority of our iterations take place. I’m already using Phoenix.new to triage phoenix-core GitHub issues and pick problems to solve. I close my laptop, grab a cup of coffee, and wait for a PR to arrive — Phoenix.new knows how PRs work, too. We’re already here, and this space is just getting started.
This isn’t where I thought I’d end up when I started poking around. The Phoenix and LiveView journey was much the same. Something special was there and the projects took on a life of their own. I’m excited to share this work now, and see where it might take us. I can’t wait to see what folks build.
I am a long-time PHP dev who has been interested in learning Elixir/Phoenix for a while but never quite motivated enough.
I saw this and thought, if this doesn't get me to give it a go, nothing will.
Less than 45 minutes after signing up for fly.io, I have a multi-room tic tac toe game deployed.
https://tic-tac-toe-cyber.fly.dev/
I had it build the game, opting for a single room at first to see if that worked. Then I had it add multiple rooms on a different git branch in case that didn't work. It worked great.
I learned very little about elixir, phoenix, or deploying to fly.io up to this point, and I already have a nice looking app deployed and running.
I know a lot of devs will hate that this is possible. It is up to me now to look at the steps it took to create this, which are broken down extremely simply for me, and really understand what is happening...
I will do this because I want to learn. I bet a lot of people won't bother to do that. But those people never would have had apps in the first place and now they can. If they are creating fun experiences and not banking apps, I think that is still great.
You guys have been releasing amazing things for years only to be poorly replicated in other languages years later.. but you really outdid yourselves here.
I'm blown away.
edit: is there a way to see how much of my credits were used by building this?
This is amazing on multiple fronts! I reset your usage, so the next round is on us! We shipped credits the day before launch, so usage UI is still TBD, but should be out next week. Thanks for sharing your experience!
Hi Chris, is there any way to get more credits or BYO API key for Anthropic/OpenAI? I'm trying to make a Kahoot clone and already spent more than $40 in a couple of hours.
Based on how much they seem to charge (I blew through the initial $20 in like an hour; equivalent use in Claude Code would have been around $3), they're clearly making a pretty big margin on top of the API calls. I doubt they're going to allow BYOK.
Was the graphic design created from prompts too? It's surprisingly nice, especially considering you spent 45 minutes on it.
I told it that I wanted a two player tic tac toe game.
it gave me a selection of "styles" and I chose neon retro.. I probably could have been more creative and typed in my own suggestion.
Other than that, I said absolutely nothing about how I wanted the layout.
It came up with the idea of listing all active games on the homepage, with the number of players in each, all on its own.
I went from "I want a two player tic tac toe game" to having one, and then added multiple rooms, and deployed it all in under 45 minutes, with little input other than that..
Did you figure out how much credit was used? I want to try this out, but $20 of credit can go quick doing agentic work
I'm not sure exactly but I think I used nearly all of it.
I've seen others say they went through the full $20 within 45 minutes to an hour.
They are supposed to be adding a way to monitor usage soon.
Phoenix creator here. I'm happy to answer any questions about this! Also worth noting that phoenix.new is a global Elixir cluster that spans the planet. If you sign up in Australia, you get an IDE and agent placed in Sydney.
Amazing work.
Just a clarifying question since I'm confused by the branding use of "Phoenix.new" (since I associate "Phoenix" as a web framework for Elixir apps but this seems to be a lot more than that).
- Is "Phoenix.new" an IDE?
- Is "Phoenix.new" ... AI to help you create an app using the Phoenix web framework for Elixir?
- Does "Phoenix.new" require the app to be hosted/deployed on Fly.io? If that's the case, maybe a name like "phoenix.flyio.new" would be better and extensible to any type of service Fly.io helps deploy (Phoenix/Elixir being one)
- Is it all 3 above?
And how does this compare to Tidewave.ai (created, as you presumably know, by Elixir's creator)?
Apologies if I'm possibly conflating topics here.
Yes all 3. It has been weird trying to position/brand this as we started out just going for full-stack Elixir/Phoenix and it became very clear this is already much bigger than a single stack. That said, we wanted to nail a single stack super well to start and the agent is tailored for vibe'd apps atm. I want to introduce a pair mode next for more leveled assistance without having to nag it.
You could absolutely treat phoenix.new as your full dev IDE environment, but I think about it less as an IDE and more as a remote runtime where agents get work done, one that you pop into as needed. Or another way to think about it: the agent doesn't care about or need the VSCode IDE or xterm. They are purely conveniences for us meaty humans.
For me, something like this is the future of programming. Agents fiddling away and we pop in to see what's going on or work on things they aren't well suited for.
Tidewave is focused on improving your local dev experience while we sit on the infra/remote agent/codex/devin/jules side of the fence. Tidewave also has a MCP server which Phoenix.new could integrate with that runs inside your app itself.
> For me, something like this is the future of programming. Agents fiddling away and we pop in to see what's going on or work on things they aren't well suited for.
Honestly, this is depressing. Pop in from what? Our factory jobs?
I understand that we are slowly taking away our own jobs, but I do not find it depressing. I do find it concerning, since most people do not talk about this openly. We are not sure how we will restructure so many jobs. If we cannot find jobs, what is the financial future for a large number of people across the world? This needs more thinking and honest acceptance of the situation. It will happen; we should take a positive approach to finding a new future.
Read up on the Jevons Paradox
> In economics, the Jevons paradox (/ˈdʒɛvənz/; sometimes Jevons effect) occurs when technological advancements make a resource more efficient to use (thereby reducing the amount needed for a single application); however, as the cost of using the resource drops, if the price is highly elastic, this results in overall demand increasing, causing total resource consumption to rise. Governments have typically expected efficiency gains to lower resource consumption, rather than anticipating possible increases due to the Jevons paradox.[1]
I do think there will be some Jevons effect going on with this, but I think it's important to recognize that software development as a resource is different than something like coal. For example, if the average iPhone-only teenager can now suddenly start cranking out apps, that may ultimately increase demand for apps and there may be more code than ever getting "written," but there won't necessarily be a need for your CS-grad software engineer anymore, so we could still be fucked. Why would you pay a high salary for a SWE when your business teams can just generate whatever app they need without having to know anything about how it actually works?
I think the arguments about "AI isn't good enough to replace senior engineers" will hold true for a few years, but not much beyond that. The Jevons paradox will probably hold true for software as a resource, but not for SWEs as a resource. In the coal scenario, imagine that coal gets super cheap to procure because we invent robots that can do it from alpha to omega. Coal demand may go up, but the job for the coal miner is toast, and unless that coal miner has ownership stake, they will be out on their ass.
The coal miner would have to pivot to being someone who knows a lot about coal instead of someone that actually obtained it, they’d become more of a coal-advisor to the person making decisions about what type of or how much coal to get/what’s even possible with the coal they’re getting.
The future I’m seeing with AI is one where software (i.e. as a way to get hardware to do stuff) is basically a non-issue. The example I wanna work on soon is telling Siri I want my iPhone to work as a touchpad for my computer and have the necessary drivers for that to happen be built automatically because that’s a reasonable thing I could expect my hardware to do. That’s the sort of thing that seems pretty achievable by AI in a couple turns that would take a single dev a year or two. And the thing is, I can’t imagine a software dev that doesn’t have some set of skills that are still applicable in this future, either through general CS skills (knowing what’s within reasonable expectations of hardware, being able to effectively describe more specific behavior/choosing the right abstractions etc) or other more nebulous technical knowledge (e.g. what you want to do with hardware in the first place).
Another thing I will mention is that for things like the iPhone example from earlier, there are usually a lot of optimizations or decisions involved that are derived from the user’s experience as a human which the LLM can’t really use synthetically. As another example, if I turned my phone into a second monitor, the LLM might generate code that sends full-resolution images to the phone when the phone’s screen is much lower resolution; there’s no real point for it to optimize that away if it doesn’t know how eyes work and what screens are used for. So at some point it needs to involve a model of a human, at least for examples like these.
> The coal miner would have to pivot to being someone who knows a lot about coal instead of someone that actually obtained it, they’d become more of a coal-advisor to the person making decisions about what type of or how much coal to get/what’s even possible with the coal they’re getting.
I definitely agree that there will be some jobs/roles like that, and it won't be 100% destruction of SWEs (and many other gigs that will be affected), but I can't imagine that more than a small percentage of consultants will be needed. The top 10% of engineers I think will be just fine for the reasons you've said, but at the lower levels it will be a blood bath (and realistically maybe it should be, as there are plenty of SWEs that probably shouldn't be writing code that matters, but that feels like a separate discussion). Your point about other skills/knowledge is good too, though I suspect most white-collar jobs are on the chopping block as well, just maybe shortly behind.
Your future is one that I'm dreaming about too (although I have a hard time believing Apple would allow you to do that, but on Android or some future 3rd option it might be possible). Especially as a Linux user there have been plenty of times I've thought of cool stuff that I'd love to have personally that would take me months of work to build (time I've accepted I'll never have until my kids are all out of the house at least haha). I'm also dreaming of a day when I can just ask the AI to produce more seasons of Star Trek TOS, Have Gun - Will Travel, The Lieutenant, and many other great shows that I'm hungry for more, and have it crank them out. That future would be incredible!
But that feels like the smooth side of the sword, and avoiding a deep cut from the sharp side feels increasingly important. Hopefully it will solve itself but seeing the impacts so far I'm getting worried.
I appreciate the discussion and optimism! There is too much AI doomerism out there and the upsides (like you've mentioned) don't get talked about enough I think.
Computers are not special. They are just a heat engine like everything else. We feed them concentrated energy that they dissipate to do work. They do work on data: we give it data (some of it is called code) and it gives us back data. It's all about the information content, how does that data communicate something and relate to the world?
"Training" is just upfront work. Why on Earth people expect to get from the machine that processes data some novel information that did not exist before?
This whole fantasy hinges on not understanding the sheer amount of data these LLMs are being trained on, and some magical thinking about them producing novel information ex nihilo somehow. I will never understand how intelligent people fall into these patterns of thought.
We can only get from computers what we put into them.
> Why would you pay a high salary for a SWE when your business teams can just generate whatever app they need without having to know anything about how it actually works?
It depends on how good the AI is. The advantage of an SWE is that they have a systems thinking mindset, so they can solve some problems more efficiently. With some apps it won't matter, but with others it will.
One potential positive outcome is that we will be able to solve more and bigger problems, since our capacity for solving problems has been augmented with AI.
> Pop in from what? Our factory jobs?
Oh, you sweet summer child. ;)
You will pop in from the other 9 projects you are currently popping in on, of course! While running 10 agents at once!
And from which exactly am I earning an income to feed myself? Who's buying what I'm making? Where are they getting their money?
We're building a serfdom again.
LOL, what? Take on 10 projects at once, and start making way more money... if you're not an external-locus-of-control moron at least
You've literally been given an excavator when you currently have a shovel, and you're worried that other excavators will dig you out of a job. That is a literal analogy to your POV, here
Hopefully, from sitting by the pool drinking margaritas ... but I doubt we will get to keep our new found freedom.
Never going to happen. More efficiency and automation won’t lead to more free time and money for the masses, it will lead to fewer people employed, and those that are will be working the same hours for the same money but outputting more. Only the rich people will benefit.
In the long term. In the short term, we get to do the same work but faster.
Indeed, why would an employer pay us a high salary to sit by the pool? The benefits will go to the founders/investors and the customers. They'll benefit greatly from the increased output and lower costs, but the middlemen (SWEs) will be cut out. That's a great thing if you're a founder/investor or a customer, but not if you're the middleman. New opportunities may come around, but I don't think that's inevitable. It remains to be seen.
It will not be easier for founders/investors either. If a couple of prompts is all it takes to build your product, your potential customers will write those prompts themselves instead of buying your product.
Hot damn, that's a great point! Although I fully expect the models at some point to say stuff like, "I'm sorry I can't generate a <whatever> because that would violate Apple's/Google's/Whatever IP" and then have them enforce it with the power of government (copyright/patent/regulation/etc). There's also lots of industries where compliance requirements create a moat that might be difficult to get past, though that's probably just a short/medium-term problem.
True. But someone at the top will benefit. Either it’s the companies that can produce more of something that the end user can’t easily replicate themselves for whatever reason, or at least the LLM providers.
What I mean is, it will create value. Just not for the masses. And maybe not for the small businesses. If anything, it will let the big corporations do even more: a few big players doing everything and no little players at all.
Some people prefer to pay for others to handle things and take responsibility.
How about our software engineering jobs, which will now entail managing a team of agents?
> The Phoenix.new environment includes a headless Chrome browser that our agent knows how to drive. Prompt it to add a front-end feature to your application, and it won’t just sketch the code out and make sure it compiles and lints. It’ll pull the app up itself and poke at the UI, simultaneously looking at the page content, JavaScript state, and server-side logs.
Is it possible to get that headless Chrome browser + agent working locally? With something like Cursor?
Playwright has an MCP server which I believe should be able to give you this.
When Roo Code uses Claude, it does this while developing. It renders in the sidebar and you can watch it navigate around. Incredibly slow, but that’s only a matter of time.
Does it work with VSCode GitHub Copilot LLM provider? They have Claude in there
I know it's early days, but here's a must-have wish list for me:
- ability to run locally somehow. I have my own IDE, tools etc. Browser IDEs are definitely not something I use willingly.
- ability to get all code, and deploy it myself, anywhere
---
Edit: forgot to add. I like that every video in the Elixir/Phoenix space is the spiritual successor to the "15-minute Rails blog" from 20 years ago. No marketing bullshit, just people actually using the stuff they build.
You can push and pull code to and from local desktop already: hamburger menu => copy git clone/copy git push.
You could also have it use GitHub and do PRs for codex/devin-style workflows. Running phoenix.new itself locally isn't something we're planning, but opening the runtime for SSH access is high on our list. Then you could do remote SSH access with local VSCode or whatever.
> Running phoenix.new itself locally isn't something we're planning
So no plans to open the source code?
Everyone has to eat.
For sure. I'm just hesitant to recommend sending one's codebase to a server running code I can't inspect. I suppose that's the status quo with LLM's these days, though.
confirm
"15-minute rails blog" changed the game so I definitely resonate with this. My videos are pretty raw, so happy to hear it works for some folks.
run locally or in your private cloud would be amazing. The latter bit would be a great paid option for large enterprises
Include optional default email, auth, analytics, job management (you know… the one everyone uses ::cough:: Oban ::cough::), dev/staging/prod modes (with “deployment” or something akin to CD… I know it’s already in the cloud, but you know what I mean) and some kind of non-ephemeral disk storage, maybe even domain management… and this will slay. Base44 just got bought for $80M for supplying all those, but nothing is as cool as Elixir of course!
These other details that are not “just coding” are always the biggest actual impediments to “showing your work”. Thanks for making this!! Somehow I am only just discovering it (toddler kid robbing my “learning tech by osmosis” time… a phenomenon I believe you are also currently familiar with, lol)
Hi, just to confirm, as I cannot find anything related to security or your use of submitted code for training purposes. Where are your security policies with regards to that?
We don't do any model training, and only use existing open source or hosted models. Code gets sent to those providers in context windows. They all promise not to train on it, so far.
You said it terribly to be honest
Ask some security questions, I'll get you security answers. We're not a model company; we don't "train" anything.
Is there a transparent way to see credit used/remaining/topped up, and do you have any tips for how you can prompt the agent that might offer more effective use of credits?
The LLM chat taps out but I can't find a remaining balance on the fly.io dashboard to gauge how I'm using it. I _can_ see a total value of purchased top ups, but I'm not clear how much credit was included in the subscription.
It's very addictive (because it is awesome!) but I've topped up a couple of times now on a small project. The amount of work I can get out the agent per top-up does seem to be diminishing quite quickly, presumably as the context size increases.
Is there something comparable that works similarly but completely offline with appropriate hardware? Not everywhere has internet or trusts remote execution and data storage.
PS: Why can't I get IEx to have working command-line history and editing? ;-P
Any takeaways on using Fly APIs for provisioning isolated environments? I'm looking into doing something similar to Phoenix.new but for a low-code server-less workflow system.
1 week of work to go from local-only to fly provisioned IDE machines with all the proxying. fly-replay is the unsung hero in this case, that's how we can route the *.phx.run urls to your running dev servers, how we proxy `git push` to phoenix.new to your IDE's git server, and how we frame your app preview within the IDE in a way that works with Safari (cross origin websocket iframes are a no go). We're also doing a bunch of other neat tricks involving object storage, which we'll write about at some point. Feel free to reach out in slack/email if you want to chat more.
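For the curious, fly-replay routing from Elixir looks roughly like this. The `fly-replay` response header is real Fly.io proxy behavior; the subdomain-to-machine lookup below is invented for the sketch and is not how phoenix.new actually resolves apps:

```elixir
defmodule PhxRun.ReplayPlug do
  # Sketch only: replies with an empty response carrying a `fly-replay`
  # header, which tells Fly's proxy to re-run the request on the named
  # Machine. Assumes this plug sits in front of requests that should be
  # routed to a user's dev server.
  import Plug.Conn

  def init(opts), do: opts

  def call(conn, _opts) do
    # e.g. "myapp.phx.run" -> "myapp"
    [subdomain | _] = String.split(conn.host, ".")

    case lookup_machine(subdomain) do
      {:ok, machine_id} ->
        conn
        |> put_resp_header("fly-replay", "instance=#{machine_id}")
        |> send_resp(:ok, "")
        |> halt()

      :error ->
        conn |> send_resp(:not_found, "unknown app") |> halt()
    end
  end

  # Placeholder: a real implementation would consult some registry of
  # running IDE machines.
  defp lookup_machine(_subdomain), do: :error
end
```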
1. What's your approach to accessibility? Do you test accessibility of the phoenix.new UI? Considering that many people effectively use Phoenix to write front-ends, have you conducted any evals on how accessible those frontends come out?
2. How do you handle 3rd party libraries? Can the agent access library docs somehow? Considering that Elixir is less popular than more mainstream languages, and hence has less training data available, this seems like an important problem to solve.
It seems like they're giving you lower-level building blocks here. It's up to the developer to address these things. Instruct the agent to build/test for accessibility, feed it docs via MCP or by other means.
They use the daisyUI component library in Phoenix 1.8+, which should have basic accessibility baked in.
Watched the Tetris demo of this and it was very impressive. I was particularly surprised how well it seems to work with the brand-new scopes, despite the obvious lack of much prior art. How did you get around this, how much work was the prompt, and are you comfortable sharing it?
What is the benefit of this vs. just running your agent of choice in any ole container?
The whole post is about that. Not everything is for everybody, so if it doesn't resonate for you, that's totally OK.
Oh geez so sorry for the dumb question! I read a lot about the benefits of containerization in general for agents, but thought it might be enlightening/instructive to know what this specific project adds to that (other than the special Elixir-tuned prompting).
But either way I hear you, thanks so much for taking the time to set me straight. It seems like either way you have done some visionary things here and you should be content with your good work! This stuff does not work for me for just circumstantial reasons (too poor), but still always very curious about the stuff coming out!
Again, so sorry. Congrats on the release and hope your day is good.
Gotcha! I'll keep reading it I guess until I see what I am missing! Good job again!
I did none of the work! I'm just like Flavor Flav or Bez in this situation. I will relay your congrats to Chris and the team, though. ;)
Bad analogy. Bez was the best singer and most important member of that group.
Huh ok! Well you sure are quite passionate. Thanks either way I guess.
This looks amazing! I keep loving Phoenix more the more I use it.
I was curious what the pricing for this is? Is it normal fly pricing for an instance, and is there any AI cost or environment cost?
And can it do multiple projects on different domains?
It’s $20 per month if you click through, and I haven’t tried it but almost certainly the normal hosting costs will be added on top.
I've tried it, the $20 of included credits lasted me about 45 minutes
Thanks, apparently didn't click through enough
Just tried it out, but it's unclear what the different buttons at the bottom of the chat history do. The rightmost one (cloud with an upwards arrow) seems to do the same as the first?
I'm also having trouble with getting it to read PDFs from URLs. I got this error:
web https://example.com/file.pdf Error: page.goto: net::ERR_ABORTED at https://example.com/file.pdf Call log: - navigating to "https://example.com/file.odf", waiting until "load" at main (/usr/local/lib/web2md/web2md.js:313:18) { name: 'Error' }
/workspace#
Do you have a package for calling LLM services we can use? This service is neat, but I don't need another LLM IDE built in Elixir but I COULD really use a way to call LLMs from Elixir.
Req.post to /chat/completions, streaming the tokens through a parser and doing regular elixir messages. It's really not more complicated than that :)
even less complicated, just set stream: false in your json :)
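A minimal sketch of the non-streaming version suggested above, using Req. The endpoint follows the common OpenAI-compatible chat completions shape; the env var name and model are placeholders:

```elixir
defmodule LLM do
  # Sketch: one-shot chat completion with `stream: false`, as suggested
  # above. Assumes an OpenAI-compatible API; swap the base URL and model
  # for whichever provider you actually use.
  def chat!(prompt) do
    Req.post!(
      "https://api.openai.com/v1/chat/completions",
      auth: {:bearer, System.fetch_env!("OPENAI_API_KEY")},
      json: %{
        model: "gpt-4o-mini",
        stream: false,
        messages: [%{role: "user", content: prompt}]
      }
    )
    |> Map.fetch!(:body)
    |> get_in(["choices", Access.at(0), "message", "content"])
  end
end
```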
Thanks for everything you do Chris! Keep crushing it.
How tightly coupled to Fly.io are generated apps?
Everything starts as a stock phx.new app, which uses SQLite by default. Nothing is specific to Fly. You should be able to copy the git clone URL, paste it, then cd && mix deps.get && mix phx.server locally and the app will just work.
If you're willing to share, is maintaining that modularization the plan going forward? I'm pretty happy to use and pay for this and deploy it to fly, but only as long as I'm not "locked in."
Does this mean I can build and deploy a SQLite-based app on Fly.io with this approach, without using Postgres? If so, how does the pricing work for the persistent storage SQLite needs? Thanks
You would need to add a Fly volume ($0.15/GB per month of provisioned capacity); also check out https://fly.io/blog/litestream-revamped/
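As a rough sketch of what that looks like (the volume name, mount path, and `DATABASE_PATH` variable are illustrative assumptions): create the volume with `fly volumes create data --size 1`, then mount it in fly.toml so the SQLite file lives on the persistent disk:

```toml
# Example fly.toml fragment: mount the volume created with
# `fly volumes create data --size 1` at /data, and point the
# app's database path (env var name is an assumption) there.
[mounts]
  source = "data"
  destination = "/data"

[env]
  DATABASE_PATH = "/data/app.db"
```

The Litestream post linked above covers replicating that SQLite file to object storage for backup.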
What LLM(s) is the agent using? Are you fine-tuning the model? Is the agent/model development a proprietary effort?
Currently Claude 4 Sonnet as the main driver, with a combination of smaller models for certain scenarios
I'm assuming you're using FLAME?
How do you protect the host Elixir app from the agent shell, runtime, etc.?
Not using FLAME in this case. The agent runs entirely separately from your apps/IDE/compute. It communicates with and drives your runtime over Phoenix channels
Oh interesting. So how do messages come from the container? Is there a host elixir app that is running the agent env? How does that work?
Yes, an Elixir app deployed across the planet as a single Elixir cluster. We spawn the agents (GenServers), globally register them, and then the end-user LiveView chat communicates with the agent via regular Elixir messages, while the IDE is a Phoenix channels client that communicates with and is driven by the agent.
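A hedged sketch of that shape (this is not the actual Phoenix.new source, just the pattern described: one GenServer per agent, registered globally so any node's LiveView can reach it by name):

```elixir
# Sketch of a per-project agent process, globally registered
# across the cluster. Module and message names are assumptions.
defmodule AgentSession do
  use GenServer

  # {:global, name} registration resolves on any connected node
  # in the Elixir cluster, so the LiveView can live anywhere.
  def start_link(project_id) do
    GenServer.start_link(__MODULE__, project_id,
      name: {:global, {:agent, project_id}})
  end

  def prompt(project_id, msg) do
    GenServer.call({:global, {:agent, project_id}}, {:prompt, msg})
  end

  @impl true
  def init(project_id), do: {:ok, %{project: project_id}}

  @impl true
  def handle_call({:prompt, msg}, _from, state) do
    # A real agent would run its loop here and push results to
    # the IDE over a Phoenix channel; we just echo for the sketch.
    {:reply, {:ok, "echo: " <> msg}, state}
  end
end
```

With this layout the chat LiveView never needs to know which node the agent lives on; `GenServer.call/2` on the global name routes the message across the cluster.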
how are they isolating ai agent state from app-level processes without breaking BEAM's supervision guarantees?
They run on separate machines, and your agent just controls the remote runtime when it needs to interact with the system, write, read, etc.
appreciate the clarity, that helps.
quick followup: if the agent's running on a separate machine and interacting remotely, how are failure modes handled across the boundary? like if the agent crashes mid-operation or sends a malformed command, does the remote runtime treat it as an external actor, or is there a strategy linking both ends for fault recovery or rollback? just trying to understand where the fault tolerance guarantees begin and end across that split.
Token auth and re-handshake. The agent is respawned if it's no longer alive, and the project index is resynced
The AI agent runs inside the same remote runtime as the app. Does it share the BEAM VM or run as a port process?
The agent runs outside your IDE instance and controls/communicates with it over Phoenix channels
This is very cool. I think the primary innovation here is twofold:
1. Remote agent: it's a containerized environment where the agent can run loose and do whatever. It doesn't need approval for user tasks because it's in an isolated environment (though it could still accidentally do destructive actions, like editing git history). I think this alone is a separate service that needs to be productionized. When I run Claude Code in my terminal, I'd like it to automatically spin up the agent in an isolated environment (locally or remotely) and have it go wild. Easy to run things in parallel.
2. Deep integration with Fly. Everyone will be trying to embed AI deep into their product. Instead of having to talk to ChatGPT and copy-paste output, I should be able to directly interact with whatever product I'm using, and with my data in the product, using tools. In this case, it's deploying my web app.
Look into Kasm Workspaces: a great way to spin up remote Docker-based Linux desktops, and it works great as an AI dev environment you can use wherever you happen to be. There is homedir persistence, and package persistence can be achieved via some extra configuration that allows Brew homedir-based package persistence.
https://hub.docker.com/r/linuxserver/kasm
https://www.reddit.com/r/kasmweb/comments/1l7k2o8/workaround...
I have recently been working with Google Jules and it has a similar approach. It spins up VMs and goes through tasks given.
It does not handle any infrastructure, so no hosting. It allows me to set multiple small tasks, come back and check, confirm and move forward to see a new branch on GitHub. I open a PR, do my checks (locally if I need to) and merge.
>> Remote agent - it's a containerized environment where the agent can run loose and do whatever
How is this innovation?
Many people have not experienced the async agent workflow yet, and to be fair, the major providers didn't have offerings for it until a month or two ago.
It’s in fact one of my predictors for whether someone is going to be enthusiastic about agents or not.
And you wouldn’t think containerization would be a big leap but this stuff is so new and moving so fast that combining them with existing tech can surprise people.
It's less innovative and more trendy. A lot of the Fly integration can be achieved by simply asking Claude Code to look up the docs for the fly CLI tool.