
xscott

Karma: 1378

Created: 2019-08-05

Recent Activity

  • Thank you for this. I already had Antigravity but was thinking of cancelling it because Gemini frustrates me. Using it with Claude was very impressive, though I burned through my token budget in about 5 hours.

  • That is interesting, and thank you.

    I've had a list of pet projects that I've been adding to for years. For those, I just give the broad strokes and tell it to do its best. Codex has done a really good job on most of them, sometimes in one shot, and my list of experiments is emptying. There was only one notable exception, where it had no idea what I was after.

    I also have my larger project, which I hope to actually keep and use. Same thing there, though: it's really hard to explain what's going on, and it acts on bad assumptions.

    So if Claude is better at that, then having two tools makes a lot of sense to me.

  • Can you expand on that? I've been wanting to try Claude for a while, but their payment processing wouldn't take any of my credit cards (they work everywhere else, so it's not the cards). I've heard I can work around this by installing their mobile app or something, but that meant extra hurdles, so I didn't try very hard.

    And I've been absolutely amazed with Codex. I started using it with version ChatGPT 5.3-Codex, and it was so much better than online ChatGPT 5.2, even when sticking to single-page apps, which both can do. I don't have any way to measure the "smarts" of the new 5.4, but it seems similar.

    Anyways, I'll try to get Claude running if it's better in some significant way. I'm happy enough with the Codex GUI on macOS, but that's just one of several things that could be different between them.

  • I think it would be cool if a language specifically for LLMs came about. It should have something like required preconditions and postconditions so that a deterministic compiler can verify the assumptions the LLM is claiming. Something like a theorem prover, but targeted specifically at programming and efficient compilation/runtime. And it doesn't need all the niceties human programmers tend to prefer (implicit conversions come to mind).

  • Of course I can't be certain, but I think the "mixture of experts" design plays into it too. Metaphorically, there's a mid-level manager who looks at your prompt and tries to decide which experts it should be sent to. If he thinks you won't notice, he saves money by sending it to the undergraduate intern.

    Just a theory.
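
For the comment above about a language with required preconditions and postconditions, here is a rough sketch of what declaring such claims might look like. It is only an illustration: the `contract` decorator, `ContractError`, and both example functions are hypothetical names made up here, and the checks run at call time in Python, whereas the comment imagines a compiler proving them before the program ever runs.

```python
from functools import wraps


class ContractError(AssertionError):
    """Raised when a declared precondition or postcondition does not hold."""


def contract(pre, post):
    """Attach a precondition and a postcondition to a function.

    Hypothetical helper: `pre` receives the arguments, `post` receives the
    result followed by the arguments. Both return True when the claim holds.
    """
    def decorate(fn):
        @wraps(fn)
        def checked(*args):
            if not pre(*args):
                raise ContractError(f"precondition failed for {fn.__name__}{args}")
            result = fn(*args)
            if not post(result, *args):
                raise ContractError(f"postcondition failed for {fn.__name__}{args}")
            return result
        return checked
    return decorate


# The claim being made: given a non-empty list, the result is an element
# of the list and no element is larger than it.
@contract(
    pre=lambda xs: len(xs) > 0,
    post=lambda r, xs: r in xs and all(r >= x for x in xs),
)
def largest(xs):
    best = xs[0]
    for x in xs[1:]:
        if x > best:
            best = x
    return best


# Same claim, wrong body: the check catches it on inputs where it matters.
@contract(
    pre=lambda xs: len(xs) > 0,
    post=lambda r, xs: r in xs and all(r >= x for x in xs),
)
def largest_buggy(xs):
    return xs[0]
```

Calling `largest([3, 9, 2])` returns 9, while `largest_buggy([3, 9, 2])` and `largest([])` both raise `ContractError`. Note the gap from the original idea: `largest_buggy([9, 1])` passes, because a runtime check only sees the inputs it is given, which is exactly why the comment asks for static verification instead.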

HackerNews