tyfon

Karma: 3966

Created: 2014-09-22

Recent Activity

  • It would be really nice to have an option to not do this since a ton of companies deny VMs in their group policies.

  • I didn't really understand the performance table until I saw the top ones were 8B models.

    But 5 seconds / token is quite slow, yeah. I guess this is for low-RAM machines? I'm pretty sure my 5950X with 128 GB of RAM can run this faster on the CPU, with some layers / prefill on the 3060 GPU I have.

    I also see that they claim the process is compute-bound at 2 seconds / token, but that doesn't seem correct with a 3090?

  • Commented: "Gemini 3.1 Pro"

    I was using Gemini Antigravity in opencode a few weeks ago, before they started banning everyone for that, and I got into the habit of writing "do x, then wait for instructions".

    That helped quite a bit, but it would still go off on its own from time to time.

HackerNews