...

mchiang

Karma: 696
Created: 2013-02-24

Recent Activity

  • hey, thanks for sharing. I had to go to the Twitter feed to find the GitHub link:

    https://github.com/21st-dev/1code

  • this one is exciting. It'll enable and accelerate a lot of devices on Ollama - especially AMD GPUs not fully supported by ROCm, Intel GPUs, and iGPUs from different hardware vendors.

  • I am super hopeful! Hardware is improving, inference costs will continue to decrease, and models will only improve...

  • Qwen3-coder:30b is in the blog post. This is one that most users will be able to run locally (a minimal sketch of running it follows after these comments).

    We are in this together! Hoping for more models to come from the labs, in varying sizes that will fit on devices.

  • ah! This must be downloaded from elsewhere and not from Ollama? So sorry about this.

    To help future optimizations for given quantizations, we have been trying to limit the quantizations we publish to the ones that fit the majority of users.

    In the case of mistral-small3.1, Ollama supports ~4-bit (q4_K_M), ~8-bit (q8_0), and fp16.

    https://ollama.com/library/mistral-small3.1/tags

    I'm hopeful that in the future, more and more model providers will help optimize for given quantizations - 4-bit (e.g. NVFP4, MXFP4), 8-bit, and a 'full' model (a sketch of pulling a specific quantization tag follows below).
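
Below is a minimal sketch of running qwen3-coder:30b locally with the official ollama Python client (pip install ollama); it assumes a local Ollama server is running, and the prompt is made up for illustration:

    import ollama

    # Download the model if it isn't already present locally.
    ollama.pull('qwen3-coder:30b')

    # Send one chat message to the locally running Ollama server.
    response = ollama.chat(
        model='qwen3-coder:30b',
        messages=[{'role': 'user', 'content': 'Write a function that reverses a linked list.'}],
    )
    print(response['message']['content'])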
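
And a sketch of pulling a specific mistral-small3.1 quantization with the same client; the exact tag strings should be taken from the tags page linked above, and the ones below are assumptions:

    import ollama

    # Each quantization is published as its own tag; pull whichever fits your
    # hardware. These tag names are assumptions - confirm them on the tags page.
    ollama.pull('mistral-small3.1:24b-instruct-2503-q4_K_M')   # ~4-bit
    # ollama.pull('mistral-small3.1:24b-instruct-2503-q8_0')   # ~8-bit
    # ollama.pull('mistral-small3.1:24b-instruct-2503-fp16')   # full precision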
