hey, thanks for sharing. I had to go to the Twitter feed to find the GitHub link:
ah! This must have been downloaded from somewhere other than Ollama? So sorry about this.
To make it easier to optimize for specific quantizations down the road, we have been trying to limit the quantizations we publish to the ones that fit the majority of users.
In the case of mistral-small3.1, Ollama supports ~4-bit (q4_K_M), ~8-bit (q8_0), and fp16:
https://ollama.com/library/mistral-small3.1/tags
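In case it helps, here's a quick sketch of pulling a specific quantization by tag (the exact tag names below follow the pattern on the tags page, but please double-check there, they're illustrative):

```
# default pull gets the ~4-bit (q4_K_M) build
ollama pull mistral-small3.1

# pull a specific quantization by tag (verify exact tag names on the tags page)
ollama pull mistral-small3.1:24b-instruct-2503-q8_0

# inspect a local model; recent Ollama versions report the quantization level here
ollama show mistral-small3.1
```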
I'm hopeful that in the future, more and more model providers will help optimize for specific quantizations: 4-bit (e.g. NVFP4, MXFP4), 8-bit, and a 'full' model.