Ask HN: What happened to self-hosted models?

2026-01-10 13:19 · 312 points

Hi HN, sorry for using a burner account.

It seems to me that up until the beginning of the last year, we saw a couple of new "open" model release announcements almost every week. They'd set a new state of the art for what an enthusiast could run on their laptop or home server.

Meta, DeepSeek, Mistral, Qwen, and even Google were publishing new models left and right. There were new formats, quantizations, inference engines, etc., and most importantly, a lot of discourse and excitement around them.

Quietly and suddenly, this changed. Since the release of gpt-oss (August 2025), the discourse has been heavily dominated by hosted models. I don't think I've seen any mention of Ollama in any discussion that reached HN's front page in the last 6 months.

What gives? Is this a proxy signal that we've hit a barrier in LLM efficiency?


Comments

  • By lioeters 2026-01-11 13:29

    A recent local model I tried is Ministral 3 from a month ago. https://mistral.ai/news/mistral-3

        Vision: Enables the model to analyze images and provide insights based on visual content, in addition to text.
        Multilingual: Supports dozens of languages, including English, French, Spanish, German, Italian, Portuguese, Dutch, Chinese, Japanese, Korean, Arabic.
        ...
        Agentic: Offers best-in-class agentic capabilities with native function calling and JSON outputting.
        Edge-Optimized: Delivers best-in-class performance at a small scale, deployable anywhere.
        Apache 2.0 License: Open-source license allowing usage and modification for both commercial and non-commercial purposes.
        Large Context Window: Supports a 256k context window.

  • By potsandpans 2026-01-10 17:48

    They're still going. I just bought a 5090 for myself this Christmas to do more interesting things.

    I mostly use them for game assets.

    TRELLIS.2 is very cool. I've managed to put together an SDXL -> TRELLIS -> UniRig pipeline to generate 3D characters with Mixamo skeletons, and it's working pretty well.

    On the LLM front, DeepSeek and Qwen are still cranking away. Qwen3 A22B Instruct, IMHO, does a better job than Gemini in some cases with OCR and translation of handwritten documents.
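    For reference, a minimal sketch of what that kind of local OCR call can look like, assuming you serve the model behind an OpenAI-compatible endpoint (e.g. Ollama or llama.cpp server on localhost:11434); the model tag and prompt here are placeholders, not anything specific to my setup:

    ```python
    import base64
    import json

    def build_ocr_request(image_bytes: bytes,
                          base_url: str = "http://localhost:11434/v1",
                          model: str = "qwen3-vl"):
        """Build the URL and JSON body for an OpenAI-compatible
        /chat/completions call that asks a local vision model to
        transcribe a handwritten page. base_url and model are
        placeholders for whatever your local server exposes."""
        image_b64 = base64.b64encode(image_bytes).decode("ascii")
        body = {
            "model": model,
            "messages": [{
                "role": "user",
                "content": [
                    {"type": "text",
                     "text": "Transcribe this handwritten page, then translate it to English."},
                    {"type": "image_url",
                     "image_url": {"url": f"data:image/png;base64,{image_b64}"}},
                ],
            }],
        }
        return f"{base_url}/chat/completions", json.dumps(body)

    # Stand-in bytes; POST the body with any HTTP client against a running server.
    url, body = build_ocr_request(b"\x89PNG...")
    print(url)  # http://localhost:11434/v1/chat/completions
    ```

    The nice part is that nothing in the client changes when you swap the model out; you only repoint base_url and the model tag.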

    The problem with these frontier open-weight models is that running them locally is not exactly tenable. You either have to rent a cloud GPU instance or go through a provider.

    - https://github.com/microsoft/TRELLIS.2
    - https://github.com/VAST-AI-Research/UniRig

  • By al_borland 2026-01-10 13:24 · 1 reply

    My wildly uneducated guess is that they are getting to the point where they need to figure out how to profit off all this investment, and releasing self-hosted open-source models isn’t going to help them do that.

    • By curiousaboutml 2026-01-10 13:26

      Possibly, but it's not just the release of new models. It seems the community itself has lost interest in self-hosted models.
