zepearl

Karma: 1049

Created: 2018-04-08

Recent Activity

  • Using X (at least in this context?) is weird.

  • I downloaded Ollama ( https://github.com/ollama/ollama/releases ) and experimented with a few Qwen models ( https://huggingface.co/Qwen/collections ).

    Performance with an RTX 5070 (12GiB VRAM), a Ryzen 7 9700X (8 cores) and 32GiB of DDR5 at 6000 MT/s (2 sticks):

      - "qwen2.5:7b": ~128 tokens/second (this model fits 100% in the VRAM).
      - "qwen2.5:32b": ~4.6 tokens/second.
      - "qwen3:30b-a3b": ~42 tokens/second (this is a MoE model with multiple specialized "experts": about 30B parameters in total, of which only about 3B are active per token; it uses all 12GiB of VRAM plus 9GiB of system RAM, yet GPU usage during tests is only ~25%).
      - "qwen3.5:35b-a3b": ~17 tokens/second, but it's highly unstable and crashes -> currently not usable for me.
    
    So currently my sweet spot is "qwen3:30b-a3b" - even if the model doesn't completely fit on the GPU it's still fast enough. "qwen3.5" was disappointing so far, but maybe things will change in the future (maybe Ollama needs some special optimizations for the 3.5-series?).
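
    A rough way to reproduce tokens/second figures like the ones above, as a minimal sketch and not necessarily how these numbers were measured. It assumes Ollama's local HTTP API at its default address (http://localhost:11434) and the "eval_count" / "eval_duration" (nanoseconds) fields of a non-streamed /api/generate response; the helper names and the prompt are made up for illustration.

```python
import json
import urllib.request

# Assumption: Ollama is running locally on its default port.
OLLAMA_URL = "http://localhost:11434/api/generate"


def tokens_per_second(eval_count: int, eval_duration_ns: int) -> float:
    """Convert generated-token count and generation time (ns) to tokens/second."""
    if eval_duration_ns <= 0:
        raise ValueError("eval_duration_ns must be positive")
    return eval_count / (eval_duration_ns / 1e9)


def benchmark(model: str, prompt: str = "Explain what a B-tree is.") -> float:
    """Run one non-streamed generation and return its generation speed."""
    payload = json.dumps({"model": model, "prompt": prompt, "stream": False}).encode()
    request = urllib.request.Request(
        OLLAMA_URL,
        data=payload,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request) as response:
        stats = json.load(response)
    # eval_count = tokens generated, eval_duration = time spent generating (ns)
    return tokens_per_second(stats["eval_count"], stats["eval_duration"])


if __name__ == "__main__":
    # Model tags taken from the list above; they must already be pulled.
    for model in ("qwen2.5:7b", "qwen3:30b-a3b"):
        print(f"{model}: ~{benchmark(model):.1f} tokens/second")
```

    A single run is noisy (the first call also includes model load time, which is reported separately and not counted here), so averaging a few runs per model is probably wise. "ollama run <model> --verbose" prints a comparable eval rate without any scripting.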

    I would therefore deduce that the most important thing is the amount of VRAM, and that performance would be similar even with an older GPU (e.g. an RTX 3060, which also has 12GiB of VRAM)?

    Performance without a GPU, tested on a Ryzen 9 5950X (16 cores) with 128GiB of DDR4 at 3200 MT/s:

      - "qwen2.5:7b": ~9 tokens/second
      - "qwen3:32b": ~2 tokens/second
      - "qwen3:30b-a3b": ~16 tokens/second
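
    One way to sanity-check CPU-only numbers like these is the common rule of thumb that token generation is limited by memory bandwidth: every generated token has to read (roughly) all active weights once. The sketch below is a back-of-the-envelope estimate only. It assumes dual-channel memory with a 64-bit (8-byte) bus per channel, and the weight size passed in is a placeholder that should be replaced with the real size of the quantized model (e.g. as shown by "ollama list"). The function names are hypothetical.

```python
def peak_bandwidth_gb_s(mega_transfers_per_s: float,
                        channels: int = 2,
                        bytes_per_transfer: int = 8) -> float:
    """Theoretical peak memory bandwidth in GB/s.

    Assumption: each channel moves 8 bytes (64 bits) per transfer.
    """
    return mega_transfers_per_s * channels * bytes_per_transfer / 1000


def max_tokens_per_second(bandwidth_gb_s: float, active_weights_gb: float) -> float:
    """Upper bound on generation speed if every token reads all active weights once."""
    if active_weights_gb <= 0:
        raise ValueError("active_weights_gb must be positive")
    return bandwidth_gb_s / active_weights_gb


if __name__ == "__main__":
    ddr4 = peak_bandwidth_gb_s(3200)  # DDR4 at 3200 MT/s, dual channel
    ddr5 = peak_bandwidth_gb_s(6000)  # DDR5 at 6000 MT/s, dual channel
    # Placeholder: ~4.7 GB of weights for a 4-bit quantized 7B model (assumed figure).
    assumed_weights_gb = 4.7
    print(f"DDR4 ceiling: ~{max_tokens_per_second(ddr4, assumed_weights_gb):.0f} tokens/second")
    print(f"DDR5 ceiling: ~{max_tokens_per_second(ddr5, assumed_weights_gb):.0f} tokens/second")
```

    Under these assumptions the DDR4 machine tops out at roughly 11 tokens/second for a 7B model, which is in the same range as the ~9 tokens/second measured above. The same reasoning would also explain why the MoE model is faster on CPU than the smaller dense one: only about 3B parameters have to be read per token.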

  • 79 points | 33 comments | arstechnica.com

    A minimum wage gig in the 1990s turns into pretty much the Best Job Ever.

  • What about pre-December 2022? I cannot imagine that just a handful were imported.

  • > The main reason for this is lack of competition for DB in Germany

    That cannot be it: there is no competition in Switzerland either, yet things run pretty smoothly. In Germany's case I'd rather say it's a lack of oversight, of controls, and of being "konsequent" (following through). I think that nobody at any level of Germany's DB gives a *hit about its problems.

HackerNews