...

soleveloper

Karma: 38
Created: 2025-08-06

Recent Activity

  • There are incredible authors who happen to be dyslexic, and brilliant mathematicians who struggle with basic arithmetic. We don't dismiss their core work just because a minor lemma was miscalculated or a word was misspelled. The same logic applies here: if we dismiss the semantic capabilities of these models based entirely on their token-level spelling flaws, we miss out on their actual utility.

  • Treat LLMs as dyslexic when it comes to spelling. Assess their strengths and weaknesses accordingly.

  • Will that protect you from the agent changing the code to bypass those safety mechanisms, since the human is "too slow to respond" or in case of "agent decided emergency"?

  • Yes, and even holding a couple of cartridges for different scenarios, e.g. image generation, coding, TTS/STT, etc.

  • There are so many use cases for small, super-fast models that already fit in that size capacity:

    * Many top-quality TTS and STT models

    * Image recognition, object tracking

    * Speculative decoding, attached to a much bigger model (big/small architecture?)

    * Agentic loop trying 20 different approaches/algorithms, then picking the best one

    * Edited to add: combine 50 such small models to create a SOTA super-fast model
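The "try 20 approaches, pick the best" idea is a plain best-of-N loop. A minimal sketch, assuming hypothetical `generate` and `score` hooks (placeholders for a small model's sampler and an evaluator, not any real framework's API):

```python
import random

def best_of_n(task, generate, score, n=20, seed=0):
    """Run the generator n times and keep the highest-scoring attempt."""
    rng = random.Random(seed)  # seeded for reproducibility
    best, best_score = None, float("-inf")
    for _ in range(n):
        candidate = generate(task, rng)
        s = score(task, candidate)
        if s > best_score:
            best, best_score = candidate, s
    return best, best_score

# Toy stand-ins: "approaches" are random guesses at sqrt(2),
# scored by how close their square lands to 2.
def generate(task, rng):
    return rng.uniform(1.0, 2.0)

def score(task, x):
    return -abs(x * x - 2.0)

best, s = best_of_n("sqrt2", generate, score, n=20)
```

With a cheap, fast model, the 20 generation calls are the point: sampling breadth substitutes for single-shot quality, and only the scorer needs to be reliable.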

HackerNews