adwn

Karma: 5833

Created: 2013-11-05

Recent Activity

  • > EXO Labs showed otherwise by getting a 300K-parameter LLM to run on a Pentium II with only 128 MB of RAM at about 50 tokens per second

    50 tokens/s is completely useless if the tokens themselves are useless. Just look at the "story" generated by the model presented in your link: each individual sentence is somewhat grammatically correct, but the sentences have next to nothing to do with each other and make absolutely no sense. Take this, for example:

    "I lost my broken broke in my cold rock. It is okay, you can't."

    Good luck tuning this for turn-based conversations, let alone for solving any practical task. This model is so restricted that you couldn't even benchmark its performance, because it wouldn't be able to follow the simplest of instructions.

  • > a small LLM, say, with 200–300K weights

    A "small Large Language Model", you say? So a "Language Model"? ;-)

    > Such an LLM could have handled grammar and code autocompletion, basic linting, or documentation queries and summarization.

    No, not even close. You're off by 3 orders of magnitude if you want even the most basic text understanding, 4 OOM if you want anything slightly more complex (like code autocompletion), and 5–6 OOM for good speech recognition and generation. Hardware was very much a limiting factor.
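
    To put rough numbers on those orders of magnitude, here is a minimal sketch that scales a parameter count by powers of ten. It assumes the upper end of the quoted range (300K weights) as the baseline; the helper name `scale_by_oom` is made up for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Scale a parameter count up by `oom` orders of magnitude (powers of 10).
 * Hypothetical helper, used only to illustrate the arithmetic. */
uint64_t scale_by_oom(uint64_t params, unsigned oom) {
    while (oom-- > 0) {
        assert(params <= UINT64_MAX / 10);  /* guard against overflow */
        params *= 10;
    }
    return params;
}
```

    With a 300K baseline, 3 OOM gives 300 million parameters, 4 OOM gives 3 billion, and 5–6 OOM gives 30–300 billion.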

  • > Good. My text document viewer only needs to render text in straight lines left to right.

    Yes, inconceivable that somebody might ever want to render text in anything but a "text document viewer"!

  • When `free()` is called, the allocator internally marks that specific memory area as unused, but it doesn't necessarily return that area back to the OS, for two main reasons:

    1. `malloc()` is usually called with sizes smaller than the granularity at which the allocator requests memory from the OS, which is at least one page (4096 bytes on x86/x86-64) and often much larger. After a `free()`, the freed memory can't be returned to the OS because it's only a small chunk within a larger OS allocation. Only after all memory within a page has been `free()`d may the allocator return that page to the OS, and even then it doesn't have to.

    2. After a `free()`, the allocator wants to hang on to that memory area because the next `malloc()` is likely to follow soon and can then be served without going back to the OS.

    This is a very simplified overview, and different allocators have different strategies for placing new `malloc()` requests in various areas and for returning areas to the OS (or not).
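
    The two points above can be sketched with a toy allocator. This is not how any real `malloc()` implementation works; it assumes a single fixed chunk size of 64 bytes, a tiny fixed page table, and uses the C library's own `malloc()`/`free()` as a stand-in for requesting pages from the OS (e.g. via `mmap()`). All `toy_*` names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

#define TOY_PAGE_SIZE 4096                       /* page size on x86/x86-64 */
#define TOY_CHUNK_SIZE 64                        /* assumption: one fixed chunk size */
#define TOY_CHUNKS (TOY_PAGE_SIZE / TOY_CHUNK_SIZE)
#define TOY_MAX_PAGES 16

typedef struct {
    unsigned char *base;             /* page obtained from the "OS", or NULL */
    int live;                        /* chunks currently handed out */
    unsigned char used[TOY_CHUNKS];  /* 1 = chunk is allocated */
} ToyPage;

static ToyPage toy_pages[TOY_MAX_PAGES];
static int toy_os_requests;          /* pages requested from the "OS" */
static int toy_os_returns;           /* pages given back to the "OS" */
static int toy_release_empty = 1;    /* policy: return fully free pages? */

void *toy_malloc(void) {
    /* First, reuse a free chunk in a page we already own. */
    for (int p = 0; p < TOY_MAX_PAGES; p++) {
        ToyPage *pg = &toy_pages[p];
        if (pg->base == NULL || pg->live == TOY_CHUNKS) continue;
        for (int c = 0; c < TOY_CHUNKS; c++) {
            if (!pg->used[c]) {
                pg->used[c] = 1;
                pg->live++;
                return pg->base + (size_t)c * TOY_CHUNK_SIZE;
            }
        }
    }
    /* Otherwise, ask the "OS" for a whole new page. */
    for (int p = 0; p < TOY_MAX_PAGES; p++) {
        ToyPage *pg = &toy_pages[p];
        if (pg->base != NULL) continue;
        pg->base = malloc(TOY_PAGE_SIZE);  /* stand-in for mmap()/sbrk() */
        if (pg->base == NULL) return NULL;
        toy_os_requests++;
        pg->used[0] = 1;
        pg->live = 1;
        return pg->base;
    }
    return NULL;  /* page table full */
}

void toy_free(void *ptr) {
    uintptr_t addr = (uintptr_t)ptr;
    for (int p = 0; p < TOY_MAX_PAGES; p++) {
        ToyPage *pg = &toy_pages[p];
        uintptr_t base = (uintptr_t)pg->base;
        if (pg->base == NULL || addr < base || addr >= base + TOY_PAGE_SIZE) continue;
        size_t c = (addr - base) / TOY_CHUNK_SIZE;
        assert(pg->used[c]);  /* catch double frees */
        pg->used[c] = 0;      /* mark the chunk as unused ... */
        pg->live--;
        /* ... but only a completely free page can go back to the "OS". */
        if (pg->live == 0 && toy_release_empty) {
            free(pg->base);
            pg->base = NULL;
            toy_os_returns++;
        }
        return;
    }
}
```

    For example, after two `toy_malloc()` calls followed by a `toy_free()` of the first chunk, `toy_os_returns` is still 0 and the next `toy_malloc()` hands out the same chunk again. Only freeing the last live chunk in the page releases it, and setting `toy_release_empty` to 0 models an allocator that keeps even empty pages around.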

  • I can't parse either of your sentences. Maybe you could introduce some intermediate variables, or use parentheses to give them structure?
