
bethekind

karma: 216
created: 2023-04-04

Recent Activity

  • Jane Street had a cool video about how you can address a lack of training data for a programming language using LLM patching. The video is called "Arjun Guha: How Language Models Model Programming Languages & How Programmers Model Language Models".

    The big takeaway is that you can "patch" LLMs and steer them toward correct answers in less-trained programming languages, allowing for superior performance. Might work here. Not a clue how to implement it, but stuff like llm-to-doc and the like makes me hopeful.

  • Model interpretability is going to be the final frontier of software. You used to need to debug the code. Now you'll need to debug the AI.

  • This is draconian.

    > Our investigation specifically confirmed that the use of your credentials within the third-party tool “open claw” for testing purposes constitutes a violation of the Google Terms of Service [1]. This is due to the use of Antigravity servers to power a non-Antigravity product. I must be transparent and inform you that, in accordance with Google’s policy, this situation falls under a zero tolerance policy, and we are unable to reverse the suspension. I am truly sorry to share this difficult news with you.

  • Commented: "DeepSeek OCR"

    Did we read the same graph? DeepSeek Gundam at 200 dpi appeared to get similar performance to dots-ocr, but with fewer tokens needed. The x-axis is inverted, descending with distance from the origin.
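The "patching"/steering idea from the first bullet can be sketched in a toy form. One common approach (not necessarily the exact method in the talk) is activation steering: compute a direction as the difference of mean hidden states between "good" and "bad" prompt sets, then add a scaled copy of that direction to a model's hidden state at inference time. Everything below is illustrative, with random arrays standing in for real transformer activations:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy stand-ins for hidden states captured at one transformer layer.
# In a real setup these would come from forward hooks on the model.
hidden_dim = 8
acts_good = rng.normal(loc=1.0, size=(100, hidden_dim))   # prompts using correct idioms
acts_bad = rng.normal(loc=-1.0, size=(100, hidden_dim))   # prompts with buggy idioms

# Steering vector: difference of the mean activations of the two sets.
steer = acts_good.mean(axis=0) - acts_bad.mean(axis=0)

def patch(activation: np.ndarray, alpha: float = 1.0) -> np.ndarray:
    """Nudge a single hidden state toward the 'correct' direction."""
    return activation + alpha * steer

h = rng.normal(loc=-1.0, size=hidden_dim)  # activation from a weak-language prompt
h_patched = patch(h, alpha=0.5)

# The patched state is more aligned with the 'good' mean direction.
good_mean = acts_good.mean(axis=0)
print(float(np.dot(h_patched, good_mean)) > float(np.dot(h, good_mean)))
```

The appeal is that this needs no retraining: the patch is applied at inference, so a model that underperforms on a rare language could in principle be nudged per-request.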

HackerNews