The Automated LLM Speedrunning Benchmark measures how well LLM agents can reproduce previous innovations and discover new ones in language modeling. - facebookresearch/llm-speedrunner
The interface also allow to comment, post and interact with the original HN platform. Credentials are stored locally and are never sent to any server, you can check the source code here: https://github.com/GabrielePicco/hacker-news-rich.