French, researcher, engineer, consulting mentalist.
Specialized in artificial intelligence, high-performance computing, and floating point arithmetic.
Nestor Demeure (https://nestordemeure.github.io/about/)
On alternative ways to measure LLM intelligence, we had good success with this: https://arxiv.org/abs/2509.23510
In short: start with a dataset of question and answer pairs, where each question has been answered by two different LLMs. Ask the model you want to evaluate to choose the better answer for each pair. Then measure how consistently it selects winners. Does it reliably favor some models over others across the questions, or does it behave close to randomly? This consistency is a strong proxy for the model’s intelligence.
It is not subject to dataset leaks, lets you measure intelligence in many fields where you might not have golden answers, and converges pretty fast, making it really cheap to measure.
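To make the idea concrete, here is a minimal sketch of one way to score consistency. It is an illustration under my own assumptions, not necessarily the exact metric used in the paper: the function name `consistency_score` and the `(model_a, model_b, winner)` tuple format are hypothetical. For each pair of answering models it computes how far the judge's win rate is from a coin flip, then averages over all judgments.

```python
from collections import defaultdict


def consistency_score(judgments):
    """Score how consistently a judge picks winners (illustrative sketch only).

    judgments: iterable of (model_a, model_b, winner) tuples, where winner is
    the name of the model whose answer the judge preferred (hypothetical format).
    Returns a value in [0, 1]: 0 means every model pair is split 50/50,
    1 means the judge always prefers the same model within each pair.
    """
    wins = defaultdict(int)    # times the alphabetically first model of a pair won
    totals = defaultdict(int)  # number of judgments seen for each pair

    for model_a, model_b, winner in judgments:
        if winner not in (model_a, model_b):
            raise ValueError(f"winner {winner!r} is not one of the two models")
        pair = tuple(sorted((model_a, model_b)))  # order-independent key
        totals[pair] += 1
        if winner == pair[0]:
            wins[pair] += 1

    if not totals:
        raise ValueError("no judgments given")

    # For a pair with win rate p over t judgments, |2p - 1| * t == |2 * wins - t|.
    # Summing over pairs and dividing by the total count weights each pair
    # by how many judgments it received.
    n = sum(totals.values())
    return sum(abs(2 * wins[pair] - totals[pair]) for pair in totals) / n
```

On small samples, even a random judge will score somewhat above 0 by chance, so in practice you would compare scores across judges evaluated on the same dataset rather than read the absolute value.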