Building self-hosted web analytics.
Self-Hosted Analytics with Heatmaps and Session Recordings: https://www.uxwizz.com
WordPress Analytics: https://www.wplytic.com
X (Twitter): @XCSme
In my tests it's worst with adding extra formatting or output: https://aibenchy.com/compare/anthropic-claude-opus-4-6-mediu...
For example, sometimes it outputs in markdown, without being asked to (e.g. "**13**" instead of "13"), even when asked to respond with a number only.
This might be fine in a chat-environment, but not in a workflow, agentic use-case or tool usage.
Yes, it can be enforced via structured output, but in a string field from a structured output you might still want to enforce a specific natural-language response format, which can't be defined by a schema.
To be honest, I had this "issue" too.
I upgraded to a new model (gpt-4o-mini to grok-4.1-fast), suddenly all my workflows were broken. I was like "this new model is shit!", then I looked into my prompts and realized the model was actually better at following instructions, and my instructions were wrong/contradictory.
After I fixed my prompts it did exactly what I asked for.
Maybe models should have another tuneable parameters, on how well it should respect the user prompt. This reminds me of imagegen models, where you can choose the config/guidance scale/diffusion strength.
This project is an enhanced reader for Ycombinator Hacker News: https://news.ycombinator.com/.
The interface also allow to comment, post and interact with the original HN platform. Credentials are stored locally and are never sent to any server, you can check the source code here: https://github.com/GabrielePicco/hacker-news-rich.
For suggestions and features requests you can write me here: gabrielepicco.github.io