Founder of Sutro (https://sutro.sh/)
I run a batch inference/LLM data processing service and we do a lot of work around cost and performance profiling of (open-weight) models.
One odd disconnect that still exists in LLM pricing is that providers charge linearly with respect to token consumption, while the underlying compute cost grows quadratically with sequence length.
At this point, since a lot of models have converged on the same architecture, inference algorithms, and hardware, the prices providers choose are likely based on a historical, statistical analysis of the shape of customer requests. In other words, I'm not surprised to see prices increase as providers gather more data about real-world user consumption patterns.
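To make the linear-vs-quadratic gap concrete, here is a rough sketch in Python. It is a toy model, not real provider numbers: the function names, the flat `price_per_token`, and both coefficients are made-up assumptions chosen only to show the shape of the curves. It assumes cost has one part that scales with token count and one part that scales with the square of sequence length.

```python
def linear_price(tokens: int, price_per_token: float = 1.0) -> float:
    """What gets billed: proportional to token count (hypothetical flat rate)."""
    return tokens * price_per_token


def compute_cost(
    tokens: int,
    linear_coeff: float = 1.0,       # assumed per-token work
    quadratic_coeff: float = 1e-4,   # assumed pairwise (sequence-length squared) work
) -> float:
    """Toy cost model: a linear term plus a quadratic term in sequence length."""
    return linear_coeff * tokens + quadratic_coeff * tokens ** 2


def cost_per_billed_unit(tokens: int) -> float:
    """Compute cost incurred per unit of revenue at a given sequence length."""
    return compute_cost(tokens) / linear_price(tokens)


if __name__ == "__main__":
    for n in (1_000, 10_000, 100_000):
        print(f"{n:>7} tokens -> cost per billed unit: {cost_per_billed_unit(n):.2f}")
```

Under these made-up coefficients the cost per billed unit climbs as requests get longer, so a flat per-token price only works out if it is set against the expected mix of request lengths.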
Sutro.sh (fka Skysight) | Infrastructure/LLMs & Research Engineering | SF Bay Area | Full-time
We are building batch inference infrastructure and a great user/developer experience around it. We believe LLMs have not yet been meaningfully unlocked as data processing tools - we're changing that.
Our work involves interesting distributed systems and LLM research problems, newly-imagined user experiences, and a meaningful focus on mission and values.
Open Roles:
Infrastructure/LLM Engineer - https://jobs.skysight.inc/Member-of-Technical-Staff-Infrastr...
Research Engineer - https://jobs.skysight.inc/Member-of-Technical-Staff-Research...
If you're interested in applying, please send an email to jobs@sutro.sh with a resume/LinkedIn Profile. For extra priority, please include [HN] in the subject line.
Skysight | Infrastructure/LLMs & Research Engineering | SF Bay Area | Full-time
We are building large-scale batch inference infrastructure and a great user/developer experience around it. We believe LLMs have not yet been meaningfully unlocked as data processing tools - we're changing that.
Our work involves interesting distributed systems and LLM research problems, newly-imagined user experiences, and a meaningful focus on mission and values.
Open Roles:
Infrastructure/LLM Engineer - https://jobs.skysight.inc/Member-of-Technical-Staff-Infrastr...
Research Engineer - https://jobs.skysight.inc/Member-of-Technical-Staff-Research...
If you're interested in applying, please send an email to jobs@skysight.inc with a resume/LinkedIn Profile. For extra priority, please include [HN] in the subject line.