gentleman scientist. machine learning and natural language processing.
Joseph Turian
lastname at gmail dot com
http://www.linkedin.com/in/turian
http://strictlydev.com/devs/turian
Sorry, I side with GP. Even if you rule out Llama/GPT because of cost, a middle ground like DistilBERT (which can run on a single CPU) is a much more sensible cost/benefit tradeoff than VADER's decade-old lexicon-based approach.
I can't really think of many NLP techniques that are a decade old and don't have a better, faster, or cheaper alternative.
I want LLM accessible bookmarks. That's it.
It doesn't work yet.
I use SingleFile to archive the pages I'm viewing into Linkding.
Then I have a BeautifulSoup4 script to strip the assets.
Then I use Jina's ReaderLM v2 to render the HTML to proper Markdown: https://huggingface.co/jinaai/ReaderLM-v2
Except, of course, that it doesn't work for longer, table-oriented text documents like HN.
I want a plaintext archive of web pages in a GitHub repo or similar. Not a fancy UI/UX.
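For the asset-stripping step, here is a minimal sketch of the idea. It is not the BeautifulSoup4 script mentioned above: it swaps in Python's standard-library `html.parser` so it runs with no dependencies. The tag lists, the rule for dropping `data:` URIs and inline `style` attributes, and the names `AssetStripper` and `strip_assets` are all assumptions for illustration.

```python
from html import escape
from html.parser import HTMLParser

# Assumption: which elements count as "assets" is a guess, not the original list.
DROP_WITH_CONTENT = {"script", "style", "svg", "noscript", "iframe"}
DROP_VOID = {"img", "link", "source"}


def _is_asset_attr(name, value):
    """Inline styles and embedded data: URIs carry bulk, not text."""
    if name == "style":
        return True
    return value is not None and value.strip().lower().startswith("data:")


def _render_tag(tag, attrs):
    parts = [tag]
    for name, value in attrs:
        # Attribute values arrive unescaped, so escape them again on output.
        parts.append(name if value is None else f'{name}="{escape(value, quote=True)}"')
    return "<" + " ".join(parts) + ">"


class AssetStripper(HTMLParser):
    """Re-emits HTML while skipping asset elements and asset attributes."""

    def __init__(self):
        # Keep entities as written so text like "&amp;" round-trips unchanged.
        super().__init__(convert_charrefs=False)
        self.out = []
        self.skip_depth = 0  # > 0 while inside a dropped element

    def handle_starttag(self, tag, attrs):
        if tag in DROP_WITH_CONTENT:
            self.skip_depth += 1
            return
        if self.skip_depth or tag in DROP_VOID:
            return
        kept = [(k, v) for k, v in attrs if not _is_asset_attr(k, v)]
        self.out.append(_render_tag(tag, kept))

    def handle_startendtag(self, tag, attrs):
        # Self-closing form such as <br/>; dropped tags are skipped entirely.
        if self.skip_depth or tag in DROP_WITH_CONTENT or tag in DROP_VOID:
            return
        kept = [(k, v) for k, v in attrs if not _is_asset_attr(k, v)]
        self.out.append(_render_tag(tag, kept))

    def handle_endtag(self, tag):
        if tag in DROP_WITH_CONTENT:
            self.skip_depth = max(0, self.skip_depth - 1)
            return
        if self.skip_depth or tag in DROP_VOID:
            return
        self.out.append(f"</{tag}>")

    def handle_data(self, data):
        if not self.skip_depth:
            self.out.append(data)

    def handle_entityref(self, name):
        if not self.skip_depth:
            self.out.append(f"&{name};")

    def handle_charref(self, name):
        if not self.skip_depth:
            self.out.append(f"&#{name};")


def strip_assets(html_text: str) -> str:
    parser = AssetStripper()
    parser.feed(html_text)
    parser.close()
    return "".join(parser.out)
```

The stripped HTML is what would then be handed to the HTML-to-Markdown step. Comments and the doctype are dropped here as well, which may or may not be what you want for an archive.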
This project is an enhanced reader for Y Combinator's Hacker News: https://news.ycombinator.com/.
The interface also allows you to comment, post, and interact with the original HN platform. Credentials are stored locally and are never sent to any server; you can check the source code here: https://github.com/GabrielePicco/hacker-news-rich.
For suggestions and feature requests, you can write to me here: gabrielepicco.github.io