My email is at gmail using my username.
Backend-focused full-stack engineer with 8+ years of experience, mostly focused on Go, Rust, and TypeScript, but open to other options.
Resume: https://drive.google.com/file/d/1VNC272B3n7ZEfppMHkm2wGgaINwYl4Av/view
Parakeet is insanely fast and much more accurate, and it doesn't really matter that Whisper requires hacks to work live when those hacks have existed for years and work great. (The Hello Transcribe app on iOS is a great example of how well Whisper can work with live streaming on an iPhone. The smaller models are extremely fast, even with the "hacks".)
Parakeet TDT's architecture is actually a really cool way to boost both the speed and efficiency of real time STT compared to traditional approaches.
Terrible relative to everything else that exists today. I have a neutral American accent.
Maybe you just don’t know what you’re missing? Google’s default speech to text is still bad compared to Whisper and Parakeet, but even Google’s is markedly better than Apple’s.
I cannot think of a single speech to text system that I’ve run into in the past 5 years that is less accurate than the one Apple ships.
Sure, Apple’s speech to text is incredible compared to what was on the flip phone I had 20 years ago. Terrible is relative. Much better options exist today, and they’re under very permissive licenses. Apple’s refusal to offer a better, more accessible experience to their users is frustrating when they wouldn’t even have to pay a licensing fee to ship something better. Whisper was released under a permissive license nearly 4 years ago.
Apple also restricts third party keyboards to an absurdly tiny amount of memory, so it isn’t even possible to ship a third party keyboard that provides more accurate on-device speech to text without janky workarounds (requiring the user to open the keyboard's own app first each time).
> Siri/iOS-Dictation is truly good when it comes to understanding the speech.
What...? It is terrible, even compared to Whisper Tiny, which was released years ago under an Apache 2.0 license so Apple could have adopted it instantly and integrated it into their devices. The bigger Whisper models are far better, and Parakeet TDT V2 (English) / V3 (Multilingual) are quite impressive and very fast.
I have no idea what would make someone say that iOS dictation is good at understanding speech... it is so bad.
For a company that talks so much about accessibility, it is baffling to me that Apple continues to ship such poor quality speech to text with their devices.
This project is an enhanced reader for Ycombinator Hacker News: https://news.ycombinator.com/.
The interface also allow to comment, post and interact with the original HN platform. Credentials are stored locally and are never sent to any server, you can check the source code here: https://github.com/GabrielePicco/hacker-news-rich.
For suggestions and features requests you can write me here: gabrielepicco.github.io