
Not the OP but I've been tinkering with the same concept (24/7 processing).

I'm using vosk-browser (https://github.com/ccoreilly/vosk-browser) to do speech-to-text locally, and it works very well for English.




The browser models are too small, so they are unlikely to recognize free-form speech accurately. They are better suited to a small set of simple predefined phrases.

You could try vosk-api on a desktop-grade machine instead. You'll need the big models from https://alphacephei.com/vosk/models. They require something like 8 GB of RAM to run, but they are much more accurate.
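For what it's worth, here is a rough sketch of what that might look like with the Python bindings (`pip install vosk`). The helper names `transcribe` and `extract_text` are mine, the file and model paths are placeholders, and I'm assuming the input is a mono 16-bit PCM WAV file, so treat it as a starting point rather than a tested setup.

```python
import json
import wave


def extract_text(result_json: str) -> str:
    """Pull the 'text' field out of a JSON result string from the recognizer."""
    return json.loads(result_json).get("text", "")


def transcribe(wav_path: str, model_path: str) -> str:
    """Transcribe a WAV file with a Vosk model unpacked at model_path.

    Assumption: the audio is mono 16-bit PCM. Both paths are hypothetical
    and must point at real files on your machine.
    """
    # Imported here so the helper above works even without vosk installed.
    from vosk import Model, KaldiRecognizer

    model = Model(model_path)
    pieces = []
    with wave.open(wav_path, "rb") as wf:
        rec = KaldiRecognizer(model, wf.getframerate())
        while True:
            data = wf.readframes(4000)
            if not data:
                break
            # AcceptWaveform returns True once an utterance is complete.
            if rec.AcceptWaveform(data):
                pieces.append(extract_text(rec.Result()))
        # Flush whatever is still buffered at the end of the file.
        pieces.append(extract_text(rec.FinalResult()))
    return " ".join(p for p in pieces if p)
```

You would call it as something like `transcribe("recording.wav", "path/to/unpacked-model")`. For 24/7 processing you would feed chunks from a microphone stream into the same `AcceptWaveform` loop instead of reading a file.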



