
Nice! How's the speech recognition accuracy and response latency?



Thanks!

Faster than Alexa (and only going to get faster)[0].

Between the far-field speech optimizations provided by the ESP BOX and Espressif frameworks, our inference server (open sourcing next week) using Whisper, and our unique streaming format, we've found it to be comparable in quality to Alexa/Echo, even with background noise and at distances of up to 30 feet.

[0] - https://www.youtube.com/watch?v=8ETQaLfoImc


That's really nice - and thanks for including the demo link too, impressive!


Thanks again!

Not only are we working on improving performance with the inference server, but local on-device command recognition is also extremely fast. Like "did that really just happen?" fast.

In my local setup with locally controlled Wemo switches, I swear the latency is around 300 ms or so.

I should make another demo video with that...



