
This looks great. What hardware are you running it on, or what have you tested it on?


I've only tested it on my 4090 so far.


Are you using all local models, or does it also use cloud inference? Proprietary models?

Which models are running in which places?

Cool utility!


All local models:

- VAD: webrtcvad (fast first check) followed by Silero VAD (higher-compute verification); see the sketch below
- Transcription: Whisper base.en (CTranslate2)
- Turn detection: KoljaB/SentenceFinishedClassification (self-trained BERT model)
- LLM: hf.co/bartowski/huihui-ai_Mistral-Small-24B-Instruct-2501-abliterated-GGUF:Q4_K_M (easily switchable)
- TTS: Coqui XTTSv2, switchable to Kokoro or Orpheus (Orpheus is slower)
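Not the project's actual code, just a rough sketch of how that two-stage VAD gate could be wired up: webrtcvad makes a cheap per-frame decision, and only frames it flags get re-checked by the slower but more robust Silero model. The helper name, threshold, and frame size here are assumptions for illustration.

```python
# Sketch of a two-stage VAD gate (webrtcvad fast check -> Silero verification).
# Assumes 16 kHz, 16-bit mono PCM input in 30 ms frames.
import numpy as np
import torch
import webrtcvad

SAMPLE_RATE = 16000
FRAME_MS = 30
FRAME_SAMPLES = SAMPLE_RATE * FRAME_MS // 1000  # 480 samples per frame

fast_vad = webrtcvad.Vad(2)  # aggressiveness 0-3
silero_model, _ = torch.hub.load("snakers4/silero-vad", "silero_vad")

def is_speech(frame_int16: np.ndarray) -> bool:
    """frame_int16: one 30 ms frame of int16 mono samples at 16 kHz."""
    # Stage 1: cheap webrtcvad check, rejects obvious silence immediately.
    if not fast_vad.is_speech(frame_int16.tobytes(), SAMPLE_RATE):
        return False
    # Stage 2: neural verification. Silero expects float32 in [-1, 1] and,
    # at 16 kHz, chunks of 512 samples, so pad the 480-sample frame.
    audio = torch.from_numpy(frame_int16.astype(np.float32) / 32768.0)
    audio = torch.nn.functional.pad(audio, (0, 512 - FRAME_SAMPLES))
    speech_prob = silero_model(audio, SAMPLE_RATE).item()
    return speech_prob > 0.5  # threshold chosen arbitrarily for this sketch
```

The point of the split is latency: the cheap check runs on every frame, while the neural model only runs on the small fraction of frames that might actually contain speech.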


That's excellent. Really amazing bringing all of these together like this.

Hopefully we get an open weights version of Sesame [1] soon. Keep watching for it, because that'd make a killer addition to your app.

[1] https://www.sesame.com/


That would be absolutely awesome, but I doubt it: what they released is a much weaker version of the amazing demo they put online, so I don't think they're planning to give us their top model anytime soon.



