This looks great. What hardware do you use, or have you tested it on?

koljab · 2025-05-05T20:42:57 1746477777

I've only tested it on my 4090 so far.

echelon · 2025-05-05T21:31:26 1746480686

Are you using all local models, or does it also use cloud inference? Proprietary models?

Which models are running in which places?

Cool utility!

koljab · 2025-05-05T21:49:37 1746481777

All local models:
- VAD: Webrtcvad (first fast check) followed by SileroVAD (high compute verification)
- Transcription: base.en whisper (CTranslate2)
- Turn Detection: KoljaB/SentenceFinishedClassification (self-trained BERT model)
- LLM: hf.co/bartowski/huihui-ai_Mistral-Small-24B-Instruct-2501-abliterated-GGUF:Q4_K_M (easily switchable)
- TTS: Coqui XTTSv2, switchable to Kokoro or Orpheus (this one is slower)
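The two-stage VAD idea (cheap check first, expensive verification only when the cheap check fires) can be sketched roughly like this. This is a minimal illustration and not the project's actual code: `energy_gate` and `CountingVerifier` are hypothetical stand-ins for Webrtcvad and SileroVAD, and the thresholds are made-up values.

```python
from typing import Callable, Sequence

Frame = Sequence[float]  # one audio frame as float samples in [-1.0, 1.0]


def make_cascade(
    fast_check: Callable[[Frame], bool],
    slow_check: Callable[[Frame], bool],
) -> Callable[[Frame], bool]:
    """Combine a cheap detector and an expensive one into a single VAD.

    The expensive detector only runs on frames the cheap one flags.
    """

    def is_speech(frame: Frame) -> bool:
        if not fast_check(frame):
            return False  # rejected cheaply, slow model never runs
        return slow_check(frame)

    return is_speech


def energy_gate(frame: Frame, threshold: float = 0.01) -> bool:
    """Hypothetical stand-in for the fast check: mean signal energy."""
    energy = sum(x * x for x in frame) / max(len(frame), 1)
    return energy > threshold


class CountingVerifier:
    """Hypothetical stand-in for the slow check: peak amplitude test.

    Counts its calls so the savings from the cascade are visible.
    """

    def __init__(self, threshold: float = 0.2) -> None:
        self.threshold = threshold
        self.calls = 0

    def __call__(self, frame: Frame) -> bool:
        self.calls += 1
        return max(abs(x) for x in frame) > self.threshold


if __name__ == "__main__":
    verifier = CountingVerifier()
    vad = make_cascade(energy_gate, verifier)
    for name, frame in [("silence", [0.0] * 160), ("loud", [0.5] * 160)]:
        print(name, vad(frame), "slow calls so far:", verifier.calls)
```

The point of this ordering is that silent frames, which are most of the stream in a conversation, never reach the heavy model.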

echelon · 2025-05-06T03:16:57 1746501417

That's excellent. Really amazing bringing all of these together like this.

Hopefully we get an open weights version of Sesame [1] soon. Keep watching for it, because that'd make a killer addition to your app.

[1] https://www.sesame.com/

koljab · 2025-05-07T12:58:42 1746622722

That would be absolutely awesome. But I doubt it, since they released a shitty version of that amazing thing they put online. I feel they aren't planning to give us their top model soon.