> I wonder if there's a way to automatically detect how "fast" a person talks in... | Hacker News


janalsncm 77 days ago | on: OpenAI charges by the minute, so speed up your aud...

> I wonder if there's a way to automatically detect how "fast" a person talks in an audio file

Transcribe it locally using whisper and output tokens/sec?

maxall4 77 days ago [–]

Just count syllables per second by doing an FFT plus some basic analysis.
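One way to read "FFT plus some basic analysis", as a rough sketch only: take the short-time energy envelope of the audio, then look for the strongest modulation frequency in the band where syllables tend to fall. Everything here is an assumption for illustration: the function names, the 10 ms frame size, the 2 to 8 Hz search band, and mono float samples as input. To stay dependency-free it scans a naive DFT over that band rather than running a real FFT.

```python
import cmath
import math


def frame_energy(samples, sample_rate, frame_ms=10):
    """Short-time energy envelope: one mean-square value per frame."""
    frame_len = max(1, int(sample_rate * frame_ms / 1000))
    return [
        sum(x * x for x in samples[i:i + frame_len]) / frame_len
        for i in range(0, len(samples) - frame_len + 1, frame_len)
    ]


def dominant_modulation_hz(envelope, envelope_rate, lo=2.0, hi=8.0, step=0.1):
    """Scan a naive DFT of the envelope over lo..hi Hz; return the peak frequency."""
    if not envelope:
        raise ValueError("envelope is empty")
    mean = sum(envelope) / len(envelope)
    centered = [e - mean for e in envelope]  # remove DC so it cannot dominate
    best_freq, best_mag = lo, -1.0
    for k in range(int(round((hi - lo) / step)) + 1):
        freq = lo + k * step
        acc = sum(
            v * cmath.exp(-2j * math.pi * freq * t / envelope_rate)
            for t, v in enumerate(centered)
        )
        if abs(acc) > best_mag:
            best_freq, best_mag = freq, abs(acc)
    return best_freq


def syllable_rate(samples, sample_rate, frame_ms=10):
    """Rough syllables-per-second estimate (assumed 2-8 Hz band, see above)."""
    envelope = frame_energy(samples, sample_rate, frame_ms)
    return dominant_modulation_hz(envelope, 1000 / frame_ms)
```

On a synthetic tone whose loudness pulses four times a second this lands near 4 Hz. Real speech is much messier (pauses, noise, uneven stress), so treat the number as a coarse proxy, not a syllable count.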

tucnak 76 days ago [–]

> FFT plus some basic analysis

Yeah, totally easier than `len(transcribe(a))/len(a)`
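Spelled out a little, the one-liner could look like the sketch below. It assumes the transcript text and the clip duration in seconds have already been obtained elsewhere (for example from a local Whisper run); `speaking_rate` is a hypothetical helper, and it counts whitespace-separated words as a stand-in for model tokens. Note that dividing by the raw length of an audio array would give units per sample, so the duration is passed in seconds here.

```python
def speaking_rate(transcript: str, duration_seconds: float) -> float:
    """Words per second, given a transcript and the clip length in seconds.

    Whitespace-separated words are an assumed proxy for tokens.
    """
    if duration_seconds <= 0:
        raise ValueError("duration_seconds must be positive")
    return len(transcript.split()) / duration_seconds


# Usage: six words spoken over two seconds.
rate = speaking_rate("one two three four five six", 2.0)
```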

janalsncm 76 days ago [–]

Maybe not as quick to code up but way faster to calculate.

The tokens/second figures can then be used as ground-truth labels for an FFT -> small neural net model.
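A minimal sketch of that training idea, with a one-feature least-squares line swapped in for the small neural net to keep it short. The feature is assumed to be a single cheap spectral number per clip, and the labels are tokens/second values from transcription; `fit_line` and `predict` are hypothetical names, not anything from a library.

```python
def fit_line(features, labels):
    """Ordinary least squares for label = slope * feature + intercept."""
    n = len(features)
    if n < 2 or n != len(labels):
        raise ValueError("need at least two (feature, label) pairs")
    mean_x = sum(features) / n
    mean_y = sum(labels) / n
    var_x = sum((x - mean_x) ** 2 for x in features)
    if var_x == 0:
        raise ValueError("features must not all be identical")
    cov_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(features, labels))
    slope = cov_xy / var_x
    return slope, mean_y - slope * mean_x


def predict(model, feature):
    """Estimate tokens/second for a new clip from its spectral feature."""
    slope, intercept = model
    return slope * feature + intercept
```

Once fitted on a batch of transcribed clips, `predict` only needs the cheap feature, so new audio can be scored without running transcription at all.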
