How come you don't support audio files longer than 1hr? Is it because of $$ cost?
The demo app above gets faster transcription by chunking the audio and parallelizing over dozens of CPUs, so you can transcribe about 1hr of audio for $0.10.
Interesting, which model are you using? We use the medium model, which is the sweet spot for the time/performance tradeoff. We also chunk: we try to detect words and silences so we can cut at word boundaries. If you chunk more aggressively and don't get the word boundaries right, Whisper seems to lose some context and the accuracy suffers. We will soon support longer files. We just want to make sure the wait time for transcription doesn't suffer for most users. But great demo, reach out to me if you want to collaborate.
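For anyone curious what silence-aware chunking can look like, here is a rough sketch. It is not either app's actual code: the function names, the per-frame energy input, and the threshold/tolerance parameters are all assumptions made for illustration. The idea is to find quiet stretches, then cut near the target chunk length at the nearest silence instead of mid-word, falling back to a hard cut only when no silence is close enough.

```python
from typing import List, Tuple


def find_silences(energies: List[float], threshold: float,
                  min_frames: int) -> List[Tuple[int, int]]:
    """Return (start, end) frame ranges, end exclusive, where the energy
    stays below `threshold` for at least `min_frames` frames."""
    silences = []
    run_start = None
    for i, energy in enumerate(energies):
        if energy < threshold:
            if run_start is None:
                run_start = i
        else:
            if run_start is not None and i - run_start >= min_frames:
                silences.append((run_start, i))
            run_start = None
    # Close a silent run that extends to the end of the audio.
    if run_start is not None and len(energies) - run_start >= min_frames:
        silences.append((run_start, len(energies)))
    return silences


def choose_cuts(silences: List[Tuple[int, int]], total_frames: int,
                target_frames: int, tolerance_frames: int) -> List[int]:
    """Greedily pick cut points about `target_frames` apart, snapping each
    cut to the middle of a silence within `tolerance_frames` of the ideal
    position. If no silence is close enough, cut at the ideal position."""
    midpoints = [(start + end) // 2 for start, end in silences]
    cuts = []
    chunk_start = 0
    while total_frames - chunk_start > target_frames:
        ideal = chunk_start + target_frames
        candidates = [m for m in midpoints
                      if m > chunk_start and abs(m - ideal) <= tolerance_frames]
        if candidates:
            cut = min(candidates, key=lambda m: abs(m - ideal))
        else:
            cut = ideal  # hard cut: may land mid-word
        cuts.append(cut)
        chunk_start = cut
    return cuts


# Toy input: three bursts of "speech" separated by two short silences.
energies = [1.0] * 10 + [0.0] * 4 + [1.0] * 10 + [0.0] * 4 + [1.0] * 10
silences = find_silences(energies, threshold=0.1, min_frames=3)
cuts = choose_cuts(silences, len(energies), target_frames=12, tolerance_frames=3)
print(silences, cuts)
```

Each resulting chunk could then be transcribed independently and in parallel. The tolerance is the knob here: a wider tolerance gives cleaner boundaries but more uneven chunk lengths.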