thomas536's comments

Sorry to hijack this thread, but I'm too curious: did you make any headway on your "Non-cloud voice recognition for home use?" post? It sounded like an interesting use case.

tia


+1 I appreciate your answer!


(Yeah, the exact billing model is fungible; it's just an example to get people thinking in terms of what they'd actually be willing to fork over money for.)

Do you mean you might have a specific and relatively small set of videos that you would like searched, but at a much lower price point (e.g. 10k videos at less than $10/year)?


The only public search engine I'm aware of that uses Common Crawl is ChatNoir.

https://groups.google.com/d/msg/common-crawl/3o2dOHpeRxo/H2O...


Most search (e.g. web, email, torrents) is funded by ads, which works because of its utility to a huge number of people. There are many things one might want to search in passing (e.g. quickly finding the location of a particular book on your disorganized bookshelf, non-web text content, etc.) that don't have the mass appeal to sustain ad revenue. And many of those wants aren't even worth paying a few dollars for, let alone the cost of hosting search.

So does anyone in the HN crowd have a search-based itch they're willing to pay to have scratched that's not already handled by existing search products?


I can do this as well (recall where a stray phone is), but I don't understand why I notice and remember better than others. Do I remember more "that's out of place" instances, or am I more predisposed to prefer that things be in a certain place? Not sure.


If people's public bookmarks aren't available online, I'd encourage you to publicly post them somewhere. Even if they aren't useful to you anymore, they are a good signal of good content, which I think is worth trying to preserve for future use.


I'll second the recommendation for Kaldi. It's more complicated to get running than pocketsphinx, but in my experience Kaldi has better accuracy and lower latency in the general case (see caveats below).

https://github.com/gooofy/zamia-speech/ has been training good [acoustic] models which are worth looking at (including training with robustness against noise). They've also got lots of code and docker images and documentation.

pocketsphinx isn't actually that bad to use with their latest acoustic models and small vocabularies (so its utility depends on your exact use case). But it's not generally good with far-field mics/DSP-processed audio, not really good with noise, and in my experiments not quite as fast as Kaldi.

Better/larger language models in my experience make a world of difference (esp. in the general vocab case) for improving accuracy with either Kaldi or pocketsphinx. Nobody really seems to talk about this(?), since everyone always uses the news corpus from like the 80s as the default language model.

I haven't really ever gotten the various ~deepspeech systems working, so I can't speak to them.


I'm happy to feed it plenty of voice logs as well as a training corpus as necessary. Sounds like an interesting journey.



I must be missing something, because 6.5 days * $21/day = $136.50, not $115:

"""

The entire process now took 6.5 days and cost $21/day. Our total cost all said and done was $115!

"""
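
For what it's worth, here's the same arithmetic as a quick Python sketch. The variable names are mine, and the "implied" figures are only what the duration or daily rate would have to be for $115 to work out, not anything the article states.

```python
# Figures as quoted in the article.
days = 6.5
cost_per_day = 21.0
quoted_total = 115.0

# What the quoted duration and daily rate multiply out to.
computed_total = days * cost_per_day

# What the duration or rate would need to be for the quoted total to hold
# (hypothetical back-calculations, not numbers from the article).
implied_days = quoted_total / cost_per_day
implied_rate = quoted_total / days

print(f"computed total:   ${computed_total:.2f}")
print(f"quoted total:     ${quoted_total:.2f}")
print(f"difference:       ${computed_total - quoted_total:.2f}")
print(f"implied duration: {implied_days:.2f} days at ${cost_per_day:.0f}/day")
print(f"implied rate:     ${implied_rate:.2f}/day over {days} days")
```

So either the run was closer to 5.5 billed days, or the effective rate was under $18/day, or the total is just off.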

