
All the words they have experienced up to that point are part of the training set, as well as all the people and things they have seen.



Even if the people around a 3-year-old child talked to it constantly, 16 hours per day at 150 words per minute, the child would have only around 1GB of text in its training data. And not even good-quality words: a lot of it would be variations of mundane everyday chit-chat and "Who's a cute baby?! You're a cute baby!".

For comparison, GPT has something like 1TB of text, drawn from hundreds of thousands of books, articles, Wikipedia, and so on. So that's already 3 orders of magnitude more.
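The back-of-envelope numbers above can be checked with a quick sketch. The 16 hours, 150 words per minute, 3 years, and 1TB figures come from the comment; the 6 bytes per word (roughly five letters plus a space) is my own assumption, so treat the result as a rough estimate only.

```python
import math

# Figures taken from the comment above.
HOURS_PER_DAY = 16
WORDS_PER_MINUTE = 150
YEARS = 3
GPT_TEXT_BYTES = 10**12  # "like 1TB of text"

# Assumption (not from the comment): ~5 letters plus a space per English word.
BYTES_PER_WORD = 6


def words_heard(hours_per_day, words_per_minute, years):
    """Total words heard if someone talks non-stop for the given hours each day."""
    return hours_per_day * 60 * words_per_minute * 365 * years


child_words = words_heard(HOURS_PER_DAY, WORDS_PER_MINUTE, YEARS)
child_bytes = child_words * BYTES_PER_WORD

# How many powers of ten separate the two corpora.
orders_of_magnitude = math.log10(GPT_TEXT_BYTES / child_bytes)

print(f"words heard: {child_words:,}")
print(f"as text: {child_bytes / 1e9:.2f} GB")
print(f"GPT corpus is ~{orders_of_magnitude:.1f} orders of magnitude larger")
```

Under these assumptions the child hears about 158 million words, which is a bit under 1GB of text, and the gap to 1TB is about 3 orders of magnitude.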

And of course the "16 hours x 150 words per minute x 3 years" figure is itself an overestimate by a few orders of magnitude.



