
Interestingly, Google was using ~2000 experts back in the first Transformer architecture (if I understand correctly): https://www.youtube.com/watch?v=9P_VAMyb-7k&t=6m42s [sparsely-gated mixture-of-experts layer]



