Hacker News

There already exist comparable EleutherAI models, I believe. Not as good, but pretty good.



The biggest I've found is GPT-J (EleutherAI/gpt-j-6B), which is comparable in parameter count to GPT-3 Curie, but its outputs have been very weak compared to what I'm seeing people do with GPT-3 Davinci. The outputs feel like GPT-2 quality. I'm probably using it wrong, or maybe there are better BART models published that I don't know about?

> Write a brief post explaining how GPT-J is as capable as GPT-3 Curie and GPT-2, but not as good as GPT-3 Da Vinci.

GPT-J ia a new generation of GPT-3 Curie and GPT-2. It is a new generation of GPT-3 Curie and GPT-2. It is a new generation of GPT-3 Curie and GPT-2. [sentence repeats]

Using temperature 1e-10, top_p 1.
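A side note on those settings: a temperature that small collapses sampling into greedy decoding, which is exactly the regime where repetition loops like the one quoted above are common. A minimal sketch of the temperature-scaled softmax (illustrative values, not the actual GPT-J logits):

```python
import numpy as np

def sample_probs(logits, temperature):
    # Scale logits by temperature, then softmax.
    # Subtracting the max keeps exp() from overflowing at tiny temperatures.
    z = np.asarray(logits, dtype=np.float64) / temperature
    z -= z.max()
    p = np.exp(z)
    return p / p.sum()

logits = [2.0, 1.5, 0.1]  # hypothetical next-token logits
print(sample_probs(logits, 1.0))    # mass spread across tokens
print(sample_probs(logits, 1e-10))  # effectively one-hot on the argmax
```

At temperature 1e-10 the distribution is numerically one-hot, so top_p 1 changes nothing and every run picks the single most likely token, the same behavior as greedy search.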


The existing models aren't fine-tuned for instruction following and question answering, which is what makes GPT-3 usable. Eleuther, or one of the other Stability-adjacent collectives, is working on one.


It's very sad how they had to nerf the model (AIDungeon and such). I don't think anything runnable on a personal / consumer GPU could rival a really big model.





