
GPT-4 is an amazing achievement, but it is still just a language model. Large language models (LLMs) are well documented in the literature, and GPT-4 is just a much larger version (more parameters) of those models. Training LLMs is also well documented; GPT-4 has simply been trained on a very large subset of the Internet.

Of course there are proprietary models, which will be improved versions of the academic LLMs, but there are no big secrets or mysteries.
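
For anyone curious, here is a toy sketch (in PyTorch) of the decoder-only transformer language model that all of these systems are variations on. The hyperparameters are made-up illustration values and have nothing to do with GPT-4's actual configuration:

    # Minimal decoder-only transformer language model (toy sizes, for illustration only)
    import torch
    import torch.nn as nn

    class TinyLM(nn.Module):
        def __init__(self, vocab_size=1000, d_model=128, n_heads=4,
                     n_layers=2, max_len=256):
            super().__init__()
            self.tok_emb = nn.Embedding(vocab_size, d_model)
            self.pos_emb = nn.Embedding(max_len, d_model)
            layer = nn.TransformerEncoderLayer(
                d_model, n_heads, dim_feedforward=4 * d_model,
                batch_first=True, norm_first=True)
            self.blocks = nn.TransformerEncoder(layer, n_layers)
            self.lm_head = nn.Linear(d_model, vocab_size)

        def forward(self, tokens):
            # tokens: (batch, seq_len) integer ids
            seq_len = tokens.size(1)
            pos = torch.arange(seq_len, device=tokens.device)
            x = self.tok_emb(tokens) + self.pos_emb(pos)
            # Causal mask so each position only attends to earlier positions
            mask = nn.Transformer.generate_square_subsequent_mask(seq_len).to(tokens.device)
            x = self.blocks(x, mask=mask)
            return self.lm_head(x)  # next-token logits

    model = TinyLM()
    tokens = torch.randint(0, 1000, (2, 16))  # fake batch of token ids
    logits = model(tokens)
    # Standard next-token prediction objective: predict token t+1 from tokens <= t
    loss = nn.functional.cross_entropy(
        logits[:, :-1].reshape(-1, logits.size(-1)),
        tokens[:, 1:].reshape(-1))
    print(logits.shape, loss.item())

Scaling that same recipe up (more layers, wider layers, more data, more compute) is the "no big mystery" part; the interesting questions are in the details around it.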




The individual components are well documented, but which specific arrangements produce the best results is still very much an active research area.

As far as training goes, the differences between GPT-3 and GPT-3.5 (the latter being a smaller model!) demonstrate just how important fine-tuning and reinforcement learning are to the quality of the model. Merely throwing more content from the Internet at it doesn't automatically improve things.
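
To make that concrete, the reward-modelling step at the heart of RLHF (as described in the InstructGPT paper) looks roughly like the sketch below. The tiny scoring model and the random "preference" data are stand-ins for illustration, not anything OpenAI actually uses:

    # Rough sketch of RLHF reward modelling: learn to score the human-preferred
    # response above the rejected one for the same prompt.
    import torch
    import torch.nn as nn

    class RewardModel(nn.Module):
        # Maps a (toy) bag-of-tokens representation of a response to a scalar score
        def __init__(self, vocab_size=1000, d_model=64):
            super().__init__()
            self.emb = nn.Embedding(vocab_size, d_model)
            self.score = nn.Linear(d_model, 1)

        def forward(self, tokens):  # tokens: (batch, seq_len)
            return self.score(self.emb(tokens).mean(dim=1)).squeeze(-1)

    reward_model = RewardModel()
    opt = torch.optim.Adam(reward_model.parameters(), lr=1e-3)

    # Fake preference data: a human-preferred ("chosen") and a less-preferred
    # ("rejected") response per prompt, as token ids
    chosen = torch.randint(0, 1000, (8, 32))
    rejected = torch.randint(0, 1000, (8, 32))

    for step in range(100):
        r_chosen = reward_model(chosen)
        r_rejected = reward_model(rejected)
        # Pairwise ranking loss: push the chosen score above the rejected score
        loss = -nn.functional.logsigmoid(r_chosen - r_rejected).mean()
        opt.zero_grad()
        loss.backward()
        opt.step()

    # The trained reward model then drives a policy-gradient step (e.g. PPO)
    # that fine-tunes the language model itself.

That post-training loop, not raw parameter count, is a big part of why GPT-3.5 feels so much better than GPT-3.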


I'm not so sure about this. There is speculation that GPT-4 may use additional specialized models underneath it for specific tasks.
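
Purely to illustrate the kind of design being speculated about, a front end that routes prompts to task-specific models could look like the toy sketch below. Nothing here is known to reflect how GPT-4 actually works, and the keyword routing is a placeholder for whatever learned classifier a real system would presumably use:

    # Hypothetical router dispatching prompts to specialized models (illustration only)
    from typing import Callable, Dict

    def poem_model(prompt: str) -> str:
        return f"[specialized poetry model would answer: {prompt!r}]"

    def code_model(prompt: str) -> str:
        return f"[specialized coding model would answer: {prompt!r}]"

    def general_model(prompt: str) -> str:
        return f"[general-purpose model would answer: {prompt!r}]"

    # Toy keyword-based routing table
    ROUTES: Dict[str, Callable[[str], str]] = {
        "poem": poem_model,
        "haiku": poem_model,
        "code": code_model,
    }

    def answer(prompt: str) -> str:
        for keyword, model in ROUTES.items():
            if keyword in prompt.lower():
                return model(prompt)
        return general_model(prompt)

    print(answer("Write a haiku about spring"))
    print(answer("What is the capital of France?"))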


Exactly, this has been my guess as well. They must have trained the model specifically to write poems, haikus, and other such things, so that the output looks much more polished than it really is.


I for one want a module that plays around with physics models and proposes practicable FTL-capable mechanisms.


That'd require a real AI and not just an LLM, haha.



