
When the generative model is autoregressive (autocomplete-style), it can easily be used as a predictor. All of the state-of-the-art language models are evaluated on multiple-choice exams and other prediction tasks. In fact, token prediction is how they are trained; masked and permuted variants of the idea are described here: https://www.microsoft.com/en-us/research/blog/mpnet-combines...

For example: "Multiple-choice questions in 57 subjects (professional & academic)" - https://openai.com/research/gpt-4
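The mechanics of "using an autoregressive model as a predictor" are simple: score each candidate answer by the log-likelihood the model assigns to prompt + answer, and pick the highest. A toy sketch (a made-up character-bigram "model", not any real LM, purely to illustrate the scoring loop):

```python
from collections import defaultdict
import math

# Toy illustration only: a character-level bigram "language model"
# trained on a tiny corpus, used autoregressively to score
# multiple-choice answers by total log-likelihood.

def train_bigram(corpus):
    counts = defaultdict(lambda: defaultdict(int))
    for text in corpus:
        for a, b in zip(text, text[1:]):
            counts[a][b] += 1
    return {a: {b: n / sum(nexts.values()) for b, n in nexts.items()}
            for a, nexts in counts.items()}

def log_likelihood(model, text, floor=1e-6):
    # Sum log P(next char | current char): the autoregressive factorization.
    return sum(math.log(model.get(a, {}).get(b, floor))
               for a, b in zip(text, text[1:]))

def answer_mcq(model, prompt, options):
    # Pick the option whose full sequence the model finds most likely.
    return max(options, key=lambda o: log_likelihood(model, prompt + o))

model = train_bigram(["the cat sat on the mat", "the cat ate the rat"])
print(answer_mcq(model, "the cat ", ["sat", "xqz"]))  # → sat
```

Real evaluations do the same thing with token-level log-probs from the actual model; no extra "prediction head" is needed.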




Being good at standardized tests isn't really a good measure.

What happens with completely new questions from a totally different subject? The generative model will produce nonsense.


For GPT4: "Pricing is $0.03 per 1,000 “prompt” tokens (about 750 words) and $0.06 per 1,000 “completion” tokens (again, about 750 words)."
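At those quoted rates, per-call cost is simple arithmetic; a quick sketch (token counts here are made up for illustration):

```python
# Back-of-the-envelope cost at the GPT-4 rates quoted above:
# $0.03 per 1,000 prompt tokens, $0.06 per 1,000 completion tokens.

def gpt4_cost(prompt_tokens, completion_tokens):
    return prompt_tokens / 1000 * 0.03 + completion_tokens / 1000 * 0.06

# e.g. a 1,500-token prompt with a 500-token answer:
print(f"${gpt4_cost(1500, 500):.3f}")  # → $0.075
```

Cheap per call, but it compounds quickly at millions of calls, which is where the comparison below starts to bite.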

Meanwhile, there are off-the-shelf models that you can train very efficiently, on relevant data, privately, and you can run these on your own infrastructure.
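For a sense of that path, here is a minimal stand-in: a bag-of-words Naive Bayes classifier trained entirely in-process on private example data. It is a toy (the labels and examples are invented), not a real fine-tuned model, but the workflow — your data in, your model out, nothing leaves your machine — is the point:

```python
from collections import Counter
import math

# Toy "train privately on your own data" sketch: Naive Bayes over
# bag-of-words features. The ticket examples below are made up.

def train(examples):
    word_counts = {}          # label -> Counter of words
    label_counts = Counter()  # label -> number of documents
    for text, label in examples:
        label_counts[label] += 1
        word_counts.setdefault(label, Counter()).update(text.lower().split())
    return word_counts, label_counts

def classify(model, text):
    word_counts, label_counts = model
    total_docs = sum(label_counts.values())
    vocab = {w for c in word_counts.values() for w in c}
    best, best_score = None, float("-inf")
    for label, docs in label_counts.items():
        score = math.log(docs / total_docs)  # class prior
        counts = word_counts[label]
        denom = sum(counts.values()) + len(vocab)
        for w in text.lower().split():
            score += math.log((counts[w] + 1) / denom)  # Laplace smoothing
        if score > best_score:
            best, best_score = label, score
    return best

model = train([
    ("refund my order", "billing"),
    ("charge on my card", "billing"),
    ("app crashes on start", "bug"),
    ("error when saving", "bug"),
])
print(classify(model, "card charge refund"))  # → billing
```

Swap in a fine-tuned BERT-class model and the shape of the pipeline stays the same: labeled data, a training step you control, inference on hardware you own.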

Yes, GPT4 is probably great at all the benchmark tasks, but models have been great at all the open benchmark tasks for a long time. That's why they have to keep making harder tasks.

Depending on what you actually want to do with LMs, GPT4 might lose to a BERTish model in a cost-benefit analysis--especially given that (in my experience), the hard part of ML is still getting data/QA/infrastructure aligned with whatever it is you want to do with the ML. (At least at larger companies, maybe it's different at startups.)


This is true, but the humans needed to develop the non-LLM solution are expensive, and the OpenAI API is easy.




