
Hardly anybody you're talking to even knows what GPT-3 is; the time between 3.5 and 4 is what's relevant.



It doesn't make sense to look at it that way. Apparently the GPT-4 base model finished training in late summer 2022, which is before the release of GPT-3.5. I'm pretty sure GPT-3.5 should be thought of as GPT-4-lite, in the sense that it uses techniques and compute of the GPT-4 era rather than the GPT-3 era. The advancement from GPT-3 to GPT-4 is what counts, and it took about 3 years.
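For what it's worth, a quick back-of-the-envelope check on the "3 years" figure. The dates below are the commonly cited release dates (GPT-3 API in June 2020, GPT-4 announcement in March 2023); treat the exact days as my assumption:

    from datetime import date

    # Commonly cited release dates; treat these as assumptions
    gpt3_release = date(2020, 6, 11)   # GPT-3 API launch
    gpt4_release = date(2023, 3, 14)   # GPT-4 announcement

    gap = gpt4_release - gpt3_release
    print(f"{gap.days} days, about {gap.days / 365.25:.1f} years")
    # -> 1006 days, about 2.8 years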


I don't agree at all.

> I am pretty sure that GPT-3.5 should be thought of as GPT-4-lite, in the sense that it uses techniques and compute of the GPT-4 era rather than the GPT-3 era

Compute of the "GPT-3 era" vs. the "GPT-3.5 era" is identical; this is not a distinguishing factor. The architecture is also roughly identical: both are dense transformers. The only significant differences between 3.5 and 3 are the size of the model and whether it uses RLHF.


Yes, you're right about the compute. Let me try to make my point differently: GPT-3 and GPT-4 were models that, at release, represented the best OpenAI could do, while GPT-3.5 was an intentionally smaller (than they could train) model. I'm seeing GPT-3.5 as the equivalent of a GPT-4-70b. So to estimate when the next "best we can do" model might be released, we should look at the gap between the releases of GPT-3 and GPT-4, not between GPT-4-70b and GPT-4. That's my understanding, anyway.
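To make the heuristic concrete, here's the naive extrapolation as a sketch, using the same commonly cited (and assumed) release dates:

    from datetime import date

    # Assumed, commonly cited release dates
    gpt3_release = date(2020, 6, 11)
    gpt4_release = date(2023, 3, 14)

    # Naive heuristic: the next "best we can do" model lands one
    # GPT-3 -> GPT-4 gap after GPT-4's release.
    gap = gpt4_release - gpt3_release
    print(gpt4_release + gap)  # -> 2025-12-14 under these assumptions

Obviously crude, but it shows why the choice of baseline (GPT-3 vs. GPT-3.5) changes the estimate.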


GPT-4 only started training around the same time as, or after, the release of GPT-3.5, so I'm not sure where you're getting "intentionally smaller".


Ah I misremembered GPT-3.5 as being released around the time of ChatGPT.


Oh, you remembered correctly; those are the same thing.

Actually, I was wrong about when GPT-4 started training; the time I gave was roughly when it finished.




