I fully disagree.

> I am pretty sure that GPT-3.5 should be thought of as GPT-4-lite, in the sense that it uses techniques and compute of the GPT-4 era rather than the GPT-3 era

Compute in the "GPT-3 era" and the "GPT-3.5 era" was identical, so compute is not a distinguishing factor. The architecture is also roughly identical: both are dense transformers. The only significant differences between 3.5 and 3 are the size of the model and whether it uses RLHF.


Yes, you're right about the compute. Let me make my point differently: GPT-3 and GPT-4 were, at release, the best models OpenAI could train, while GPT-3.5 was intentionally smaller than what they could have trained. I'm seeing it as GPT-3.5 = GPT-4-70b. So to estimate when the next "best we can do" model might be released, we should look at the gap between the releases of GPT-3 and GPT-4, not between GPT-4-70b and GPT-4. That's my understanding, dunno.
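Back-of-the-envelope, using the public dates (the GPT-3 paper in May 2020, the GPT-4 launch in March 2023) and naively assuming the cadence between "best we can do" models holds, which is obviously a one-data-point extrapolation:

```python
from datetime import date

# Public dates: GPT-3 paper (May 28, 2020), GPT-4 launch (March 14, 2023).
gpt3_release = date(2020, 5, 28)
gpt4_release = date(2023, 3, 14)

# Gap between consecutive frontier ("best we can do") models.
frontier_gap = gpt4_release - gpt3_release
print(f"GPT-3 -> GPT-4 gap: {frontier_gap.days} days (~{frontier_gap.days / 365:.1f} years)")

# Naive projection: assume the same cadence repeats (a big assumption).
next_frontier = gpt4_release + frontier_gap
print(f"Projected next frontier release: {next_frontier}")
```

That comes out to a gap of about 2.8 years, projecting to roughly the end of 2025, for whatever that's worth.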


GPT-4 only started training at roughly the same time as, or after, the release of GPT-3.5, so I'm not sure where you're getting "intentionally smaller" from.


Ah, I misremembered GPT-3.5 as being released around the time of ChatGPT.


oh you remembered correctly; those are the same thing

actually i was wrong about when gpt-4 started training; the time i gave was roughly when they finished