GPT-J is 6B and comes pretty close. Also practically I haven’t noticed a difference.
Keep in mind there are also closed source alternatives: for example, AI21’s Jurassic-1 models are comparable, cheaper, and technically larger (albeit somewhat comically, 178B instead of 175B parameters).
Keep in mind there are also closed source alternatives: for example, AI21’s Jurassic-1 models are comparable, cheaper, and technically larger (albeit somewhat comically, 178B instead of 175B parameters).