
Perhaps someone can answer this: this is a one-year-old company. Does that mean the barriers to entry are low and replication is relatively simple?

The part of Meta research that worked on LLaMa happened to be based in the Paris office. Then some of the leads left and started Mistral.

Complex/simple is not really the right way to think about training these models; I'd say it's more arcane. Every mistake is expensive because it costs a ton of GPU time and/or human fine-tuning time. Take a look at the logbooks of some of the open-source/research training runs.

So these engineers have some value as they've seen these mistakes (paid for by Meta's budget).
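To put rough numbers on "expensive", here's a back-of-envelope sketch in Python. The inputs are illustrative assumptions, not figures from this thread: the ~82k A100-hours for a 7B-scale model is what the LLaMA paper reports, and the hourly rate is a ballpark cloud price.

    # Back-of-envelope cost of one pretraining run.
    # Assumptions (illustrative):
    #   ~82k A100-hours for a 7B-scale model (reported in the LLaMA paper),
    #   ~$2 per A100-hour for rented cloud GPUs.
    gpu_hours = 82_000
    usd_per_gpu_hour = 2.0

    full_run = gpu_hours * usd_per_gpu_hour
    print(f"Full run:       ~${full_run:,.0f}")          # ~$164,000

    # A bug caught 30% of the way through still burns real money,
    # which is why engineers who have already seen these failure
    # modes are valuable.
    print(f"Aborted at 30%: ~${0.30 * full_run:,.0f}")   # ~$49,200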


The main barrier right now is access to supercompute and knowing how to run it; everything else is standardising quickly in the space.
