Among arXiv publications, there are 217 results that contain "large language model" in the full text and "from scratch" in the title or abstract.
There are 2,873 results that contain "large language model" in the full text and "pretrained" in the title or abstract. That is a roughly 13x difference in publication count, which does make one approach feel more common than the other.
I'd need to get into more involved queries to break down the semantic categories of those papers.
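Such follow-up queries could be scripted against the public arXiv API. A minimal sketch, with the caveat that the API only searches metadata fields (`ti:`, `abs:`, `all:`) and does not support full-text search, so its counts will differ from the full-text web-search numbers above:

```python
"""Hedged sketch: counting arXiv results for phrase queries via the arXiv API.

Caveat: the API searches metadata only (title, abstract, etc.), not full text,
so these counts only approximate the full-text web-search figures.
"""
import xml.etree.ElementTree as ET
from urllib.parse import urlencode

ARXIV_API = "http://export.arxiv.org/api/query"

def count_url(query: str) -> str:
    """Build a request URL whose Atom response carries the total hit count."""
    # max_results=0 requests no entries, just the opensearch:totalResults tag.
    return f"{ARXIV_API}?{urlencode({'search_query': query, 'max_results': 0})}"

def total_results(atom_xml: str) -> int:
    """Extract opensearch:totalResults from an Atom response body."""
    ns = {"opensearch": "http://a9.com/-/spec/opensearch/1.1/"}
    root = ET.fromstring(atom_xml)
    return int(root.find("opensearch:totalResults", ns).text)

# Example: papers with "pretrained" in the title or abstract.
url = count_url('ti:"pretrained" OR abs:"pretrained"')
```

Breaking down semantic categories would still require fetching and inspecting the entries themselves, not just the counts.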