Among arXiv publications, there are 217 results that contain "large language model" in the full text and "from scratch" in the title or abstract.
There are 2,873 results that contain "large language model" in the full text and "pretrained" in the title or abstract. That is a roughly 13x difference in publication count, which does make one approach feel more common than the other.
I'd need to get into more involved queries to break down the semantic categories of those papers.
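Such follow-up queries could be scripted against the public arXiv API. A minimal sketch, with the caveat that the API only searches metadata fields (`ti:`, `abs:`, `all:`) and does not support full-text search, so its counts will differ from the full-text web-search numbers above:

```python
"""Hedged sketch: counting arXiv results for phrase queries via the arXiv API.

Caveat: the API searches metadata only (title, abstract, etc.), not full text,
so these counts only approximate the full-text web-search figures.
"""
import xml.etree.ElementTree as ET
from urllib.parse import urlencode

ARXIV_API = "http://export.arxiv.org/api/query"

def count_url(query: str) -> str:
    """Build a request URL whose Atom response carries the total hit count."""
    # max_results=0 requests no entries, just the opensearch:totalResults tag.
    return f"{ARXIV_API}?{urlencode({'search_query': query, 'max_results': 0})}"

def total_results(atom_xml: str) -> int:
    """Extract opensearch:totalResults from an Atom response body."""
    ns = {"opensearch": "http://a9.com/-/spec/opensearch/1.1/"}
    root = ET.fromstring(atom_xml)
    return int(root.find("opensearch:totalResults", ns).text)

# Example: papers with "pretrained" in the title or abstract.
url = count_url('ti:"pretrained" OR abs:"pretrained"')
```

Breaking down semantic categories would still require fetching and inspecting the entries themselves, not just the counts.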