My understanding is that they are both "LLMs" (large language models). That's the generic term you are looking for.

I don't think you can compare one LLM's weights to another's directly, because the weights are a product of how each model was trained. In theory (I don't actually know), Llama and ChatGPT may be using different source datasets, so you can't compare them like for like.
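
To give a feel for why a weight-by-weight comparison is not meaningful, here is a toy sketch in plain Python. It is my own made-up example, not how Llama or ChatGPT work internally: the `forward` function, the layer sizes, and the permutation are all assumptions chosen for illustration. It builds a tiny two-layer network, then makes a second copy with the hidden units reordered. The two copies should produce the same output even though their weight lists no longer match entry for entry.

```python
import random

def forward(x, w1, w2):
    """Tiny two-layer network: ReLU hidden layer, then a linear output."""
    hidden = [max(0.0, sum(w * xi for w, xi in zip(row, x))) for row in w1]
    return sum(w * h for w, h in zip(w2, hidden))

rng = random.Random(0)

# Model A: 3 inputs -> 4 hidden units -> 1 output (sizes are arbitrary)
w1_a = [[rng.uniform(-1, 1) for _ in range(3)] for _ in range(4)]
w2_a = [rng.uniform(-1, 1) for _ in range(4)]

# Model B: the same network with its hidden units shuffled into a new order
perm = [2, 0, 3, 1]
w1_b = [w1_a[i] for i in perm]
w2_b = [w2_a[i] for i in perm]

x = [0.5, -1.0, 2.0]
out_a = forward(x, w1_a, w2_a)
out_b = forward(x, w1_b, w2_b)

# Compare behaviour (with a small float tolerance) and raw weights
same_output = abs(out_a - out_b) < 1e-9
same_weights = w1_a == w1_b
print("same output:", same_output)
print("same weights:", same_weights)
```

If even two copies of the same tiny network can disagree position by position, then two separately trained models, possibly with different data, are even less likely to line up, which is why people usually compare models by what they output rather than by their raw weights.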