> I think the main issue here is the computational cost, as - if I understand correctly - you basically have to do training for each concept you want to learn. Are pretrained embeddings available anywhere for common words?

The base SD model should already cover all the common words. The goal of this approach is to capture a new concept that doesn't exist visually or textually in the dataset, for example your own face or a character you designed yourself. Note that this might not always be possible: the training corpus or the size of the model might not hold enough information to represent certain concepts, or at least to represent them in detail. E.g. if you give it pictures of your dog, the generated images might not look quite like your dog, even though those details were present in the pictures you provided.
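To make that limitation concrete, here is a toy sketch in plain Python. It is only an analogy, not Stable Diffusion code: a small fixed matrix stands in for the frozen pretrained model, and the only thing being trained is the embedding vector. All names (`FROZEN_WEIGHTS`, `learn_embedding`, etc.) are made up for illustration.

```python
# Toy analogy only: a frozen linear "generator" stands in for the pretrained
# model. Every name here is hypothetical; none of this is a real SD API.

def generate(weights, embedding):
    """Frozen model: maps an embedding vector to an output vector."""
    return [sum(w * e for w, e in zip(row, embedding)) for row in weights]


def learn_embedding(weights, target, steps=2000, lr=0.05):
    """Gradient descent on the embedding only; the weights never change."""
    embedding = [0.0] * len(weights[0])
    for _ in range(steps):
        output = generate(weights, embedding)
        residual = [o - t for o, t in zip(output, target)]
        # Gradient of 0.5 * ||W e - t||^2 with respect to e is W^T (W e - t).
        grad = [
            sum(weights[i][j] * residual[i] for i in range(len(weights)))
            for j in range(len(embedding))
        ]
        embedding = [e - lr * g for e, g in zip(embedding, grad)]
    return embedding


def error(weights, embedding, target):
    """Euclidean distance between the model's output and the target."""
    output = generate(weights, embedding)
    return sum((o - t) ** 2 for o, t in zip(output, target)) ** 0.5


# This frozen model can only produce outputs whose third component is zero.
FROZEN_WEIGHTS = [[1.0, 0.0], [0.0, 1.0], [0.0, 0.0]]

representable = [0.5, -1.0, 0.0]    # lies inside the model's output space
unrepresentable = [0.5, -1.0, 2.0]  # third component is out of reach

err_representable = error(
    FROZEN_WEIGHTS, learn_embedding(FROZEN_WEIGHTS, representable), representable
)
err_unrepresentable = error(
    FROZEN_WEIGHTS, learn_embedding(FROZEN_WEIGHTS, unrepresentable), unrepresentable
)
print(err_representable, err_unrepresentable)
```

The first error goes to essentially zero, while the second stays stuck at the part of the target the frozen model cannot produce, no matter how long you train the embedding. That is roughly the situation with the dog pictures: the detail is in your data, but the frozen model may have no way to express it.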

If you want personalization that is also highly detailed, you'll have to fine-tune the model itself on your own concepts. Google has described how they did their own fine-tuning, which they call DreamBooth [1].

[1] https://dreambooth.github.io/