I'm not sure how you could train theory of mind into the capabilities of just LLMs. It seems they would need something more complex than attention-based completion/prediction to model it efficiently.
Edit: By more complex I mean more deliberate / directed. Predicting the next word(s) is a concrete goal -- "training" a theory of mind sounds like "training" sentience. Sure... But how?
I’d say by training on a lot of (contextual situation, what is this person thinking) pairs. It’s the kind of content that will be grossly underrepresented in training data simply because it is “obvious” and normally goes unmentioned. That would probably be enough to bring audience awareness into the worldstate of the token-space.
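As a rough illustration of what those pairs could look like as supervised fine-tuning data, here's a minimal sketch. The record format and field names are invented for the example; the scenarios are classic false-belief setups, where the right answer depends on the character's belief rather than the true world state:

```python
# Hypothetical sketch of "(situation, what is this person thinking)" pairs
# as fine-tuning data. Format and field names are assumptions, not a real API.

pairs = [
    {
        "situation": (
            "Sally puts her marble in the basket and leaves the room. "
            "While she is gone, Anne moves the marble to the box."
        ),
        "question": "Where will Sally look for her marble?",
        "target": "In the basket, because she didn't see it being moved.",
    },
    {
        "situation": (
            "Tom's chocolate is moved from the drawer to the fridge "
            "while he is at school."
        ),
        "question": "Where does Tom think his chocolate is?",
        "target": "In the drawer.",
    },
]

def to_training_example(pair):
    """Flatten one pair into a plain prompt/completion record."""
    prompt = f"{pair['situation']}\nQ: {pair['question']}\nA:"
    return {"prompt": prompt, "completion": " " + pair["target"]}

examples = [to_training_example(p) for p in pairs]
for ex in examples:
    print(ex["completion"].strip())
```

The point of the false-belief framing is that next-token prediction on such pairs forces the model to track an agent's (possibly stale) belief state, which is exactly the signal that's missing when the answer is "obvious" and never written down.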
I don't know. Humor is a very high-dimensional space in itself and differs for each individual. I would be surprised if any LLM could be trained to be a consistently well-reviewed comedian. "Humor" is just such a nebulous metric.