Reminds me of Anthropic's recent work on identifying sets of neurons that correlate with various semantic concepts in Claude: https://news.ycombinator.com/item?id=40429540 ("Scaling Monosemanticity: Extracting Interpretable Features from Claude 3 Sonnet")