I've been working on an autoencoder that converts the hidden states of transformer models into a spatial representation that can be visualized. It started at toy scale, but now I'm trying to scale it beyond my humble 3060. I'm using LLMs to help with torch and such, but they're limited when it comes to the details of tensor twiddling.
https://github.com/ristew/weightscan
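
For anyone wondering what "hidden states to a spatial representation" could look like in torch, here's a rough sketch of the general idea. It is not the code from the repo: the class name, the 768-wide hidden size, the 3-D bottleneck, the layer widths and the random stand-in data are all assumptions made up for illustration.

```python
import torch
from torch import nn


class HiddenStateAutoencoder(nn.Module):
    """Hypothetical sketch: squeeze hidden-state vectors down to a few
    spatial coordinates, then try to reconstruct the original vectors."""

    def __init__(self, hidden_dim: int = 768, spatial_dim: int = 3):
        super().__init__()
        # Encoder: hidden_dim -> spatial_dim (the coordinates you would plot)
        self.encoder = nn.Sequential(
            nn.Linear(hidden_dim, 128), nn.GELU(), nn.Linear(128, spatial_dim)
        )
        # Decoder: spatial_dim -> hidden_dim (reconstruction)
        self.decoder = nn.Sequential(
            nn.Linear(spatial_dim, 128), nn.GELU(), nn.Linear(128, hidden_dim)
        )

    def forward(self, h):
        # nn.Linear acts on the last dimension, so (batch, seq, hidden) works
        coords = self.encoder(h)
        return coords, self.decoder(coords)


def train_step(model, optimizer, hidden_states):
    """One gradient step on mean-squared reconstruction error."""
    optimizer.zero_grad()
    _, recon = model(hidden_states)
    loss = nn.functional.mse_loss(recon, hidden_states)
    loss.backward()
    optimizer.step()
    return loss.item()


torch.manual_seed(0)
# Assumption: random tensors stand in for real transformer hidden states,
# shaped (batch, seq_len, hidden_dim).
hidden_states = torch.randn(4, 16, 768)

model = HiddenStateAutoencoder()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
losses = [train_step(model, optimizer, hidden_states) for _ in range(20)]

coords, _ = model(hidden_states)
print(coords.shape)  # torch.Size([4, 16, 3]): one 3-D point per token
```

In practice the random tensor would be swapped for hidden states captured from an actual model, and the per-token `coords` are what gets plotted.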