Hacker Newsnew | past | comments | ask | show | jobs | submitlogin

The flattened image patch of width and height PxP pixels gets multiplied with a learnable matrix of dimension P^2xD where D is the size of the patch embedding. In other words, it’s a linear transformation that reduces the dimensionality of the image patch.


Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: