It depends on your data and your embedding model. For example, I was able to quantize embeddings of English Wikipedia from 384 dimensions down to 48 dimensions of 7 bits each, and search still works great: https://www.leebutterman.com/2023/06/01/offline-realtime-emb...
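
For a rough sense of what a reduction like that can look like, here is a minimal sketch. It is not necessarily the method used in the linked post: it assumes PCA to go from 384 to 48 dimensions, followed by uniform 7-bit scalar quantization per dimension. The helper names (`fit_pca`, `quantize`, `dequantize`) are made up for illustration, and the random matrix is only a stand-in for real embeddings.

```python
import numpy as np

def fit_pca(X, k):
    """Return the mean and the top-k principal directions of X as a (d, k) matrix."""
    mean = X.mean(axis=0)
    # Rows of Vt are principal directions, ordered by decreasing variance.
    _, _, Vt = np.linalg.svd(X - mean, full_matrices=False)
    return mean, Vt[:k].T

def quantize(Z, lo, hi, bits=7):
    """Map each dimension linearly from [lo, hi] onto integers 0 .. 2**bits - 1."""
    levels = (1 << bits) - 1  # 127 for 7 bits
    scaled = (Z - lo) / (hi - lo)
    return np.clip(np.rint(scaled * levels), 0, levels).astype(np.uint8)

def dequantize(Q, lo, hi, bits=7):
    """Approximate inverse of quantize()."""
    levels = (1 << bits) - 1
    return Q.astype(np.float32) / levels * (hi - lo) + lo

# Assumption: random data standing in for 384-dimensional embeddings.
rng = np.random.default_rng(0)
X = rng.standard_normal((1000, 384)).astype(np.float32)

mean, components = fit_pca(X, 48)
Z = (X - mean) @ components              # reduced vectors, shape (1000, 48)
lo, hi = Z.min(axis=0), Z.max(axis=0)    # per-dimension quantization range
codes = quantize(Z, lo, hi)              # one 7-bit code per dimension

# Search: project the query the same way, then score against decoded vectors.
query = (X[0] - mean) @ components
scores = dequantize(codes, lo, hi) @ query
top10 = np.argsort(-scores)[:10]
```

Each code here sits in its own byte; packing the 48 codes into 42 bytes (48 x 7 bits) is left out. Whether recall survives this depends on the data and the model, so it is worth comparing `top10` against exact search on the original vectors before relying on it.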