There are now encodings even more efficient than VLQ, but they blur the line between encodings and compression algorithms. Most propose trading efficiency in encoding 7-bit characters for the ability to squeeze a few thousand common Chinese characters into 16 bits.
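For reference, here is a minimal sketch of plain VLQ applied to a single code point, so it is clear what the baseline costs. The function names `vlq_encode` and `vlq_decode` are my own, and this is the generic big-endian VLQ layout (7 payload bits per byte, high bit set on every byte except the last), not any particular text encoding.

```python
def vlq_encode(n: int) -> bytes:
    """Encode a non-negative integer as big-endian VLQ."""
    if n < 0:
        raise ValueError("VLQ only encodes non-negative integers")
    groups = [n & 0x7F]              # last byte: continuation bit clear
    n >>= 7
    while n:
        groups.append((n & 0x7F) | 0x80)  # earlier bytes: continuation bit set
        n >>= 7
    return bytes(reversed(groups))


def vlq_decode(data: bytes) -> int:
    """Decode the first VLQ value found at the start of data."""
    value = 0
    for b in data:
        value = (value << 7) | (b & 0x7F)
        if not b & 0x80:             # continuation bit clear: last byte
            break
    return value


# An ASCII character takes 1 byte, a common Chinese character takes 3.
print(len(vlq_encode(ord("A"))), len(vlq_encode(ord("中"))))
```

So with plain VLQ a character like U+4E2D costs three bytes, which is the cost the 16-bit proposals are trying to cut.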
The idea I heard was to always encode in 4-byte blocks and use some form of delta encoding. Some variations allow character position lookup in better than O(log N) time. And given that you can feed 32-bit-wide data into NEON/SSE, and blocks are always 32-bit aligned, you can get that working faster than UTF-8.
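To make the block idea a bit more concrete, here is a rough sketch of what such a scheme could look like. To be clear, the layout is entirely my own invention for illustration and not any of the actual proposals: each 32-bit word carries a 2-bit tag followed by either one absolute code point, two signed 15-bit deltas, or three signed 10-bit deltas, where every delta is relative to the previously decoded character. The names `encode_blocks` and `decode_blocks` are hypothetical too.

```python
import struct

# Hypothetical 2-bit tags stored in the top bits of every 32-bit block.
TAG_ABS, TAG_2X15, TAG_3X10 = 0, 1, 2
_MODES = ((TAG_3X10, 3, 10), (TAG_2X15, 2, 15))  # (tag, chars, bits per delta)


def _fits(delta: int, bits: int) -> bool:
    """True if delta is representable as a signed integer of the given width."""
    return -(1 << (bits - 1)) <= delta < (1 << (bits - 1))


def encode_blocks(text: str) -> bytes:
    cps = [ord(c) for c in text]
    words, prev, i = [], 0, 0
    while i < len(cps):
        for tag, count, bits in _MODES:
            chunk = cps[i:i + count]
            deltas = [b - a for a, b in zip([prev] + chunk, chunk)]
            if len(chunk) == count and all(_fits(d, bits) for d in deltas):
                word = tag
                for d in deltas:  # pack two's-complement deltas after the tag
                    word = (word << bits) | (d & ((1 << bits) - 1))
                break
        else:
            count = 1  # fall back to one absolute code point per block
            word = (TAG_ABS << 30) | cps[i]
        words.append(word)
        prev = cps[i + count - 1]
        i += count
    return struct.pack("<%dI" % len(words), *words)


def decode_blocks(data: bytes) -> str:
    out, prev = [], 0
    for (word,) in struct.iter_unpack("<I", data):
        tag = word >> 30
        if tag == TAG_ABS:
            prev = word & 0x1FFFFF
            out.append(prev)
            continue
        if tag not in (TAG_2X15, TAG_3X10):
            raise ValueError("unknown block tag")
        count, bits = (2, 15) if tag == TAG_2X15 else (3, 10)
        for k in range(count - 1, -1, -1):
            d = (word >> (k * bits)) & ((1 << bits) - 1)
            if d >= 1 << (bits - 1):  # sign-extend the delta
                d -= 1 << bits
            prev += d
            out.append(prev)
    return "".join(map(chr, out))


print(len(encode_blocks("hello world!")))  # 16 bytes, versus 12 in UTF-8
```

Note this toy version makes ASCII bigger than UTF-8, which is exactly the trade-off mentioned above. It also doesn't include any SIMD or position index; a real design would need at least a side index (or periodic absolute blocks) before the fast position lookup becomes possible.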