Hacker News

Why can't you use lookup tables?



Part of the point of this paper is that the lookup tables for base64 fit into a 512-bit vector register (the encoding alphabet, for instance, is 64 one-byte entries, which is exactly 512 bits).
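To make the size argument concrete, here is a minimal sketch in plain Python that models the idea rather than using real SIMD. It assumes the hardware offers a byte permute across a 512-bit register (VPERMB from AVX-512 VBMI behaves this way; whether the paper uses exactly that instruction is an assumption here). The names `TABLE`, `permute_bytes` and `split_sextets` are made up for illustration.

```python
import string

# The standard base64 alphabet: 64 entries of one byte each.
TABLE = (string.ascii_uppercase + string.ascii_lowercase
         + string.digits + "+/").encode("ascii")

REGISTER_BITS = 512
# 64 bytes * 8 bits = 512 bits, so the whole table fits in one register.
TABLE_FITS = len(TABLE) * 8 == REGISTER_BITS


def permute_bytes(table: bytes, indices: bytes) -> bytes:
    """Model of a byte permute: output lane i is table[indices[i] & 63].

    In hardware all lanes are looked up by a single instruction; the
    Python loop only models the per-lane result.
    """
    return bytes(table[i & 0x3F] for i in indices)


def split_sextets(data: bytes) -> bytes:
    """Split input (length a multiple of 3) into 6-bit values."""
    out = bytearray()
    for k in range(0, len(data), 3):
        word = int.from_bytes(data[k:k + 3], "big")  # 24 bits
        out += bytes((word >> shift) & 0x3F for shift in (18, 12, 6, 0))
    return bytes(out)


encoded = permute_bytes(TABLE, split_sextets(b"Man"))
```

The lookup works as a single register operation only because the table is no larger than the register; a 256-entry or 64K-entry table would not fit.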


Notably, this technique only works in this special case (or with other small LUTs). Generic (larger) lookup tables can't be vectorized yet.


Because table lookups don't vectorize. You could try a vectorized gather (VPGATHERDD [0]), but in my experience so far it has been just as slow as scalar code. (Then again, gather instructions only operate on 32-bit or 64-bit elements, not bytes, so they wouldn't help here anyway.)

So to perform multiple operations in parallel per core (SIMD), you'll have to write some code to perform the "lookup" transformation instead.

[0]: https://www.felixcloutier.com/x86/vpgatherdd:vpgatherqd
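As an illustration of "writing code to perform the lookup", here is a minimal sketch, again modelled in plain Python, of one common way to replace the base64 encoding table with comparisons and additions. This is a well-known branch-free formulation, not necessarily what any particular library does, and the name `sextet_to_ascii` is hypothetical. In real SIMD code each comparison would produce a per-lane mask and each addition would apply to all lanes at once.

```python
def sextet_to_ascii(v: int) -> int:
    """Map a 6-bit value to its base64 ASCII code without a table.

    Each comparison contributes a correction to the offset, so the
    same instruction sequence runs for every input value.
    """
    offset = 65                 # 0..25  -> 'A'..'Z' (v + 65)
    offset += (v >= 26) * 6     # 26..51 -> 'a'..'z' (v + 71)
    offset += (v >= 52) * -75   # 52..61 -> '0'..'9' (v - 4)
    offset += (v >= 62) * -15   # 62     -> '+'      (v - 19)
    offset += (v >= 63) * 3     # 63     -> '/'      (v - 16)
    return v + offset


def encode_lanes(sextets: bytes) -> bytes:
    """Apply the arithmetic mapping to every lane (the loop models SIMD)."""
    return bytes(sextet_to_ascii(v) for v in sextets)


alphabet = encode_lanes(bytes(range(64)))
```

The trade-off is a handful of compare and add instructions per vector instead of one memory access per byte, which is what makes the arithmetic version attractive when no in-register permute is available.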



