
So a conclusion would be to use numpy instead of for loops for number crunching.

But behold! This does not scale as well as one might expect.

Turning a slightly more complex elementwise computation into vectorized numpy calls produces a lot of memory overhead. Each vectorized numpy call does one basic operation (adding two operands, negating, taking the sine, ...) and allocates a whole new array for its result. The calls _can_ leverage parallel computation, but you run into cache issues because of the massive memory overhead.

A simple C loop that computes each element completely can be much more memory-local and cache-friendly. Turning it into a chain of numpy operations can destroy that locality.
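
A minimal sketch of the allocation pattern being described (the array names and the sin(a*b) + c expression are made up for illustration):

    import numpy as np

    n = 10_000_000
    a, b, c = np.random.rand(n), np.random.rand(n), np.random.rand(n)

    # Evaluated as a chain of separate ufunc calls; each step can materialize
    # a full-size temporary array before the next one runs:
    #   t1  = a * b        -> allocates n float64 values
    #   t2  = np.sin(t1)   -> allocates n float64 values
    #   out = t2 + c       -> allocates n float64 values
    out = np.sin(a * b) + c

A fused loop would instead compute sin(a[i] * b[i]) + c[i] in a single pass, so each element's working set stays in registers and cache.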

I think using Numba to compile a ufunc for numpy can be a better approach to that problem.
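
For example, Numba can compile the whole elementwise expression into a single ufunc (a sketch assuming the same expression as above; `fused` is a made-up name):

    import math
    import numpy as np
    from numba import vectorize

    # One compiled loop over the data: no intermediate arrays, and the
    # "parallel" target can spread the loop across cores.
    @vectorize(["float64(float64, float64, float64)"], target="parallel")
    def fused(a, b, c):
        return math.sin(a * b) + c

    n = 10_000_000
    a, b, c = np.random.rand(n), np.random.rand(n), np.random.rand(n)
    out = fused(a, b, c)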




> Turning a slightly more complex elementwise computation into vectorized numpy calls produces a lot of memory overhead. Each vectorized numpy call does one basic operation (adding two operands, negating, taking the sine, ...) and allocates a whole new array for its result.

I believe this is no longer quite correct within a single expression. In the last couple of years NumPy gained temporary elision, so for `a = x + y + z`, for instance, the buffer of the intermediate `x + y` is reused for the second addition instead of another full-size intermediate being allocated.


This is also what the 'numexpr' library does.
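
For instance (a sketch; the expression and array names are only illustrative):

    import numexpr as ne
    import numpy as np

    n = 10_000_000
    a, b, c = np.random.rand(n), np.random.rand(n), np.random.rand(n)

    # numexpr compiles the expression string and evaluates it in cache-sized
    # blocks, optionally across several threads, so no full-size temporaries
    # are allocated for the intermediate results.
    out = ne.evaluate("sin(a * b) + c")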
