
Julia covers your use cases and is overwhelmingly fast.



My gripe with Julia etc. as replacements, is that Python is duct tape. I don't need fast duct tape, I need duct tape that is understood and used by essentially everyone I work with, and that has native, fast handling of large amounts of data (NumPy, Pandas). Good user experience as duct tape.
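
To make the "duct tape" point concrete (a minimal sketch; the specific numbers and column names are made up), the typical workflow is a few lines of Python orchestrating calls into compiled NumPy/Pandas kernels:

    import numpy as np
    import pandas as pd

    # The "duct tape" is a handful of Python lines; the heavy lifting
    # happens inside compiled NumPy/Pandas code.
    df = pd.DataFrame({"x": np.random.rand(10_000_000)})
    df["y"] = np.sqrt(df["x"]) * 2.0   # vectorized, runs in native code
    print(df["y"].mean())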

From my perspective Julia is sacrificing some amount of "duct tape UX" to gain speed, and that's the wrong direction.

Whenever we need more speed, we just pull the slow bits down into compiled languages, and scale them out to many cores with solutions like MPI.
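
As a rough sketch of that pattern (assuming mpi4py as the MPI binding; the "compiled" kernel here is just NumPy standing in for your own C/Fortran extension):

    # run with: mpiexec -n 4 python script.py
    import numpy as np
    from mpi4py import MPI

    comm = MPI.COMM_WORLD
    rank, size = comm.Get_rank(), comm.Get_size()

    # Each rank works on its own chunk; the hot loop is compiled code
    # (NumPy here, but it could be your own C/Fortran extension).
    local = np.random.rand(1_000_000)
    local_sum = float(np.sum(local))

    # Combine partial results across ranks.
    total = comm.reduce(local_sum, op=MPI.SUM, root=0)
    if rank == 0:
        print("total:", total)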

R is another language that is mainly duct tape for stringing pieces of compiled code together. If it had a better UX for developers than Python, I think we would see it dominating much more today, without having any speed advantage.


I agree with you right now, but think that Julia will end up dominating over the long-term, because dropping down into another language absolutely sucks for data scientists without engineering support.

Interestingly, R is probably a better UX for statisticians/data scientists than Python is (almost all the good parts of Numpy/Pandas were in R first), but it really suffers from not being well known by developers.

To be fair to R though, it's much, much easier to deploy than Python, which is a shocking indictment of the current Python packaging ecosystem.


> Whenever we need more speed, we just pull the slow bits down into compiled languages, and scale them out to many cores with solutions like MPI.

This only works sometimes--for problems that allow you to do a relatively large amount of computation in the compiled language to justify the cost of marshalling Python data structures into native data structures. For matrices of scalar values, this works well. For many other problems (consider large graphs of arbitrarily-typed Python objects, or even a dataframe on which you need to invoke a Python callback on each element), it doesn't. If you rewrite a big enough piece of your Python codebase in the compiled language, then it will work, but now you're maintaining a significant C/C++/etc code base, plus the bindings, plus the build/packaging system that knows how to integrate the two on all of your target platforms. Python really doesn't have a good answer for these kinds of problems, and they are by far the more common case (though perhaps not in data science specifically).
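
To make the callback case concrete (a minimal sketch; actual timings will vary), compare a vectorized operation with a per-element Python callback on the same column. The latter crosses the Python/native boundary once per row, so most of the compiled-code advantage evaporates:

    import numpy as np
    import pandas as pd
    from timeit import timeit

    df = pd.DataFrame({"x": np.random.rand(1_000_000)})

    # Stays inside compiled code: one boundary crossing for the whole column.
    vectorized = lambda: df["x"] * 2.0 + 1.0

    # Calls back into Python for every element: one crossing per row.
    per_element = lambda: df["x"].map(lambda v: v * 2.0 + 1.0)

    print("vectorized:", timeit(vectorized, number=10))
    print("per-element callback:", timeit(per_element, number=10))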


> From my perspective Julia is sacrificing some amount of "duct tape UX" to gain speed, and that's the wrong direction.

What particular language features of Julia make that trade-off? (Not a rhetorical question, I'm not disagreeing with you, just curious; I'm familiar with Python, not really familiar with Julia.)


Probably the usual complaints about package load and compilation time.

These aren't fundamental issues with the language, and they will be solved by tiered compilation (there's already an interpreter mode; it just has to be integrated with normal use), separate compilation, and incremental sysimage creation that works with the package manager.


Possibly - I'm keeping my eye on it. But I can't stand the MATLAB-esque syntax.



