I think the top-level takeaway here is not that Julia is a great language (although it is) and that they should use it for all the things (although that's not the worst idea), but that its design has hit on something that has made a major step forwards in terms of our ability to achieve code reuse. It is actually the case in Julia that you can take generic algorithms that were written by one person and custom types that were written by other people and just use them together efficiently and effectively. This majorly raises the table stakes for code reuse in programming languages. Language designers should not copy all the features of Julia, but they should at the very least understand why this works so well, and be able to accomplish this level of code reuse in future designs.
Interesting. Can you give an example of generic algorithms plus custom types, in practice? Off the top of my head I thought that any dynamic language or static language with good generics would have this property, but maybe there's something that Julia does differently.
As an additional example, I really like the combination of unitful and diffeq [1]. You're right that the core feature that allows this stuff is duck typing, but by itself it's not enough. Your notduck has to not only quack; other animals also have to look at it and treat it like a duck sometimes, and like a notduck when you want it to do more than a duck. Multiple Dispatch (plus parametric subtyping) allows you to trivially define both notduck + A (notduck.+(A) in OOP languages) and A + notduck (extending A whatever A is) and it's really fast. That allows for the core Julia concept of specialization, easily customizing the particular behavior of any agent at any point to get both the common behavior right and the extended behavior.
For static languages you can implement part of it with, for example, interfaces (you'll face the same restrictions if the language is single dispatch), but even if you can extend the interface freely for already existing objects, there must be an agreement between the multiple packages to comply with the same interfaces (and you might end up either with tons of interfaces, since there are tons of possible behaviors for each entity and purpose, or with giant interfaces to fit all). In Julia you can use specialization to surgically smooth the integration between two packages that had no knowledge of each other and didn't even decide to comply with any (informal) interface (which do exist in Julia, like the Julia Array and Tables.jl interfaces).
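To make the notduck point concrete, here is a minimal sketch. The `NotDuck` type and its field are made up for illustration; only `Base.:+`, `Real` and `sum` are real Julia names.

```julia
# Hypothetical wrapper type, invented for this example.
struct NotDuck
    value::Float64
end

# notduck + A and A + notduck are just two more methods of the same function `+`.
Base.:+(a::NotDuck, b::NotDuck) = NotDuck(a.value + b.value)
Base.:+(a::NotDuck, b::Real) = NotDuck(a.value + b)
Base.:+(a::Real, b::NotDuck) = NotDuck(a + b.value)

# Generic code that was written without any knowledge of NotDuck still works.
total = sum([NotDuck(1.0), NotDuck(2.0), NotDuck(3.5)])
mixed = 1 + NotDuck(2.0) + 3
```

Neither argument position is privileged here, which is the part that is awkward to express when `+` has to live inside one class.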
DiffEq on ForwardDiff Dual numbers to calculate sensitivity via forward mode AD.
DiffEq on Tracker's TrackedArray to calculate sensitivity via reverse mode AD.
Measurements.jl's numbers, which track measurement error, fed into any algorithm to compute the transformed measurement error after the algorithm is applied.
NamedDims.jl + Flux.jl to give everything that PyTorch's awesome Named Tensors feature gives.
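The Measurements.jl item can be illustrated with a toy version. This is not the Measurements.jl API: the `Approx` type, its fields, the `average` function and the first-order error rules (assuming independent errors) are all assumptions made for this sketch.

```julia
# Toy stand-in for an error-carrying number (NOT the real Measurements.jl type).
struct Approx
    val::Float64   # central value
    err::Float64   # absolute uncertainty
end

# First-order propagation rules, assuming independent errors.
Base.:+(a::Approx, b::Approx) = Approx(a.val + b.val, hypot(a.err, b.err))
Base.:*(c::Real, a::Approx) = Approx(c * a.val, abs(c) * a.err)

# A generic algorithm that knows nothing about Approx.
average(xs) = (1 / length(xs)) * sum(xs)

plain = average([1.0, 2.0])
noisy = average([Approx(1.0, 0.3), Approx(2.0, 0.4)])
```

The same `average` serves both inputs; the custom type only had to supply the operations the algorithm actually uses.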
I really like Julia a lot and actually used it in a work project a few years back.
However, there's the debugger issue. There are several debugger alternatives. It's tough to figure out which debugger is canonical (or is any of them the canonical debugger?). The one that seems to be used most at this point is Debugger.jl. However, it's exceedingly slow if you're debugging sizeable operations (matrix multiplies, for example) - I'm talking hitting 'n' to step to the next line and then waiting several minutes for it to get there. There's also Rebugger.jl, MagneticReadHead.jl (IIRC) and Infiltrator.jl among others. I finally found that Infiltrator.jl was a lot faster for that machine learning program I was trying to debug, but it's rather limited in features (the only way to set breakpoints it seems is by editing your source, for example).
And this isn't the only case where there are multiple packages for achieving some task and you're not quite sure which one is the one that's the most usable. I think what the Julia community needs to do is maybe add some kind of rating system for packages so you can see which packages have the highest rating.
I wrote 2 of them.
MagneticReadHead.jl
and MixedModeDebugger.jl
MagneticReadHead is interesting as it is a purely compiled debugger via extensive source code transforms.
Same magic that powers Zygote.jl for AutoDiff.
Also same general concept as is behind Jax.
It has huge compile time overhead, so is not practically usable.
In that way it is the opposite of Debugger.jl, rather than being slow at runtime it is very slow at JIT compiling.
MixedModeDebugger.jl is a proof of concept.
A small Source Code Transform to allow the debugger to run entirely compiled until it is going to do something, then it swaps to interpreted just like Debugger.jl
Early benchmarks are extremely promising.
But it's not really hardened enough for use.
1. Debugger.jl is A LOT smoother if you run it in compiled mode, which is a checkbox in the Juno interface. I've found that stepping to next line is instant in compiled mode, but takes forever without it.
2. Infiltrator.jl is great at what it's designed for, which is to dump you in a REPL deep within a call stack and let you see what's going on. But, Debugger in compiled mode also does this well.
> 1. Debugger.jl is A LOT smoother if you run it in compiled mode, which is a checkbox in the Juno interface. I've found that stepping to next line is instant in compiled mode, but takes forever without it.
Is there a way to do this if I'm not running Juno? I'd guess there must be some parameter that can be passed to @enter or @run?
It means that if you would have hit a breakpoint in code that is run in compiled mode, the breakpoint doesn't trigger—because it's being run normally at full speed without breakpoints, not being interpreted in the debugger (which knows about breakpoints).
I read this as in compiled mode breakpoints don't trigger at all. Is that correct?
Edit: Ok, I tried out compiled mode and it does stop at my breakpoint. The verbiage in the documentation is a bit difficult to understand on this point. I'd guess you need to first set your breakpoints prior to going into compiled mode?
There are parts of Julia I really like but it has some problems.
* Multiple dispatch is an odd design pattern that seems to over complicate things. I know there are people that love it and claim it’s better, but after working with it for some time I just want a struct with methods. It’s much easier to reason about.
* The packaging is cumbersome. I understand their reasoning behind it but in practice it’s just more of a pain than denoting functions as public one way or another.
* The tooling is poor. I work with Go for my day job and its toolchain is an absolute pleasure. The Julia toolchain isn’t in the same arena.
* The JIT is slowwww to boot. I was amazed the first time running a Julia program how slow it was. You could practically compile it faster.
* Editor support has never been great.
* As others have mentioned type checks don’t go deep enough
I think it has some neat ideas and in certain scientific arenas it will be useful, but IMO they need to focus a bit more on making it a better general purpose language.
While I don't agree with many of these points, I do agree that some of these can be substantially improved. We continue to work hard at it. Some are research problems, while others need elbow grease.
Just to present the other side, here's a recent thread on Julia discourse about why people love Julia. Many chiming in there are recent users of Julia and I think it is insightful.
Yea I really wanted to like Julia overall and many of the parts I like about it are on this thread. I think it's apparent we need a better numeric language than python, I just wish Julia would focus a bit more on utility.
I've been using Julia along with python and Pytorch, not yet for machine learning until flux is more mature but for NLP scripts, and I have to say that I'm starting to like it. Multiple dispatch, linear algebra and numpy built in, dynamic language but with optional types, user defined types, etc.
I hope Julia will be more popular in bioinformatics. Personally, I have high hopes for BioJulia[1][2][3] and the amazing AI framework FluxML[4][5] + Turing.jl[6][7]. Apart from the speed, they offer some interesting concepts too - I recommend checking them out.
I will have to disagree. FluxML is indeed great, but it changes often and it does not support many of the advanced features of TensorFlow (neither is there a package that seamlessly works with Flux in order to support these features). It is getting there, but Tensorflow v2 is pretty great itself, and frequently faster. But FluxML might soon be as good or better.
Also, to be fair, FluxML is backed by a couple of people while Tensorflow is backed by megacorps, so it is already impressive how much they have done.
How much of the BioJulia stuff would you say currently works? It looks like a lot of repos have been created, and the scope is pretty impressive (looks like there are repos for everything from structural bioinformatics to population genetics), but a lot of them look to be basically empty (https://github.com/BioJulia/PopGen.jl), or have really scary looking issues (e.g. https://github.com/BioJulia/GeneticVariation.jl/issues/25).
% of the repos in the org on github? That number is lower than I'd like. % of the repos that are actively maintained? Much higher.
One of the great things about julia is that it's really easy to throw together a package and register it. One of the bad things about julia is how easy it is for those one-off projects or idea dumps to pollute the space. We could definitely do a better job labeling the repos that are no longer being maintained or that aren't actually ready for prime time. There's a tradition in julia of a lot of really functional libraries to stay < v1.0, because we all take semver seriously, and if the interface is still in a bit of flux, making the switch to 1.0 is a big deal (DataFrames.jl, looking at you). But it does make it hard for new users to distinguish between a super robust package and someone's weekend hobby.
Not my field, but at least some of it appears to be worked on seriously. This was an interesting recent blog post about making DNA-sequence processing go fast:
I've always found Julia a little lacking for bioinformatics, but I'm not doing ML. I have very high hopes for Nim, which I think has better performance potential than Julia across domains and can produce binaries.
Nim made me really sad when they gave up on multiple dispatch.
It has such potential as a language but I just can't imagine using one without multiple dispatch anymore
Julia is great. It’s significantly simpler than Python while also being much more expressive. It’s too bad the typing is only for dispatch, but hopefully someone will write a typechecker someday. I’ve found it surprisingly refreshing to not have to think about classes and just add functions on objects wherever I want. Some languages solve this with monkey patching (which is bad), others like Scala with an extension class (reasonable, but you still don’t get access to private properties), but the Julia approach is cleaner.
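A small sketch of what "add functions on objects wherever I want" looks like in practice. The function name `shout` is invented for the example; the types it is defined on are ordinary Base types that the example does not own or modify.

```julia
# A new function defined for an existing Base type, outside that type's definition.
shout(s::AbstractString) = uppercase(s) * "!"

# Later, possibly in a different package, the same function gains another method.
shout(n::Integer) = shout(string(n))

greeting = shout("hello")
number = shout(42)
```

Nothing is patched onto `String` or `Int` themselves; the methods belong to the function `shout`.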
I wouldn’t use Julia for a non-scientific computing app as I don’t think it’s suitable, but for anything data science related, it’s great! And with the Python interop, I don’t really think there’s any reason not to use Julia for your next data science project. I suspect that over the next 5 years Python will no longer be used for these applications at all.
Python will still be used 20 years from now. The clear advantage of Python is the enormous ecosystem that is available, the millions of questions on SO giving solutions to every problem you can run into, the books and learning materials etc, programmers and corporations having invested loads of time and effort in building, maintaining and battle-testing libraries.
Don't get me wrong, I think Julia is an amazing language, but being an amazing language is neither necessary nor sufficient to succeed. R shows how you can succeed just fine with a kinda weird language.
I remember when Python was a new and exciting language and people said the exact same thing about Perl. "You'll never see Python replace Perl for string processing, Perl has such a huge ecosystem!". At the time Perl was the de facto interpreted/"scripting" language.
Sure Perl is around still, but it has become a rather niche language used in a small number of specific communities.
I use Python everyday, and still haven't had enough time to properly learn Julia, and I would certainly not be shocked to see Julia take over the lion's share of numeric computing from Python in 20 years.
Python replaced Perl in many niches, because Python was better suited to general tasks. For string processing, it wasn't as convenient, but there were so many other things where it was clearly a better choice. And being able to use a single "good enough" language for everything is itself a major convenience.
Julia is the opposite - it's better than Python in one particular narrow niche, and cannot replace it broadly. So it has to offer enough to justify going from one language to two.
> Julia is the opposite - it's better than Python in one particular narrow niche, and cannot replace it broadly. So it has to offer enough to justify going from one language to two.
Absolutely disagree. I will claim Julia is a BETTER general purpose language than Python. I am not an academic or researcher. I use Julia over Python because I find it better in almost every regard apart from startup time ;-)
Julia outdoes Python as a glue language. Integration with C/C++, Fortran, Python, R and the shell is awesome.
I have rewritten large bash shells as Julia code. Due to the need for having a language more broadly available on customer computers, I tried rewriting it in Python. I found that far more painful.
Package management is IMHO superior in Julia. REPL environment is superior. Expressiveness of language is superior. Nothing in Python matches the power of Julia macros. Multiple dispatch is insanely useful and dead easy to use. Python has nothing like it out of the box.
That you can more easily redefine types at runtime and full class hierarchies can offer some advantages at times. However I don't find that it makes up for the power of multiple dispatch and macros.
Don't make the mistake of thinking that Julia is only suitable for scientific work, because that is where it is currently used.
What truly makes the language general purpose isn't really language features. The vast majority of languages are "general purpose" by that metric - but you don't see people writing desktop apps in, say, R or PHP. It's the libraries. Python embraced "batteries included" a long time ago, and then there's everything on PyPI on top of that.
With a language like Julia, it's going to be a chicken-and-egg problem - so long as people are using it mostly for scientific computing, the stable and well-maintained libraries for it are mostly going to be about that.
I partly agree, but I think it is a bit more to it than that. Part of what made Perl, Python and Ruby so widely used compared to say LISP or Matlab is that they are great at text manipulation and interacting with the unix shell.
Julia is exactly the same. Anything people would write shell scripts for, I would use Julia for instead, and I have used many languages for this purpose: Python, Ruby and Go. Out of all these I find Julia to be the best language.
The fact that Julia is free, open source, available on all platforms and is good at text and shell stuff means it has a sort of Trojan horse ability to get into more wide usage.
I am not a scientific programmer, yet I use Julia for a whole bunch of stuff and find it very productive. Here are some examples of stuff I have written in Julia:
- Custom scripts for Swift and Objective-C app compilations.
- Various file conversion utilities.
- Code generators based on custom DSLs. Due to Julia's support for macros, writing a DSL is easier in Julia than in Python, Go or whatever "script" alternative I might have used.
- Various parsers. I find Julia very good at writing parsers. I have used this to create tools for manipulating user interface definition files for a larger C++ project.
- Editor plugins. I have written plugins in Julia for e.g. TextMate although I guess it would have worked on Sublime and other editors as well.
I cannot imagine anyone would have had a nice experience writing any of this stuff in R or Matlab.
Julia may never get big as a Web programming language, but scientific programming, data science, data scraping/munging/preparation and scripting make up quite a large field together.
Julia may have some slow startup due to JITing but I found my Julia scripts to run much faster than the shell scripts they replaced and the Julia code was easier to read and faster to write.
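For readers who have not seen Julia macros, here is a minimal sketch of the kind of building block a macro-based DSL starts from. The `@unless` macro and the `messages` variable are invented for this example and are not taken from any of the tools listed above.

```julia
# Hypothetical macro, invented for this sketch: run `body` only when `cond` is false.
macro unless(cond, body)
    return quote
        # esc() keeps the caller's expressions in the caller's scope.
        if !($(esc(cond)))
            $(esc(body))
        end
    end
end

messages = String[]

@unless isempty(messages) begin
    push!(messages, "skipped, because the list is empty")
end

@unless !isempty(messages) begin
    push!(messages, "ran, because the condition was false")
end
```

The macro receives the condition and the block as unevaluated expressions and returns new code, which is what makes custom syntax like this possible without a separate parser.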
Did you know that Java, for instance, was originally designed to run on interactive TVs? But then, Java runs on many types of devices and in a broad range of use-cases (maybe with the exception of TVs :-)).
The original niche for which a programming language is designed may or may not be indicative for what it is used in the end. Julia strikes a very good balance between performance, flexibility (macros) and ease-of-use (default type inference). I won't be surprised if there are other niches outside of scientific computing where these characteristics are desirable.
The biggest obstacle I see for the adoption of Julia outside of the scientific computing realm is the latency due to compilation. There are already third-party solutions for this (like PackageCompiler{X,}.jl), but imho they are not yet robust enough for widespread adoption. This might be about to change as different efforts join forces [1].
"Julia is the opposite - it's better than Python in one particular narrow niche, and cannot replace it broadly. So it has to offer enough to justify going from one language to two."
How so? Julia is a fine general purpose language. I'd go so far as to say it's more appropriate than Python for a wide class of problems!
In many ways, Julia is a better python than python, if only for the lack of the (quite insane in the 2000s) program structure by text formatting. That is a misfeature.
Julia is fast out of the box. You don't need Julia + some-other-language to get performance. You can use your GPUs fairly trivially from within the language (though this is implemented as FFI, it still has an idiomatic feel to it).
As for the previous comments on Perl, I am reminded of Mark Twain's famous retort. Rumors of its demise are greatly exaggerated.
Program structure by text formatting is a misfeature, though. What you do need though to make sure code remains readable is strong convention. I think actually that's the most unique feature of Julia (as also alluded to in the post) - how much of julia's ecosystem that works because of social convention (and conversation).
Purely anecdotal evidence, but Julia's support for basic HTTPS server capabilities is immature at best. Sure, it is capable, but the amount of effort to build a web API is much higher than in a language like Python which has had many more years to mature its ecosystem.
That's not to say I dislike Julia. I think it's a wonderful language for numerical computing. Over the next few years I certainly hope to see wider adoption among data science and analytics communities.
Yes, some things succeed. Most things don't succeed. If you bet on failure 100% of the time, you'll have your misses, but you'll hit more than you miss.
Python's ecosystem is great - but Julia's is growing incredibly fast, and in some cases Julia has already surpassed what is available in other languages (for example, take a look at the whole differential equations ecosystem: https://github.com/JuliaDiffEq).
Also, Python's ecosystem is only a 'pyimport(name)' away (using the PyCall.jl package). Same thing is true for R and a number of other languages (RCall.jl, JavaCall.jl, etc.)
I've been using SymPy, QisKit, matplotlib and other Python packages with no problem in Julia.
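As a minimal sketch of the `pyimport` route, assuming PyCall.jl is installed and can find a working Python, this calls Python's standard `math` module from Julia. The variable names are arbitrary.

```julia
# Requires the PyCall.jl package and a working Python installation.
using PyCall

# Import Python's standard `math` module as a Julia object.
math = pyimport("math")

# Python attributes and functions are reached with ordinary dot syntax.
x = math.sin(math.pi / 4)
```

Third-party Python packages are imported the same way, provided they are installed in the Python that PyCall is configured to use.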
I would be careful treating this too much like a popularity contest. Python will still be used 20 years from now, but that doesn't necessarily mean that it's going to be dominant 20 years from now.
Source: I used to do a lot of programming in various dialects of Basic, which, a bit over 20 years ago, was popular largely because of its own ubiquity and popularity. And, while I still maintain some Basic code, I was surprised how quickly it died. One day everything was being written in it. The next day, we were writing new things in a fancy new language that everyone agreed was more productive, and talking to the existing stuff through an FFI. And, a day later, we were replacing modules in order to get them off of the "legacy" platform.
>The clear advantage of Python is the enormous ecosystem that is available, the millions of questions on SO giving solutions to every problem you can run into
All of this is also available in Julia via its near-seamless Python interop. Its package manager even does a better job of managing Python packages than Python does.
> The clear advantage of Python is the enormous ecosystem that is available
Could have said that about Perl back in the day too, or a great number of languages.
The key problem Python faces competing against Julia in the long run is that Julia runs faster with less resources. By that I mean that because packages in Julia combine easier and are written all in Julia, it is simply much faster to develop equivalent functionality in Julia compared to Python.
Thus as Julia grows the speed of advancement will just keep growing. There is also a sort of asymptotic curve for most software. As you reach certain complexity advancing gets slower. Python due to its age has acquired a lot of cruft which will slow development down.
I know exactly how this feels from having worked on very similar software products of different age. The older software really had problems keeping up speed. The younger software moved ahead faster due to cleaner design. We needed far fewer people to add more features than the competition.
Python will struggle with old design decisions it can no longer undo. Look at e.g. the enormous amount of man hours required to get JIT compilation working in Python. It still does not work well. Meanwhile Julia has required less manpower to make a whole language with better JIT compilation.
Language design matters over time. I don't claim Julia will overtake Python any time soon. But over a long time frame I think it is inevitable, because legacy goes against Python in too many areas.
> R shows how you can succeed just fine with a kinda weird language.
One thing that most people don't appreciate about R is the subtle influence from lisp world. This makes it really feel like the language is optimized for data science down to the most basic syntax level.
I believe that R was originally an implementation of S in Scheme, because of the restrictive commercial licensing of S and the desire to have a free-as-in-beer alternative that was S-compatible. This comes through in things like R's metaprogramming:
Programming languages from 2000 remain, though. C, C++, Java, JS, Python, etc. Even Fortran, Cobol and Lisp remain in use. There have been attempts since the 80s to popularize visual and higher level approaches to programming, but the traditional languages still dominate. And the newer ones like Go, Elixir and Julia are like the traditional C, Lisp and Fortrans.
Python will still be around, but how popular it'll be is an open question. For example, Pascal or Perl were huge in their prime, and while they continue to stick around, the torch has been passed on.
Sure, but if Julia relegates Python to a niche and Rust does the same thing to C, they're still the same kind of approach to programming, and not some high level, mostly automated thing. They don't fundamentally change how people program.
Sure, some do, but past adoption is not a guarantee of future adoption. 20 years in the future is a long time. Just look at Flash. Who knows, maybe a Quantum OS will dominate that doesn't support Python.
So yes, if someone will still be using Python in 20 years, I'm sure you're right. But Python just as easily could be relegated to a niche domain while other languages take over a broader range of applications.
Especially if someone's able to build a programming translator where you can easily port your code base from one language to another.
>Sure, some do, but past adoption is not a guarantee of future adoption.
Not a guarantee but still the best predictor.
>20 years in the future is a long time. Just look at Flash.
Flash had a lot of things against it. Proprietary (aside from niche implementations nobody really cared about or used), single vendor, not owing its platform (browser vendors owned it), and not really that useful (aside from casual online games its main other use was small animations and intro pages).
And even then, it also took the combined effort of 3 browser vendors to kill it, and its public ousting from its mobile platform from the biggest company on Earth...
I'm more responding to the idea that programming might become simple and automated, and we can't even predict technological changes a year or two out. You're right that Python could be relegated to a niche language in 20 years. That does happen.
What hasn't happened is a fundamental change to programming languages since the 1960s, despite the incredible increase in computing power, ubiquity of computers and much better tooling and environments for programming languages. There's no evidence that this is going to change anytime soon. All the popular new languages are similar to the popular older languages. If people don't want to program in JS, they transpile from Typescript. Even web assembly is a means to use a language like Rust on the web.
There's no simple, automated PL on the horizon that is going to replace JS, Java, Python, etc. There isn't one for spreadsheets, either, which is a technology from the 80s.
> What hasn't happened is a fundamental change to programming languages since the 1960s
I think the rise of HLLs is an example of a fundamental change. It was still reasonable to be hand-writing assembler for lots of applications in the '60s.
Sure, but HLLs are decades old and were ubiquitous by the 80s. There's no reason to think PLs are going to fundamentally change in 20 years. They could, but it's just speculation at this point.
The argument over whether C was fast enough to obviate the need to write in assembly raged across the pages of Dr Dobb's Journal and Byte Magazine well into the mid '80s.
Which while true for home micros, was already a proven fact since the early 60's in the big warehouse mainframes, with the Algol derived systems programming languages.
And that supports the argument that PL evolution is slow and unlikely to fundamentally change in 20 years time. Anyway, while people were arguing over C vs Assembly, there were Lisp and Smalltalk machines that didn't become the future.
It won't; that is a naive view. Innovation can't keep accelerating forever, and I would be surprised if it even keeps up its current pace; far likelier it's a sigmoid curve - https://en.wikipedia.org/wiki/Sigmoid_function
I don't know if you're talking about like the heat death of the universe, and so like nothing is "forever", but I don't see signs of programming language innovation slowing down at all.
> the millions of questions on SO giving solutions to every problem you can run into
I definitely know what you mean. I too have the experience of typing in basically natural language queries into Google and have stack overflow turn it into copy-paste-able code for me.
But what's also true is that in 20 years the problems being solved will likely be bigger, and more complicated, and in many cases we'll be solving them on platforms that require good abstractions over multiple cores. So none of that is really in Python's wheelhouse. (Complexity is an issue because abstractions in Python almost always come at a cost.)
I've been using Julia for non-scientific computing programs for almost 5 years now, and (especially now that it is stable since the 1.0 release) have found it well suited for general programming as well.
Having a language that is easy to write (like Python), runs fast (like C/C++), and incredibly flexible & expressive (like Lisp) makes programming fun again!
* Container images are generally huge. Layer caching helps to the extent that layers are reusable. If you're lucky, you can use something like alpine for smaller huge images. If you write in a language that supports statically linked executables (e.g., Go) then you can get by with a scratch in many cases, but then you could more easily ship a naked executable
* Container images take a long time to build and build caching only works well if your dependency tree is linear (it's not). (Caveat: there are niche alternatives to docker build for which this may not hold, but then you're subject to the other issues with niche tools--support, docs, compat, etc)
* Your deployment target has to have a container runtime compatible with your image format. This is a PITA for end users in many cases.
* Most popular container runtimes make container processes children of the daemon process, not children of the client process. This means killing the parent process will not kill the child process, which means CI environments need to handle all sorts of edge cases. For example, a CI job is waiting on a stuck container to terminate, but eventually times out or the job is killed or similar; each time this happens a container is leaked--this means that CI tools have to support the container runtime explicitly and handle all of these edge cases because it's prohibitively complex for users to manage.
IMO the only advantage (and this is a huge advantage where applicable) for containers is that they can be orchestrated by things like Kubernetes or AWS Fargate.
> It’s too bad the typing is only for dispatch, but hopefully someone will write a typechecker someday
I was under the impression that if you write a function that takes very specific sub-types as its parameters, and then try to call it with invalid parameters (that is, values that cannot be converted to the right type via the promotion rules) that Julia will complain about this?
Ah, right! We're talking about AOT typechecking. I had forgotten Julia doesn't really do that (yet). Yes, fair is fair, since that is when it is the most useful its absence is a bit painful.
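A minimal sketch of the behaviour being discussed, with made-up function names (`half`, `double`, `thrown`). Julia does complain about a call with no matching method, but only when that call actually runs, and arguments are not converted automatically to make a method match.

```julia
# Hypothetical functions restricted to specific argument types.
half(n::Integer) = n ÷ 2
double(x::Float64) = 2x

# Helper that returns the exception a call throws, or `nothing` if it succeeds.
function thrown(f)
    try
        f()
        return nothing
    catch err
        return err
    end
end

ok = half(10)                            # a matching method exists
no_method = thrown(() -> half("ten"))    # no method for String
no_convert = thrown(() -> double(1))     # an Int is not converted to Float64 for dispatch
```

Both failing calls produce a `MethodError` at run time; defining `half` and `double` and writing the bad calls inside the anonymous functions raises nothing until they are invoked.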
I didn’t express that correctly. What I meant to say was that I investigated Julia as a possible “use it for everything language.”
For me, general purpose has to include deep learning tools.
I really prefer Lisp and Haskell, but although Haskell’s TensorFlow support is pretty good, I found it easier to just use Python. I have had problems with the SBCL Common Lisp TensorFlow C bindings, but perhaps that was my fault.
AFAIK Clojurians are doing some interesting stuff in that space. They've figured out Python Interop, writing books on Deep learning, Linear Algebra, etc. all that in Clojure.
A little of both, I guess. The biggest issue is that I, personally, get uncomfortable with dynamic typing when the project starts becoming over 1-5 Kloc. So I tend to stick with stuff like Scala, Rust, or OCaml for non data science tasks.
What part of Julia isn't lispy enough that you don't consider it a lisp? I'm not saying you're wrong just because there's a Scheme in there, just curious.
Yes - I understand why some people are keen on static types, but type inference is a powerful way to create functionality and I believe that the separation of concerns between the dispatcher and the programming code (removing the logic of decisioning from the code and using type inference instead) leads to clearer programs. I can't think how you can have that and have static checking as well (although I know that there are some compilers for some Julia code as mentioned in this thread)
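Assuming the comment is describing dispatch on the runtime types of the arguments, the "decisioning moved out of the code" idea can be sketched like this. The `Shape` types and both area functions are invented for the example.

```julia
abstract type Shape end

struct Circle <: Shape
    r::Float64
end

struct Rect <: Shape
    w::Float64
    h::Float64
end

# Decision logic written by hand inside the function body...
function area_branchy(s)
    if s isa Circle
        return pi * s.r^2
    elseif s isa Rect
        return s.w * s.h
    else
        error("unknown shape")
    end
end

# ...versus one small method per type, with the dispatcher choosing between them.
area(s::Circle) = pi * s.r^2
area(s::Rect) = s.w * s.h

shapes = Shape[Circle(1.0), Rect(2.0, 3.0)]
total_area = sum(area, shapes)
```

Adding a new shape to the second version means adding a type and a method, without editing any existing function.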
Lots of languages with static typing use type inference. Haskell, OCaml, Go, Rust, Nim, etc. Even C++ has auto these days which I think when combined with templates gives you what you're talking about though it is a bit clunky.
EDIT. For instance, in Haskell I can just declare a function to sum the things in a container by saying
summer a = foldr (+) 0 a
And the compiler will figure out that the function accepts a Foldable container of some sort full of things that are Nums, without my ever having to tell the compiler that (though it's a good idea for maintainability). And then I can just pass in a list of floats and the compiler will Do The Right Thing. And if I pass in a list of booleans instead, the compiler will yell at me at compile time because there's no (+) operation for booleans.
I think he possibly phrased it wrong. Dynamic languages don't do type inference, they carry type with each value at runtime.
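To make "carry type with each value at runtime" concrete, here is a minimal sketch in Python, used only as a familiar dynamic language and not as Julia code. The names `describe`, `first`, `second` and `third` are made up for the illustration.

```python
# In a dynamic language the type belongs to the value, not to the variable.
def describe(value):
    # type() reads the tag that the value itself carries at run time.
    return type(value).__name__

x = 42
first = describe(x)
x = "forty-two"      # rebinding the same name to a value of another type is fine
second = describe(x)
x = [4.0, 2.0]
third = describe(x)

print(first, second, third)
```

Nothing here infers a type for the name `x`; each call simply inspects whatever value is currently bound to it.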
Dynamic and statically typed languages each have their own wonderful advantages. There is just no way of replicating all the advantages of dynamic typing in a statically typed language.
It is hard to sum up why in a short sentence but I have tried to articulate it in a longer article here:
Keep in mind this focuses only on the advantages of dynamic languages. It does not mean I think static languages lack their own unique advantages, but that was not the topic of my article.
The primary problem with static typing as I see it is the complexity and mental overhead it adds. It is not just the complexity in the type system and the language itself, but also in the tooling surrounding it: build system, dependency management, binary interface definitions, debuggers, REPL etc.
The complexity makes meta programming very hard to do for a software developer of average intelligence.
Where I see a use for advanced static languages is in very specialized systems which need a high level of correctness and where a company is able to do the job with a small team of very bright people.
For me static typing removes mental overhead. I have fewer things to keep in my head, I can trust the tools to have my back and point me to where I did stupid mistakes. It makes reasoning about the code more local.
Also you can have static typing in an interpreted, interactive setting, with any language that has a REPL. That's even how the ML family started.
Peace of mind is not quite the same as what I meant with mental overhead. What I am referring to is that static type systems have far more complex type systems and semantics, which occupy more brain real estate.
And complexity tends to explode because it carries over to the tools. Static languages require far more complex tools to work with.
I did not mean static languages cannot have REPLs, but my experience with them has not been great. They seem hamstrung in a way I cannot articulate well.
Reading through that Medium post makes me think that maybe the author has never worked with a large code base written in a dynamic language and is attributing the difficulties of large projects to them being written in static languages? My day job involves a large amount of Python and we certainly use CMake to bundle it up for deployment, use YAML configuration files, worry about library version compatibility, and all of the other problems the author ascribes to static typing.
Yes... but I feel that is more like an implementation detail. Semantically speaking you don't think of Julia as doing type inference.
Semantically speaking, expressions don't have types and the types of expressions are never inferred (however, they are inferred at compile time as an implementation detail of the JIT).
My point is that you can make an alternative Julia implementation that did absolutely no type inference and everything would work just fine apart from possibly bad performance.
However you could not make a compiler for a statically typed language such as Haskell which did not do type inference. If you did programs would no longer compile.
At least that is how I interpret the difference. I could be wrong. You probably know this better than me.
Presumably this means that no one has implemented a suitable static type checker or its type annotations don't provide sufficient information for a static type checker to work in principle?
The problem is more the former (nobody has gotten around to it yet), but also a bit of the opposite of the latter, i.e. the type system is so rich, type-checking may be undecidable in too many contexts.
Julia's parametric types are incredibly rich. You can look up 'dependent types', which are similar in the static world, and see the sorts of hoops static language designers jump through to have dependent types and still be able to reliably prove theorems.
What you can do with Julia types is far more advanced than I think you can do with any statically typed language.
I can add a type annotation in Julia which is a function call which returns a type e.g. This function can be of any complexity.
I think a better approach is simply a linter. You report the type problems you can find and ignore the rest. There's no point trying to figure out the more advanced Julia cases.
I would assume 90% of Julia code will use types in a pretty straightforward manner and thus would be able to be analyzed by some sort of Linter. Lint.jl e.g.
You're describing optional type systems; in this regard, Julia is on par with Python or TypeScript. The linter you're describing is also a type checker (like Python's Mypy). That said, it would be really interesting if someone would make a type checker that would evaluate/resolve those more complex types, at least those that can be guaranteed to terminate.
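A small Python sketch of what "optional" means in practice, since Mypy is the comparison being made. The function `mean` and the variable `result` are invented for the example. The annotation is ignored when the program runs, and only a separate checker looks at it.

```python
from typing import List

def mean(xs: List[float]) -> float:
    # The annotations document intent; the interpreter does not enforce them.
    return sum(xs) / len(xs)

# A tuple is passed where a list is annotated. The call still runs,
# while a static checker such as Mypy is expected to flag the argument.
result = mean((1, 2, 3))
print(result)
```

This is the sense in which such a checker behaves like a linter: it reports what it can prove from the annotations and has no effect on execution.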
I wrote an article showing a (probably naive) way it could be done, and also discussing some of the limitations. In the end I think it would be better to think about this kind of tool as partial, automated tests rather than full-blown static type checkers.
You can do some form of type checking with stuff like Lint.jl. I mean, since Julia uses a bunch of type information and Julia parses code into data structures which are easy to process, you can do a lot of your own type verification before running a program.
However I see Julia not being a statically typed language as a feature. If Julia was statically typed you would pretty much have lost everything that makes Julia a great language.
You cannot do multiple dispatch with a statically typed language efficiently.
Trashing is a little harsh, but Julia devs probably were mostly Python devs back in the day and are intimately familiar with the inadequacies of the language: the work required when you had to drop into C, the bad syntax for math, the constant conversions between ndarray, array, and lists, etc etc.
At my old bioinformatics lab, Python literally wasted thousands and thousands of collective man-hours which would have been saved by Julia if it had existed at the time. Since a lot of these researchers couldn’t really program that well, they would write code that would literally take weeks to run. And then they would call me and I would rewrite it in C and it would then take an hour or two. Julia solves this problem.
The amount of unnecessary supercomputer time (and electricity) that our lab (and others) were wasting with Python was, honestly, disgusting.
Exactly... well I didn’t do supercomputing but still wasted a lot of time dealing with Python 2/3 issues among others. Then there’s the ability to do things like fast bootstrapping, Monte-Carlo and other handy statistical techniques without writing kernel functions in C. Python was great for its time and enabled a lot and still lets lots of people do great things, it’s just not for me anymore.
I haven't seen that at all - many Julia programmers are (or were) also Python programmers.
I think there is a lot of respect in the Julia community for Python & the Python ecosystem.
There have even been a number of Julia talks at various PyCons over the past few years.
Hrm ... I recall Pythonistas trashing Perl as line noise, and other languages as well, for many many years. I'm not saying it's right, just that karma, sometimes, comes back into focus.
Julia is a language I really wanted to like, and to a certain extent I do. However after spending some time working with it and hanging out on the Discourse channels (they certainly have a very friendly and open community, which I think is a big plus), I've come to the tentative conclusion that its application domains are going to be more limited than I would have hoped.
This article hits on some of the issues that the community tends to see as advantages but that I think will prove limiting in the long run.
> Missing features like:
> Weak conventions about namespace pollution
> Never got around to making it easy to use local modules, outside of packages
> A type system that can’t be used to check correctness
These are some of my biggest gripes about Julia, especially the last two. To these I would add:
* Lack of support for formally defined interfaces.
* Lack of support for implementation inheritance.
Together with Julia's many strengths I think these design choices and community philosophy lead to a language that is very good for small scale and experimental work but will have major issues scaling to very complex systems development projects and will be ill-suited to mission critical applications.
In short I think Julia may be a great language for prototyping an object detection algorithm, but I wouldn't want to use it to develop the control system for a self-driving car.
Unfortunately this means that Julia probably isn't really going to solve the "2 language problem" because in most cases you're still going to need to rewrite your prototypes in a different language just like you would previously in going from, for example, a Matlab prototype to a C++ system in production.
You touch upon some interesting pain points. I really like Julia and working with it is a pleasure.
Except the Module system, which feels unnecessarily arcane. I'm happy to be educated on why, but it seems to successfully combine the awkwardness of C-style #include with the mess of a free-form module system. The end result is a Frankenstein monster where technically everything is possible, everything could be included anywhere, there are no boundaries or even conventions. It makes for a frustrating experience for a newbie.
Say you have a package, and inside is a file called `xyz.jl`. You open the file and it defines a module called Xyz. But this tells you absolutely nothing about where in the package Xyz will appear. It could be included somewhere deep in the hierarchy, or it could be a submodule. It could be included multiple times in multiple places! That's bad design for sure, but the language places no boundaries on you. You open another file `abc.jl`, and see no modules at all, just a bunch of functions, which in turn call other functions that are defined God knows where. A julia file does not have to contain any information about where the symbols it's using come from, since it will be just pasted in verbatim to some location somewhere.
The whole module system feels like one big spaghetti of spooky action at a distance.
It's a shame too, because the rest of the language is very neat. Once one gets over the hurdle of the modules, it is possible to establish conventions to bring some sanity in there, but it's a hurdle that many people will probably not want to deal with.
It seems great to me that paths & source files are mostly irrelevant, you're free to re-organise without changing anything. And that `using Xyz` is always talking to the package manager. You can make sub-modules and say `using .Xyz`, but there's very little need to do so, and few packages do.
You can shoot yourself in the foot by including source twice, as you can by generating it with a macro, or simply copy-pasting wrong.
I mean, I'd like them to work like Python, Ruby and TypeScript, but you're right to say I can't describe why I want this.
Is there some guide I could read about structuring a large Julia project? It was pretty easy to intuit with Python, wherein I would put related files in a folder. But with Julia, everything is everywhere and I'm baffled.
I think that `/src/Name.jl` must have the main module, and `/test/runtests.jl` the tests. And the package manager cares about `Project.toml`. But beyond this there are no real rules enforced, although there really seems to be one way to do things.
Here's a much bigger project, organised the same way. `include("file.jl")` literally copies in the text, and it's somewhat conventional to collect all imports & exports in the main file:
Still no sub-modules. No files included in mysterious locations. Methods being defined for functions from elsewhere are all qualified, like `Base.show(io::IO, t::OptimizationState) = ...`
> But with Julia, everything is everywhere and I'm baffled.
This is exactly it. Julia allows you to import and include anything, anywhere. You open a file and it doesn't say anything about where the dependencies are coming from and where this particular piece of code will go. Both of those are defined at the place where this file is included, which itself could be anywhere. It could be a different directory, different file, tucked away in a module. It could be in a dozen other files, or no files at all, and you can't tell from looking at just the source of the file.
Languages like Python, Rust, C# or even Java have module systems that I find are more restrictive, but much easier to follow. You always have the pertinent information at hand. Each file containing code clearly tells you two crucial pieces of information:
1. Where the code fits in the greater picture
2. Where the dependencies of the code in a file come from
Python, whose module story is actually pretty poor, is still easier to follow than Julia, because it just matches the file/directory structure. You can reason about the hierarchy of a Python library by just navigating the directories. In a normal Python project, each file is one module and its dependencies are clearly specified as imports.
Rust relies on the file system as well, and has much better-defined rules than Python. I find this great, because the file system is already hierarchical and we are used to the way it works. When I open a file in a Rust project, I know immediately where it fits in the hierarchy, because it is implied by the file system. Rust gives you a bit more flexibility in that you can define submodules in each file.
C# & Java qualify the namespaces fully in each file. While the file structure is not as clear anymore at the file system level, a single file contains all the information necessary to determine where the code fits in and where its dependencies come from.
Now let's take Julia:
A single module will often be split across multiple files. Since they share a single namespace, the imports happen at the top level, where the module is defined and includes all its source files. When you open a source file, you have zero information about where all the functions and data types are coming from (or where they are going, for that matter).
I see the following pattern systematically emerge in Julia code:
- A function is defined in file A
- File A is `included` in file B, where it forms part of module X
- It is then imported into module Y in file C, but it is not actually used there
- It is finally used in file D, which is itself `included` in module Y in file C
The problem is that there is no link from file A to file B, or module X for that matter. File A could be part of a dozen modules, or zero. Neither is there a link between the usage of the function in file D and where it is coming from. You actually have to find all the places where file D is included, and then check which flavor of the function each location imports. The relationships are established at some indeterminate level in the hierarchy.
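The "which flavor does each location import" problem can be mimicked in a few lines of Python. To be clear, this is only an analogy that uses `exec` as a rough stand-in for textual inclusion, not a description of how Julia's `include` is implemented. `FILE_D`, `include_into`, `module_y` and `module_z` are hypothetical names.

```python
# The text of a hypothetical "file D": it calls describe() but never says
# where describe() comes from.
FILE_D = '''
def report(x):
    return describe(x)
'''

def include_into(namespace, source):
    """Evaluate the source text inside the given namespace and return it."""
    exec(source, namespace)
    return namespace

# Two different "modules" include the same text but supply different describe().
module_y = include_into({"describe": lambda x: "Y saw " + str(x)}, FILE_D)
module_z = include_into({"describe": lambda x: "Z saw " + str(x)}, FILE_D)

print(module_y["report"](1))
print(module_z["report"](1))
```

Reading `FILE_D` alone cannot tell you which `describe` will run; that is decided entirely at the place of inclusion.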
Again, don't get me wrong, this is just a wart on an otherwise very pleasant language. I wouldn't be complaining if I weren't using it.
That was helpful, especially realizing that part of the problem (at least as you see it) is that the "linking" is unidirectional.
Function definitions/calls are also unidirectionally linked. You can see at the call site which function is called, but you can't see at the function definition the references. But unlike a function, which might be called from many places, it really should be the case that a file is `include`d exactly once.
Author of post here:
These were a major gripe for me starting out too.
It took me a fair while to conclude that they allowed for useful things in the bigger picture.
and it certainly is not a pure win.
I do miss static typing.
I would question the claim it doesn't scale to production.
I know people who have built hugely complex production systems in Perl that are still running today, 20 years later.
Further, I myself work on what we believe to be the largest closed source Julia code base, in terms of number of contributors, number of packages and total size. (It's also pretty large in general, though I have yet to work out how it stacks up against the DiffEq-verse.)
And I have seen things go from research prototype to running in production.
It works.
I am not going to deny though there are advantages to other languages.
There are many trade-offs in the world
Perl's just another language I've known. Maintaining a 20yo codebase in a language that people find distasteful sounds like comfortable job security, to me.
Most programmers want to build new things, not work on maintenance projects, regardless of tech stack.
That said, many people do enjoy that sort of work and finding them is not unusually difficult unless the project is so big that you need dozens of bodies.
People write large-scale systems in dynamically-typed languages all the time. Multiple dispatch and macros make clean scaling easier than it would be in most other dynamic languages. Its competitors in numerical performance are C/C++ and Fortran, which are both minefields (C much more so). Julia is definitely safer in practice than these kinds of languages with weak, unsafe type systems.
I'm not saying static types don't have benefits as well, but it would also be very against the design goals as a Matlab/R competitor.
Inheritance would also directly clash and overlap with multiple dispatch, which is strictly more powerful.
> I'm not saying static types don't have benefits as well,
It's funny; formerly I was a die-hard fan of static typing, but, lately, my opinions have become more nuanced. Something more along the lines of, "Yeah, I'd never want to just remove the type annotations from my Java code and then try to maintain the result, but some dynamic languages allow me to have a fraction as much code to maintain in the first place."
I'm also beginning to wonder if my feelings about dynamic languages have been unduly influenced by some particularly popular, and also particularly undisciplined, dynamic languages. JavaScript and PHP, for example.
> It's funny; formerly I was a die-hard fan of static typing, but, lately, my opinions have become more nuanced. Something more along the lines of, "Yeah, I'd never want to just remove the type annotations from my Java code
That might be one of the issues. Even when it comes to static typing java is high investment low rewards.
Things are admittedly less bad than they were 15 years ago, but it remains that Java has very high overhead both syntactically and at runtime and yet is pretty limited in its static expressivity. So when you do away with Java for a dynamically typed language, most of what you lose from missing static types you gain back from having so much less LOC, ceremony and architectural astronautics to deal with.
In theory you could probably write terser Java but that’s not what the community does or what the ecosystem encourages, so if you do you’re on your own, and then why keep using Java? There are plenty of better languages with no community and no ecosystem out there. And they have way, way better type systems.
> Even when it comes to static typing java is high investment low rewards.
My sense is that it's because Java isn't really all that static a language. In terms of the syntax, sure. But it also relies very heavily on run-time reflection, run-time type casting, and run-time type checking. Since the introduction of generics, there are even some situations where the types are known and declared statically in the source code, but the compiler is unable to do static type verification, and so the type checking only happens dynamically at run time.
Meaning that you kind of get the worst of both worlds: High-ceremony programming, but not a whole lot of help from the compiler in return for it.
Anyway, yeah, that does mean that the debate does get more hair-splitty if we're talking Haskell vs. Racket. But, realistically, that's not usually the languages people have in mind when they're talking static vs dynamic - it's much more likely to be Java or Python. Both of which, like most languages, fall short of being the ideal examples of their respective type checking approaches.
> My sense is that it's because Java isn't really all that static a language.
All the stuff that happens at runtime explains the "low rewards" part, but it doesn't explain the "high investment", Java is also an extremely verbose language, doing anything is verbose and the one thing you'd want to be doing (create and use new types) is one of the most verbose parts of the language.
Records will eventually make things less bad, but a "newtype pattern" in Java takes a dozen lines or five, that's horrendous for something which should take maybe two lines. And that's before we even consider the insanity of the "one (public) class per file" mandate.
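For a sense of scale on the "should take maybe two lines" remark, this is roughly what a newtype looks like with Python's optional typing. It is shown only as a point of comparison, since Python is the other language named in this subthread. `UserId` and `lookup` are made-up names.

```python
from typing import NewType

# One line declares a distinct type for static checkers.
UserId = NewType("UserId", int)

def lookup(user: UserId) -> str:
    return "user-" + str(user)

uid = UserId(7)   # at run time this is just the int 7
print(lookup(uid))
```

The distinction exists only for a static checker; at run time `UserId(7)` is an ordinary `int`, so this is a cheaper but also weaker guarantee than a real wrapper class.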
> But, realistically, that's not usually the languages people have in mind when they're talking static vs dynamic - it's much more likely to be Java or Python. Both of which, like most languages, fall short of being the ideal examples of their respective type checking approaches.
There are lots of criticisms you can level at Python, but it's nowhere near as bad a representative of dynamically typed languages as Java is of statically typed ones.
> multiple dispatch, which is strictly more powerful.
In a language that supports classes I can have class B inherit from class A and automatically provide all of class A's functionality without adding a single extra line of code. I can extend class B's functionality by adding only code that is specific to it.
I don't see how to do that with multiple dispatch, at least the way it's implemented in Julia.
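For readers less familiar with the class-based style being described, a minimal Python sketch follows. The classes `A` and `B` are named after the example in the comment; the methods and fields are invented.

```python
class A:
    def __init__(self, name):
        self.name = name          # data lives on the concrete base class

    def greet(self):
        return "hello from " + self.name

    def kind(self):
        return "A"

class B(A):
    # B writes only what is specific to it; __init__ and greet() are inherited as-is.
    def kind(self):
        return "B"

b = B("b1")
print(b.greet(), b.kind())
```

Both the behaviour (`greet`) and the data layout (`name`) come from `A` for free, which is the part the comment says has no direct equivalent in Julia.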
>In a language that supports classes I can have class B inherit from class A and automatically provide all of class A's functionality without adding a single extra line of code.
And people will abuse this to create horrible brittle inheritance hierarchies. There's a reason modern languages like Go and Rust deliberately don't support implementation inheritance; to many people it's an antifeature.
The thing about software is that the "one true way" changes every 2 or 3 years.
A few decades ago inheritance was considered a key software design principle, then it was abused by many (especially Java programmers, I think), just like any other powerful feature, now it's considered evil. If multiple dispatch becomes as popular as classes/inheritance was, I suspect it will go through the same love/hate cycle.
All I know is that I have used implementation inheritance to good effect on multiple occasions and found it to be a very valuable feature.
Inheritance is not really all that powerful, assuming that one follows SOLID and especially Liskov substitution. There's very few places where it's actually applicable. Interfaces are much more widely applicable -- defining contracts instead of behaviour makes it much easier to follow SOLID.
I mostly only use inheritance for patching the behaviour of some code I don't own by overriding some specific method. Because that's the only mechanism the language provides to perform such a modification. This is basically the use-case that Julia's multiple dispatch addresses.
It's important to always remember the context that SOLID is a collection of one man's opinions.
When we mix classes to create a new class, the ingredients in the mixture retain their adherence to their respective contracts, and we didn't have to re-implement them.
> I mostly only use inheritance for patching the behaviour of some code.
That's good for you, but you have to realize that classes specifically designed as inheritance bases are also a thing and the experience of using those kinds of classes with inheritance is not the same as deriving from any random class whose behavior we don't like in some aspect.
The conventions by which a class supports inheritance are also a form of contract! The contract says that if you derive from me, you can expect certain behaviors, and also have certain responsibilities regarding what to implement and how, and what not to depend on or not to do and such.
If a class is not designed for inheritance, then there is no contract, beyond some superficial expectations that come from the OOP system, some of which can fall victim to hostilities in the way the class is implemented, or maintained in the future.
It's interesting to think about what multiple dispatch abuse would be.
As far as I know, implementation inheritance can be misapplied when you try too hard to use the language's type system to capture some domain-specific model. e.g. you're dealing with multiple kinds of entities, and they all have a `name`, so you decide that all of the classes should inherit from `class Named`, because they have that in common with each other.
Well, there are probably lots of different taxonomies possible for your entities, and single implementation inheritance lets you express just one of them. Interfaces and multiple inheritance aim to allow multiple co-existing taxonomies, but I can't really explain the ways in which they're misapplied.
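The `class Named` situation might look like this in Python. This is a sketch only; `Customer`, `Server`, `Ticket` and `label` are hypothetical names added for the illustration.

```python
class Named:
    """Base class that exists only because several entities happen to have a name."""
    def __init__(self, name):
        self.name = name

class Customer(Named):
    pass

class Server(Named):
    pass

class Ticket:
    # Not part of the Named hierarchy at all.
    def __init__(self, name):
        self.name = name

def label(entity):
    # Needs only a .name attribute; the shared base class is not required.
    return "<" + entity.name + ">"

print(label(Customer("acme")), label(Ticket("T-1")))
```

The hierarchy spends the single inheritance slot on "has a name", while the code that actually uses names works just as well on a type outside the hierarchy.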
I agree that there are some clear situations for implementation inheritance, so it seems like a useful pattern to be able to support. I think the discussion is whether it should be a key design principle, and built into the language design, or if the language design should be more general, so that implementation inheritance can be built on top of that.
In TXR Lisp, I doubled down on OOP and extended the object system to support multiple inheritance, just this December.
Real inheritance of code and data allows for mixin programming, which is very useful.
If you don't give OOP programmers the tools for mixin programming, they will find some other way to do it, such as some unattractive design pattern that fills their code with boilerplate, secondary objects, and unnecessary additional run-time switching and whatnot, all of which can be as brittle as any inheritance hierarchy, if not more.
Single-dispatch OOP dispatches (implicitly) only on the first argument, self. That is how Class B can provide the functionality of A. Multiple dispatch can dispatch on all arguments. Thus, OOP single dispatch is a special case of multiple dispatch.
In your example, functionality in class B can be written simply as ordinary generic functions, which are inferred to their most general compatible type without annotations. Because of this, all functionality for class A will work for class B, as long as B is a superset of A. Not a single line of code, and no brittle inheritance hierarchies.
Multiple dispatch is one solution to the expression problem, which both OOP and functional programming suffer from in dual ways. Other ones include Haskell's typeclasses and Ruby's mixins.
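The claim that single dispatch is a special case of multiple dispatch can be sketched with a toy dispatcher in Python. This is an illustration under simplifying assumptions, not how Julia selects methods: ties are broken left to right here, whereas Julia has its own specificity and ambiguity rules. `Dispatcher`, `meet`, `speak` and the animal classes are all made up.

```python
import itertools

class Dispatcher:
    """Toy multiple dispatch: picks an implementation from the run-time types of all arguments."""

    def __init__(self, name):
        self.name = name
        self.methods = {}  # maps a tuple of types to an implementation

    def register(self, *types):
        def decorator(fn):
            self.methods[types] = fn
            return fn
        return decorator

    def __call__(self, *args):
        # Try each argument's class and then its ancestors, most specific first.
        for combo in itertools.product(*(type(a).__mro__ for a in args)):
            if combo in self.methods:
                return self.methods[combo](*args)
        raise TypeError("no method of " + self.name + " matches these argument types")

class Animal: pass
class Dog(Animal): pass
class Cat(Animal): pass

meet = Dispatcher("meet")

@meet.register(Animal, Animal)
def meet_any(a, b):
    return "sniff"

@meet.register(Dog, Cat)
def meet_dog_cat(a, b):
    return "chase"

# With a single argument this degenerates into ordinary single dispatch.
speak = Dispatcher("speak")

@speak.register(Animal)
def speak_any(a):
    return "..."

@speak.register(Dog)
def speak_dog(a):
    return "woof"

print(meet(Dog(), Cat()), meet(Cat(), Dog()), speak(Dog()), speak(Cat()))
```

`speak` behaves like a method looked up on its one argument, and `meet` applies the same lookup to every position, which is the sense in which the first is a special case of the second.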
Yes, but in Julia the problem is that inheritance isn't allowed from concrete types. So this example in Julia would mean that A is an abstract type whose objects can contain no data. So you have to figure out a way to implement functionality on A without having any data to work with. There are several ways to work around this but they all involve considerably more code complexity than would be required in a language like C++ that supports classes.
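One workaround of the kind alluded to here is to write the shared behaviour against accessor functions rather than fields. The sketch below models that idea in Python with an abstract base class; it is an analogy, not Julia code, and `Shape`, `Rect`, `width`, `height` and `area` are invented names.

```python
from abc import ABC, abstractmethod

class Shape(ABC):
    # The abstract type holds no data; shared behaviour uses accessors only.
    @abstractmethod
    def width(self):
        ...

    @abstractmethod
    def height(self):
        ...

    def area(self):
        return self.width() * self.height()

class Rect(Shape):
    # The concrete type owns the data and must supply every accessor.
    def __init__(self, w, h):
        self.w = w
        self.h = h

    def width(self):
        return self.w

    def height(self):
        return self.h

print(Rect(2, 3).area())
```

The extra accessor definitions on every concrete type are the "more code" being complained about, compared with simply inheriting the fields.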
It’s maybe less of a problem than you think. If you simply want to call a method on two different types you can often just do it. Type annotations generally don’t help with performance, and any type checking is only going to happen at run time anyway.
Inheritance does help with documenting what functions are compatible with what types. I think that could be better in Julia, and things like abstract type hierarchies and traits help a bit. Concrete inheritance could be nice, but that seems to also enable some pretty bad OOP practices.
Perhaps. The feeling I get with Julia is that the devs are sort of making it up as they go along. I don't mean just making up the language (which of course they're doing), but making up whole new approaches to programming that aren't necessarily well-understood and tested in the real world and certainly aren't very familiar to most programmers.
Maybe in the end they will succeed and will invent a generally superior approach to software development. But for the moment it all feels very experimental and so not a language I'd want to commit to at present if I were starting a major, non-solo project.
Yes, but delegation requires a lot of extra code to get what you have for free in a class based language. And yes I know you can probably write macros to handle much of that but having to use meta-programming to get what other languages give you for free doesn't seem ideal to me.
If using meta-programming to get "basic" features rubs you the wrong way, the language might just not be for you. In general, the Julia philosophy (as well as the lisp philosophy from which Julia descends) is that things which can be done efficiently using macros instead of being built-in should be done with macros. Language features are only for things that cannot be accomplished via metaprogramming. As far as actual implementations of method-forwarding macros, there's an implementation in Lazy.jl[1], and TypedDelegation.jl[2].
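The idea behind generating forwarding methods, rather than writing each one by hand, can be sketched in Python as follows. This is an analogy only and not the API of Lazy.jl or TypedDelegation.jl; `Table`, `LabeledTable` and `forward` are hypothetical names.

```python
class Table:
    """Stand-in for a feature-rich type owned by someone else."""
    def __init__(self, rows):
        self.rows = list(rows)

    def nrows(self):
        return len(self.rows)

    def first(self):
        return self.rows[0]

class LabeledTable:
    """Wraps a Table and adds one extra field."""
    def __init__(self, table, label):
        self.table = table
        self.label = label

def forward(wrapper_cls, field, names):
    """Attach methods to wrapper_cls that forward to the object stored in `field`."""
    def make(name):
        def method(self, *args, **kwargs):
            return getattr(getattr(self, field), name)(*args, **kwargs)
        method.__name__ = name
        return method
    for name in names:
        setattr(wrapper_cls, name, make(name))

# One line of forwarding instead of one hand-written method per name.
forward(LabeledTable, "table", ["nrows", "first"])

lt = LabeledTable(Table([10, 20, 30]), "demo")
print(lt.label, lt.nrows(), lt.first())
```

The list of forwarded names still has to be maintained by hand, which is where this approach stays more brittle than inheritance or a shared interface.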
To be honest, I've never found the need for delegating a huge number of methods in my own work, but then I've never wanted to add a feature or field to something as complex and featureful as e.g. a DataFrame. What issues have you run into when using method-forwarding/delegation macros?
The nice thing about the Julia ecosystem is that people tend to be pretty willing to step back and define their methods in terms of an interface (Tables.jl in this case) which then allows code reuse without brittle delegation. Having to put so much effort into changing the structure of upstream doesn't seem ideal from a composability perspective though.
I don't understand type systems very well. When you say "check correctness" do you mean something beyond static linting with type-hints, like in Python? Or do you mean something deeper, like in functional languages like Elm and F#?
Also, is there always a trade-off between types and flexible meta-programming? Like, OCaml has meta-programming capabilities, but they make type-checking way harder, according to my PhD friend who's written extensively in Scheme and OCaml.
Yep this has been my conclusion as well, I really wanted to like Julia and there are parts of it I do, but I think it misses the mark in some big ways.
What does this have to do with garbage collection?
If your concern is about latency in real-time systems, garbage collection isn't the problem; heap-allocating memory is. Julia makes it easy to never allocate anything on the heap (unlike most garbage-collected languages).
I think the above poster is referring to languages like Go/Java/C#/Nim/whatever. Like Julia, these languages have a significant runtime. Unlike Julia, their apps can be compiled AOT into standalone executables.
I equate garbage collection with larger static binaries. Julia's static binaries are especially large since they package the entire sysimage. I'm excited about StaticCompiler.jl because, by cutting out the sysimage, it promises to make Julia more competitive with other garbage-collected languages like Nim, Go, Crystal, etc., all of which produce relatively small static binaries.
However, like these other garbage-collected languages, Julia is unlikely to ever compete with the likes of Zig, Rust, C++, etc. which produce even smaller static binaries since they don't have to ship a GC runtime. That's why I specified garbage-collected languages as, in my opinion, the addition of proper static compilation will allow Julia to completely supersede other garbage-collected general programming languages but not manually managed systems programming languages.
I think with StaticCompiler.jl, if you create a binary where the compiled code knows there won't be any GC, for instance, say you only ever work with Tuples and floating point numbers, I don't think there will be any GC runtime in the compiled binary.
Julia can stack allocate all sorts of values. I think it currently stack allocates all immutable values and also every mutable value that it can prove does not escape its function.
Kind of beside the point, though, because the GC is probably not heavy anyway. A lot of the fear of GC is not based on benchmarks.
Julia doesn't make it easy to guarantee that you won't allocate anything on the heap. The semantics of the language don't really include any control over that, yet. I hope one day they do. It seems like a tough problem, because the allocation strategy can depend on the results of inference and compilation, which is another thing that is hard to control since the language tries hard to have its semantics not depend on inference.
Of course, you can measure the allocations, and you can see as a matter of fact whether some code allocates, but I would say that it's hard to predict what allocates and what doesn't, and that this is still changing (e.g. https://github.com/JuliaLang/julia/pull/32448)
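To make "measure rather than predict" concrete, here is a minimal sketch using `@allocated` from Base. The helper names (`sum_tuple`, `make_vector`, `measure_tuple`, `measure_vector`) are made up for this example, and the byte counts are deliberately not shown because they can differ between Julia versions, which is the point being made above.

```julia
# Hypothetical helpers, for illustration only.
sum_tuple(t) = t[1] + t[2] + t[3]   # operates on an immutable tuple
make_vector(n) = zeros(n)           # builds an Array

# Wrap each measurement in a function and call it once first,
# so that compilation is not counted in the result.
measure_tuple() = @allocated sum_tuple((1.0, 2.0, 3.0))
measure_vector() = @allocated make_vector(100)

measure_tuple(); measure_vector()   # warm-up calls

tuple_bytes = measure_tuple()
vector_bytes = measure_vector()
println("tuple: ", tuple_bytes, " bytes, vector: ", vector_bytes, " bytes")
```

This only tells you what a given build did for a given call; it is not a guarantee from the language.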
Look under Static Julia Compiler here [1]. I've successfully done it in Linux, don't know about Windows. It doesn't work when you have binary dependencies like GTK, etc.
Julian interfaces are less formal than swift protocols. We use sub-typing and/or traits together with multiple dispatch to define generic pluggable interfaces.
There’s a lot of discussion around making a more formal protocol-like system but so far what we have works surprisingly well, so we’re not in a huge hurry to implement something and want to slowly explore the design space.
To some degree it can, but there are a number of problems with using Swift for this:
(1) Swift does not support multiple dispatch. That limits the way you can glue together unrelated libraries.
(2) Swift does not use abstract type hierarchies very much. E.g. in Julia one frequently defines functions that take arguments of type Number, AbstractArray, AbstractString etc. This means it is easy for somebody in the future to define an array that works on a GPU or which is statically allocated, and all existing functions operating on arrays work just fine.
I can invent a whole new number type and make it a subtype of Number and all my existing algorithms operating on numbers work just fine. One example of this would be Dual Numbers, which allow automatic differentiation. This was fairly easy to accomplish in Julia, but has been a major undertaking on Swift, which I don't think is done yet. I think they actually have to change the whole compiler. For Julia this is just a library thing.
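A toy version of the dual number idea, as a sketch only: `Dual` and `poly` below are names invented for this example, not the API of ForwardDiff.jl or any other package, and only `+` and `*` are implemented.

```julia
# Minimal forward-mode dual number: a value plus a derivative part.
# Toy type for illustration; real AD packages do much more.
struct Dual{T<:Real} <: Number
    val::T   # function value
    der::T   # derivative with respect to the input
end

# Only the arithmetic needed by the algorithm below.
Base.:+(a::Dual, b::Dual) = Dual(a.val + b.val, a.der + b.der)
Base.:*(a::Dual, b::Dual) = Dual(a.val * b.val, a.val * b.der + a.der * b.val)

# Hook into Base's promotion machinery so plain reals mix with Dual.
Base.convert(::Type{Dual{T}}, x::Real) where {T<:Real} = Dual(convert(T, x), zero(T))
Base.promote_rule(::Type{Dual{T}}, ::Type{S}) where {T<:Real, S<:Real} = Dual{promote_type(T, S)}

# A generic algorithm written with no knowledge of Dual.
poly(x::Number) = 3 * x * x + 2 * x + 1

# Seed der = 1 to differentiate with respect to x at x = 2.
result = poly(Dual(2.0, 1.0))
println(result.val, " ", result.der)
```

Worked by hand, poly(2) = 17 and its derivative 6x + 2 at x = 2 is 14, so those are the two numbers to expect. The point is that `poly` was never modified; the mixed `Int`/`Dual` arithmetic goes through the generic `Number` fallbacks in Base.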
(3) Swift function and method syntax is a pain to work with. For purely object oriented code it is very nice to read with parameter names. But once you get into functional programming and composition I find that it just creates a mess. I have to fiddle way too much with my Swift code to get function composition working as I desire. With Julia it is straightforward.
I would say composition is easier when everything is just based on the same function syntax.
That makes sense. Swift is also way too complex and syntactically noisy imo. I like that Julia has a smaller set of very powerful abstractions.
Though isn't point (2) just a convention thing? Protocols can refine other protocols. So in S4TF there's a layer protocol and an RNN protocol which extends that, IIRC.
I don't think so, I may be wrong, but I am quite sure you would get a significant performance penalty in Swift if you used protocols all over the place.
For instance, `func foo(bar: Number)` in Swift would, I believe, give bad performance, as the number object would have to be boxed.
Julia can work with abstract types in a lot of instances without any performance penalty, due to how the Julia type system works together with Just in Time compilation. I don't quite see how a statically typed AOT compiled language could achieve the same.
Must confess I am a bit too tired to parse that text effectively at the moment, but I don't think that is a solution. It is basically a solution to deal with generics across libraries. In C++ this is a big problem, right? Templates cannot really be put in libraries. You put them in header files.
So if you put generics in a library and link it, how are you going to know what to specialize and what not to specialize? That seems to be the problem they are solving here.
But this is still a compile time issue they are solving. What I am talking about is an issue that happens at runtime.
If I call a function f(x, y, z) and don't know the exact types of x, y, z at compile time, then Swift has no way of generating an efficient implementation of f. Julia OTOH due to its support for multiple dispatch CAN create an efficient implementation of f(x, y, z) for all possible types of x, y and z.
Actually on further reflection I cannot see any way an AOT compiler can solve this problem. Say you got this definition:
f(x: Number, y: Number, z: Number)
A Swift library could in theory compile all sorts of concrete variations of this function for concrete number types. However there is no way it can provide all number types. The user could provide new subtypes of Number not known when the library containing f was created.
I was a big Swift fan before and did not like JITs, but Julia really convinced me how absolutely amazing Just in Time compilation is, especially combined with a dynamic language. You can just do so much crazy stuff that you have no way of achieving in a sane way in a statically typed, ahead-of-time compiled language.
You're right that in general an AOT compiler cannot solve this problem. Swift does specialize generic functions within a module. Across module boundaries, you can declare a function with the `@inlinable` attribute, which makes its body available as part of the module's binary interface. Of course this hinders future evolution of the function -- you can swap in a more efficient implementation of an algorithm for instance, but you have to contend with existing binaries that compiled an older version of the function.
The standard library makes extensive use of `@inlinable` for algorithms on generic collection protocols and so on.
Basically, because Swift has to support separate compilation of shared libraries.
The @inlinable attribute is in fact implemented by serializing the high-level IR of a function as part of the module's interface; but doing this for all functions would be a non-starter, because it would place unacceptable restrictions on binary framework evolution.
You can think of @inlinable as being somewhat similar to defining a function in a C header file. Unlike C++ templates, Swift generics don't require all definitions to be available at compile time, because dictionary passing is used to compile generic code when monomorphization cannot be performed.
I am a big fan of Julia, but Swift is perhaps the only statically typed object-oriented language (apart from Objective-C) which I have found offers some similarity in flexibility to Swift's way of dealing with types.
With Swift's ability to add extensions that make a class conform to a particular protocol, you gain some of the same flexibility in Swift as in Julia.
It means you can take an existing class which was not designed in particular for some kind of abstraction and add conformance to an abstraction (protocol) you later added.
That is kind of what Julia gives you, with the ability to easily add functions dispatching on an existing type.
Say you got a `Polygon` and `Circle` type in Swift and Julia which you want to add serialization to without either one having been designed for it originally. In Swift I would define a `Serializable` protocol with a `serialize` method taking an `IO` object to serialize to. Then I would extend `Polygon` and `Circle` to implement this protocol.
The challenge in Julia is that I might want to define that only objects of type `Shape` can be serialized, but if `Polygon` and `Circle` were not already defined as subtypes of `Shape` I cannot do anything about that without changing source code. Swift has an advantage in this case.
My only alternative in Julia would be to create a Union type of all the types I want to be serializable.
> The challenge in Julia is that I might want to define that only objects of type `Shape` can be serialized, but if `Polygon` and `Circle` were not already defined as subtypes of `Shape` I cannot do anything about that without changing source code. Swift has an advantage in this case.
You can do this with traits. One pattern is that you can define
struct Shape{T} end
struct Not{T} end
has_shape_trait(::T) where {T} = Not{Shape}()
has_shape_trait(::Circle) = Shape{Circle}()
has_shape_trait(::Polygon) = Shape{Polygon}()
and then you can write
serialize(io::IO, x) = serialize(io, has_shape_trait(x), x)
serialize(io::IO, ::Shape, x) = # shape serialization here
serialize(io::IO, ::Not{Shape}, x) = # Fallback code, or an error here (or just leave it undefined)
This requires a bit more boiler-plate than regular abstract types but it's a pretty powerful technique (and has no runtime overhead). I do dream of having built in traits someday though to remove some of the boiler plate.
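For anyone who wants to paste something into a REPL, here is a self-contained sketch of the same trait pattern. The struct fields, the names `IsShape`, `NotShape` and `shape_trait`, and the "shape:" output format are all assumptions made for this example; `Circle` and `Polygon` are deliberately not subtypes of any common `Shape` type.

```julia
# Two types that were NOT declared as subtypes of a common Shape type.
struct Circle
    radius::Float64
end
struct Polygon
    nsides::Int
end

# Trait types: empty structs used only for dispatch.
struct IsShape end
struct NotShape end

# Default is "not a shape"; types are opted in one method at a time.
shape_trait(::Any) = NotShape()
shape_trait(::Circle) = IsShape()
shape_trait(::Polygon) = IsShape()

# Entry point: look up the trait, then dispatch on it.
serialize(io::IO, x) = serialize(io, shape_trait(x), x)
serialize(io::IO, ::IsShape, x) = print(io, "shape:", nameof(typeof(x)))
serialize(io::IO, ::NotShape, x) =
    throw(ArgumentError("not serializable: $(typeof(x))"))

buf = IOBuffer()
serialize(buf, Circle(1.0))
println(String(take!(buf)))
```

Opting a third-party type in later is one extra `shape_trait` method, with no change to the type's own source.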
__________
By the way, is this a typo in your first paragraph?
> I am a big fan of Julia, but Swift is perhaps the only statically typed object-oriented language (apart from Objective-C) which I have found offers some similarity in flexibility to Swift's way of dealing with types.
You seem to be saying Swift is the only language which is similar in flexibility to Swift. Maybe you meant to reference another language?
> With Swift's ability to add extensions that make a class conform to a particular protocol, you gain some of the same flexibility in Swift as in Julia.
You can add methods or computed properties, but that's about it. That's only one axis of flexibility, and it's really only syntactic sugar for writing and calling your own functions. You can't add any other kinds of features, unless they chose to use protocols in their interfaces -- which they usually didn't.
For example, that page gives the example of adding precision to numbers in Julia. I'm not sure how you could do something analogous in Swift, short of writing your own numeric tower from scratch. In Swift 4 they did add a Numeric protocol, but it's not used much. It's probably hard to retcon this sort of interface onto a framework which was built around concrete structs from the start.
> You can add methods or computed properties, but that's about it. That's only one axis of flexibility, and it's really only syntactic sugar for writing and calling your own functions. You can't add any other kinds of features, unless they chose to use protocols in their interfaces -- which they usually didn't.
I don't fully agree with this. I can add an extension implementing a protocol. That allows me to use classes I did not create in some kind of new subsystem I have just made, which requires objects adhering to a particular interface.
For instance I can take a library X somebody else made in Swift and make my own serialization library Y. Then I can add a serialization protocol to all classes in X. I can then have object graphs consisting of X objects which can now be serialized by my serialization library Y. This is beyond just adding syntax sugar for function calls. You are dispatching on the object type.
> For example, that page gives the example of adding precision to numbers in Julia. I'm not sure how you could do something analogous in Swift, short of writing your own numeric tower from scratch.
Yes, this is the limitation of Swift which I have tried to articulate elsewhere in this discussion. In Julia I can make a subtype of AbstractArray or Number and this type can be used in all sorts of existing Julia libraries. That possibility does not exist in Swift and I am uncertain if it ever can be made to exist.
If a Swift function took an abstract number type as argument, then the value I believe would have to be boxed. I don't see how an AOT compiler could avoid boxing. I mean inside a library perhaps, but across library/framework boundaries I don't see how you could avoid it.
Unless Swift is fundamentally redesigned as a language, I don't think it can ever match Julia in numerical computing and composability. Although I find it far easier to work with than C++ with respect to composability. My language preference is probably Julia, Go and then Swift. Go is somewhat primitive but it is kind of fun to work with. I like that they dialed back the static typing a bit. Swift feels a bit too Nazi at times.
Thanks. I mean in julia you can use traits for that, but it's not built in (yet). Though there's no speed penalty, as you probably know.
So this is about extending types, but it sounds like Swift is strictly "better" then, since it's also statically checked? Or is there something that multiple dispatch gives that is substantively better?
I'm trying to get a feel for if the Swift for Tensorflow project will afford the same kind of composability, while keeping static type checking, modules etc (assuming they work out cross module code specialization, which I think is happening).
Julia is by far the favorite language I have written code in. It is extremely expressive, while also being easy to read. Most design decisions are spot on. For example, the language has syntactic sugar, but not too much. Everything in the base library makes sense and seems to be there for a purpose.
Other niceties are the meta-programming capabilities, which allow for things like inspecting the LLVM code and printing a variable name `x` plus its value by only typing `@show x`. Then there is the fact that short-form function definitions actually look like how you would describe a math function! (That is, `f(x) = 2x` is a valid function, as is `f(x) = 2π`.)
However, there is one thing I do not like at all. That is the loading time of packages. When starting Julia and running `@time using DataFrames` it takes about 38 seconds when recompiling some stale cache. If all caches are good, the load times for some common packages still add up to 1.1 + 4.5 + 1.1 seconds according to `@time using Test; @time using CSV; @time using Dates`. Therefore, nowadays I prefer to use R. For most of my use cases R outperforms Julia by a factor of 10.
I never was able to get julia to do what I want. If I were a data scientist who developed and maintained large libraries, then it would probably be great, but I'm not. I just want to quickly visualize and modify data, or maybe see how a model compares. Much more difficult to do simple things like that than in Octave/Matlab.
That’s interesting I would have thought it would be the opposite. Julia being good at smaller experimental work and maybe creaking when things get scaled and put into production. Coming from a matlab background I found Julia much more natural to pick up than say python.
Interesting post and excellent discussion. I have the following question to all Julia and/or Python experts here. What strategy for developing a cloud-native {science and engineering}-focused platform would be better, in your opinion, and why: A) develop an MVP and then relevant production platform in Python, spending some saved time and efforts (due to simplicity, [as of now] better tooling and much wider pool of experts) on development of more and/or better features; B) develop an MVP in Python, then rewrite it for production in Julia for much better native performance, GPU integration and potential use of macros for an embedded DSL; C) take more time initially to master Julia and develop an MVP and then the corresponding production platform in Julia from scratch?
EDIT: Forgot to mention that HPC workloads would represent a significant, albeit non-unique, aspect of the platform in question.
I’m a bit biased as someone who switched from Python to Julia for my physics research (I.e. I’m biased to believe I made a good choice and others should follow my decision making), but I think any extra effort you spend in the beginning to get it working in Julia will pay big dividends.
In the scientific world, Julia’s package ecosystem is already more developed than Python in some fields (Differential equations being one, but there are others), so you may not find that limiting you.
Furthermore, for the reasons laid out in the article, Julia is highly composable and empirically, has a far greater ratio of code re-use than Python. I believe you’ll see greater bang for your buck in Julia because code you write is more likely to be re-used, especially in scientific domains.
Your blazingly fast and thoughtful comment is much appreciated. Let's see what others have to say ... :-)
One clarification that I would like to make (which IMO diminishes your second point's potential value) is that the B2B SaaS platform that I plan to build would stay away from implementing a myriad of domain-specific scientific methods and algorithms (though I plan to provide a valuable core) and rather would enable users to integrate their own implementations (through a plug-in architecture). Thus, Julia's package ecosystem seems to be a factor of somewhat lesser importance.
I would avoid the two language problem so option B is out of the way. If you choose option A, then you run into a risk of having to optimize some part of the code in C/C++. So you end up with the two language problem again. If you care about performance, then option C is not a bad place to be. Learning Julia is not a big deal. It would not take a seasoned developer too much time to get up to speed with it. What I found is that the Julia community is so vibrant that you can get a lot of help to move along quickly. My two cents.
Thank you for sharing your helpful insights. I will continue learning Julia as well as exploring and comparing options (in addition to Python and Julia, I'm also considering Node.js and C#, though to a lesser degree). I have certainly noticed the vibrancy of the Julia community; however, to me, as a startup founder, the problem of talent availability still represents a significant issue. Community help can get us only so far. Most Julia developers are academia/science-affiliated, with a smaller number of people employed in the industry, mostly in sectors like finance and energy, so a very limited talent pool makes it quite challenging to build a very good engineering team, at least in the near term.
As others have mentioned, Python does have JIT compilers. The problem is that having a JIT doesn't solve the problem.
PyPy is often a factor of 10 behind julia performance and projects like Numba, PyTorch (the PyTorch people had to build their own Python JIT compiler yikes!), etc. will always have a more restricted scope than a project like Julia because Python's very semantics make many optimizations impossible.
Here's a great talk from the author of the famous Flask library in Python: https://www.youtube.com/watch?v=qCGofLIzX6g where he discusses the fundamental problems with Python's semantics.
If you fix these problems, you will end up changing Python so fundamentally that you'll really have a new language. Generic CPython 3._ code will certainly not be compatible with it.
So in this case, should one migrate to Julia once the Julia ecosystem grows, or wait for optimizations to be done, e.g. for pandas, numpy, etc. to handle multi-core processors?
There are quite fundamental optimizations that will be missing if separately compiled pieces cannot be optimized together. These barriers disallow many things. Additionally, anything that relies on higher-order functions provided by the user, such as optimization or differential equations, will run into issues because those functions cannot be optimized by some super good package developer, but instead have to come from the user.
This blog post highlights the massive performance advantages Julia can have when you start integrating optimized packages and allow for optimizing across these barriers in Julia, and also the pros and cons of allowing these kinds of cross-function optimizations.
Pypy exists, but as I recall, cpython never made use of a JIT simply because they wanted to keep the compiler intentionally simpler.
However as a result, the language isn't well designed for a JIT, and pypy has run into several headaches/blockers. Not sure if there's anything fundamentally blocking usage, or simply lack of manpower
Except they have no story for building composable libraries in a distributed setting. The story for distributed execution in Julia today is just use MPI, which is a terrible answer. Anyone who has ever used libraries backed by MPI in any language knows that they are inherently not composable. You can't just take an object partitioned across multiple nodes one way by one library and pass it into a second library that expects it to be partitioned a different way. As far as I can tell the Julia language has nothing to say about that, and that makes it a non-starter today for anyone trying to build composable libraries for distributed memory machines.
Depends on what you work on. When doing more computer science like stuff, such as computing memory offsets etc, then 0-based indexing is practical. But for numerical work 1-based indexing is usually easier to work with. Mathematical texts are already using 1-based indexing and hence that is what people are used to when thinking about math.
I work with both and I never found this a big problem. This is on par with complaining about whitespace in Python. I prefer languages to not be whitespace sensitive but it is not a big problem.
Although I'm pretty sure you will accidentally get more problems from Python's whitespace usage than from Julia's 1-based indexing.
And frankly since Julia can use any indexing, you can use A[begin:end] to refer to the whole range of an object. If you want the second item in any array you can just write A[begin+1].
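A small sketch of what that looks like in practice, using only Base. The function names `second_item` and `whole` are made up for this example; the code itself never mentions 1 as the first index.

```julia
# Index-agnostic access: `begin` and `end` stand for the first and last index.
second_item(A) = A[begin + 1]
whole(A) = A[begin:end]

v = [10, 20, 30, 40]
println(second_item(v))
println(whole(v) == v)

# eachindex is the loop-friendly form of the same idea.
total = sum(v[i] for i in eachindex(v))
println(total)
```

The same functions should work unchanged on an array type with a different first index, since `begin` is resolved per array rather than hard-coded.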
Whatever I'm doing this is wrong. Simple example - I have Python sequence [0...N) to process. If members are independent, I could split it with ease in Python and do it in parallel: [0...N) -> [0...M) + [M...N) for any M between 0 and N. Basically, Python sequences/ranges/etc are monoids in many cases. Simplest composition rule - monoid, with unit element and associative composition. In Julia it really looks ugly, you have to make some effort to do it right.
For me, Julia people don't get range/sequence composition right, so ...
You're inventing problems that don't exist. Every thing you said is applicable exactly with 1, 2 or whatever based indexing. It literally doesn't matter.
Julia's sequences/arrays are monoids too, I don't see what it has to do with anything.
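For comparison, a minimal sketch of the same split written against a 1-based Julia range. `N`, `M`, `left` and `right` are just illustrative names; the closed ranges 1:M and M+1:N play the role of [0...M) and [M...N).

```julia
# Split 1:N at M into two contiguous pieces.
N, M = 10, 4
r = 1:N
left, right = r[begin:M], r[M+1:end]   # 1:4 and 5:10

# Concatenating the pieces gives back the original sequence,
# and an empty range acts as the identity element.
println(vcat(left, right) == r)
println(vcat(left, 1:0) == left)

# Independent chunks can be processed separately and then combined.
println(sum(left) + sum(right) == sum(r))
```

Whether `M+1` reads as uglier than a half-open interval is a matter of taste, but the composition itself works the same way.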
Simple example - I have Python sequence [0...N) to process. If members are independent, I could split it with ease in Python and do it in parallel: [0...N) -> [0...M) + [M...N) for any M between 0 and N. Basically, Python sequences/ranges/etc are monoids in many cases. Simplest composition rule - monoid, with unit element and associative composition. In Julia it really looks ugly, you have to make some effort to do it right.
For me, Julia people don't get range/sequence composition right, so ...
Because Julia is composable, the shape of the array and the values in it are orthogonal while remaining performant (as discussed by the article). As demonstrated by OffsetArray, how one accesses an array is also orthogonal to those other factors, while remaining performant (it's a zero runtime cost abstraction).
Why do I have to make special efforts to do simple things? Simple example - I have Python sequence [0...N) to process. If members are independent, I could split it with ease in Python and do it in parallel: [0...N) -> [0...M) + [M...N) for any M between 0 and N. Basically, Python sequences/ranges/etc are monoids in many cases. Simplest composition rule - monoid, with unit element and associative composition. In Julia it really looks ugly, you have to make some effort to do it right.
For me, Julia people don't get range/sequence composition right, simple thing, basic thing, thus ...