> For example, a trivial hello world program in Julia runs ~27x slower than Python’s version and ~187x slower than the one in C.
I don't think it makes any sense to speak of "__x slower" for hello-world. Clearly, this is just a benchmark of startup time, so you only pay it once per program. It should be reported as "__ms slower".
Julia startup (according to this post) takes 371ms. That's 357ms slower than Python, and 369ms slower than C. Faster is always better, but this doesn't seem so bad to me.
For comparison, on my old workstation here, starting a Swift repl takes 2724ms, and starting a Clojure repl takes 4792ms.
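To make the "ms, not x" point concrete, here is a minimal sketch that turns the ratios quoted from the post back into absolute differences. The 371ms figure and the ~27x / ~187x ratios come from the post; the function and variable names are just mine for illustration.

```python
# Figures quoted from the post: Julia's hello world takes 371 ms,
# about 27x slower than Python and about 187x slower than C.
JULIA_MS = 371.0
RATIOS = {"python": 27.0, "c": 187.0}


def absolute_overhead_ms(julia_ms, ratio):
    """Extra milliseconds Julia pays versus a runtime that is `ratio` times faster."""
    return julia_ms - julia_ms / ratio


overheads = {name: absolute_overhead_ms(JULIA_MS, r) for name, r in RATIOS.items()}
for name, ms in overheads.items():
    print(f"Julia starts about {ms:.0f} ms later than {name}")

# Upper bound on back-to-back launches per second if startup were the only cost.
launches_per_second = 1000.0 / JULIA_MS
print(f"about {launches_per_second:.1f} launches per second")
```

The fixed cost is the same whether the program then runs for a millisecond or a day, which is why a ratio on hello-world says so little.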
Sure... and it means that Swift and Clojure are just as useless as Julia for numerous use cases where start-up times matter, like piping CLI commands together, in Julia's case you're looking at pathetic 3 execs per second.
Moreover, even in some server-side applications it's super neat to have the luxury to spawn a new process to service certain requests, not having to worry about memory leaks. It's a perfect "API" which allows multiple languages to interact together.
It always bothered me when people dismiss the start-up time by adding "just" in front of it. We're not talking about web frameworks, these are _general purpose_ programming languages, and horrendous start-up time automatically disqualifies them from being general purpose and places them into a niche category, in my humble opinion.
> it means that Swift and Clojure are just as useless as Julia for numerous use cases where start-up times matter
Or that different languages require different approaches, and you can't translate 1:1 between technologies. My Mac starts up 27x slower than my C=64, but that doesn't make it useless for any task that requires turning it on.
> We're not talking about web frameworks, these are _general purpose_ programming languages
Are we? I've never written a line of Julia in my life but the impression that I get is that this isn't intended to be a general-purpose language:
> Of course, one might argue that Julia is not intended to be a general-purpose programming language, but a language for numerical computing.
> As has been pointed out elsewhere “Base APIs outside of the niche Julia targets often don’t make sense” and the general-purpose APIs are somewhat limited.
The Julia webpage has 6 tabs listing features, and 5 of them are about numerics. The 6th says "General Purpose", but it's mostly about FFI.
Julia is intended to be (and already is) a general purpose programming language. What it does not try to be is a language that tries to be good for every purpose. Python for example is a general purpose programming language but you wouldn't use it to write kernel modules, and likewise you wouldn't use C to write your simple website (even though you obviously can). A language that tries to be everything to everyone will either be too bloated and complex, or lack the included batteries for pretty much anything.
Even if we don't use technicalities, Julia is a very powerful language that allows people to extend it for purposes it was not built for. If it didn't support JSON, you could write a JSON type as integrated and fast as what the stdlib offers. Its metaprogramming/multiple dispatch paradigm allows for easily creating frameworks for many purposes, and while numerical processing gets special attention, the language will still outperform most dynamic languages in terms of speed in any domain.
And I do think short scripts are within Julia's main targets; they are merely victims of the fact that the language is still too young and of the battles the devs chose to fight within that limited time. (Either AoT options or an interpreter that runs code while it's compiling could solve it really well, for example, but both would require a lot of time and work which could be used for other features.)
The poster said "(even though you obviously can)".
People will try all sorts of weird language experiments, so it's no surprise those libraries exist. But do you really expect a significant number of web developers to shift over to C, just because it's technically possible?
I did generalize above, but when I said "simple website" I really meant the usual simple website on the web, not firmware/embedded stuff. I kinda said that because I already wrote one in C including a basic CGI library, as well as a mid-sized one with Rails, and the difference clearly falls into "C doesn't try to be good at website creation, but you can obviously do it" (and on some occasions you just have to). I'd also add that Roller Coaster Tycoon was written in Assembly, but it's not a language that is particularly good for game dev in general (but still a completely general purpose programming language).
MicroPython is not really Python (which is defined by the PEPs and the CPython implementation); it's a language largely similar to Python 3, since you can't just pick any Python program or library and expect it to run on MicroPython. But that's not really a relevant discussion (and it's nice to have multiple variants of a language you like that are good for multiple purposes, which solves the bloat problem as long as you use only one of them at a time).
Regardless, I hope Julia gets to the point that you can target as many places as some of those languages even if it's not nearly the best in each domain (like WASM, embedded, OS stuff, shared libraries).
Actually, Assembly is what most 8- and 16-bit titles were written in; higher-level languages were the Unity of the 80s-90s gamedev scene. Roller Coaster Tycoon isn't alone.
Quoting the relevant part, "you wouldn't use C to write your simple website", except that is exactly what EEs do when on-device memory is measured in KB.
C++ would be safer, but the C89 culture reigns in such domains.
Not necessarily. It's entirely possible to keep a daemon running with the runtime loaded to speed up start time. I'm not arguing whether these languages are suitable for scripting; rather, startup time alone need not be a disqualifying factor. If you evaluate holistically, whatever language you arrive at will always have one thing or another that is not what you want.
A few years ago, as a hobby project, I designed a programming language and wrote an interpreter for it in Java. (I've never released it or shared it with anyone; I was never quite happy with it and eventually moved on to other things.) As you can imagine, an interpreted language with the interpreter written in Java is going to be rather slow to start.
So, I implemented exactly the solution you describe here. I made my interpreter run as a daemon and listen for requests on a Unix domain socket (with a custom binary protocol). I then wrote a client in C, which opened that Unix domain socket and sent it a script to run. The C program also redirected stdin/stdout/stderr to/from the interpreter through the Unix domain socket, if connected to a tty it could switch to/from raw mode, it notified the server if certain signals occurred, etc. Using the interpreter this way, startup time was a lot faster. This approach can be applied to any language which supports runtime evaluation of code.
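A minimal sketch of that daemon-plus-thin-client arrangement, in Python rather than Java and C. Everything here is an assumption for illustration: the 4-byte length-prefixed framing stands in for the custom binary protocol, `exec` stands in for the interpreter, and stdin forwarding, tty raw mode and signal notification are left out entirely.

```python
import contextlib
import io
import os
import socket
import struct
import tempfile
import threading


def recv_exact(conn, n):
    """Read exactly n bytes, or raise if the peer closes early."""
    buf = b""
    while len(buf) < n:
        chunk = conn.recv(n - len(buf))
        if not chunk:
            raise ConnectionError("peer closed the connection")
        buf += chunk
    return buf


def send_msg(conn, payload):
    # Hypothetical framing: 4-byte big-endian length, then the payload bytes.
    conn.sendall(struct.pack("!I", len(payload)) + payload)


def recv_msg(conn):
    (length,) = struct.unpack("!I", recv_exact(conn, 4))
    return recv_exact(conn, length)


def serve(server_sock):
    """Warm 'interpreter' loop: run each submitted script, send back its stdout."""
    while True:
        try:
            conn, _ = server_sock.accept()
        except OSError:
            return  # listening socket was closed
        with conn:
            script = recv_msg(conn).decode()
            out = io.StringIO()
            # redirect_stdout is process-wide, so this sketch serves one client at a time.
            with contextlib.redirect_stdout(out):
                exec(script, {})  # stand-in for the real interpreter
            send_msg(conn, out.getvalue().encode())


def start_daemon(path):
    """Bind the Unix domain socket and serve it from a background thread."""
    server_sock = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
    server_sock.bind(path)
    server_sock.listen()
    threading.Thread(target=serve, args=(server_sock,), daemon=True).start()
    return server_sock


def run_script(path, script):
    """Thin client: connect, send the script, return whatever it printed."""
    with socket.socket(socket.AF_UNIX, socket.SOCK_STREAM) as client:
        client.connect(path)
        send_msg(client, script.encode())
        return recv_msg(client).decode()
```

Usage would be something like `start_daemon(os.path.join(tempfile.mkdtemp(), "interp.sock"))` once, then `run_script(path, "print(6 * 7)")` for each invocation. The client only pays for a socket connect, while the expensive runtime stays loaded in the daemon. A script that raises would kill the serving thread here; a real version would catch that and report it back.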
A long-running background interpreter carries significant security risks, though. A process that lacks most Linux capabilities could use it to regain access to things it shouldn't have access to. A process could interpose itself and get access to data and code it shouldn't. If there are multiple users involved, one could bypass ACLs and similar. Essentially, you're removing a significant proportion of the security guarantees the OS provides around isolated processes.
Some of the concerns you raise can be addressed. For example, make the daemon and socket per-user and give the socket 600 permissions. Also, the SCM_CREDENTIALS message (on Linux) or LOCAL_PEERCRED/getpeereid (macOS/*BSD) can be used to validate that the caller has the expected UID.
The issue with missing-normal-capability processes is more difficult to address. One possibility, at least on Linux, would be to get the pid from the SCM_CREDENTIALS message, and then read /proc/$PID/status to check capability bits. It could default to denying access to processes with fewer capabilities than it itself has.
(SCM_SECURITY can be used to pass the SELinux security label from client to server, which could also be used as a security measure; maybe the server could refuse access to processes running with a different label, or have a whitelist of allowed labels; if SELinux is being used to sandbox a process, that would prevent it from accessing the background interpreter unless that was explicitly allowed by whitelisting.)
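A rough sketch of the UID and capability checks described above, for Linux only. Note the swap: instead of an SCM_CREDENTIALS ancillary message it uses the SO_PEERCRED socket option, which is simpler to read from Python and returns the same pid/uid/gid triple for a connected Unix socket. The function names and the "deny if fewer capabilities" policy are my own illustration, not an established API.

```python
import socket
import struct


def peer_credentials(conn):
    """Return (pid, uid, gid) of the peer on a connected Unix socket (Linux only)."""
    # struct ucred is { pid_t pid; uid_t uid; gid_t gid; }
    fmt = "iII"
    raw = conn.getsockopt(socket.SOL_SOCKET, socket.SO_PEERCRED, struct.calcsize(fmt))
    return struct.unpack(fmt, raw)


def parse_effective_caps(status_text):
    """Extract the CapEff bitmask from the text of /proc/<pid>/status."""
    for line in status_text.splitlines():
        if line.startswith("CapEff:"):
            return int(line.split()[1], 16)
    raise ValueError("no CapEff line found")


def read_effective_caps(pid):
    """Read the effective capability mask of a running process."""
    with open(f"/proc/{pid}/status") as f:
        return parse_effective_caps(f.read())


def caller_allowed(caller_caps, daemon_caps, caller_uid, daemon_uid):
    """Hypothetical policy: same UID, and the caller holds every capability the daemon holds."""
    holds_all_daemon_caps = (caller_caps & daemon_caps) == daemon_caps
    return caller_uid == daemon_uid and holds_all_daemon_caps
```

In the daemon's accept loop this would be roughly `pid, uid, _ = peer_credentials(conn)` followed by `caller_allowed(read_effective_caps(pid), read_effective_caps(os.getpid()), uid, os.getuid())`. One caveat I'd expect with this approach: there is a window between reading the pid and reading /proc in which the pid could be reused, so treat it as a sketch rather than a hardened check.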
It doesn't work very well. I got SEGFAULTs every time I tried; filed a bug report; no resolution. So, while it does exist, it's not a high priority for the Julia team, so I wouldn't rely on it.
I had the exact same experience when I tried over a year ago. The package was eventually closed IIRC, but a recent newsletter claimed to have fixed it.
How big are the resulting binaries these days? It’s been some time since I tried this, but I decided that the size of the compiled code didn’t fit my use case (a serverless function IIRC)
Julia's default model is that script functions are only JIT compiled when run (just like the JVM, etc.), but you can just as easily force the compile and stick it in an image.
Have you actually tried the instructions in there? I did, to get a better time for Julia in the n-body programming languages shootout https://benchmarksgame-team.pages.debian.net/benchmarksgame/... (I am the current leader among Julia implementations). Suffice it to say, it did not work. I mean, I'm not a brilliant programmer or anything, but I would hardly call it "just as easily".
I am a Julia noob and managed to use PackageCompiler.jl for a package I wrote and it just worked out of the box.
I think the only non-stdlib package I used was StaticArrays, but I was able to use the happy path described in the PackageCompiler docs and got it working in an hour or so max. This was probably in March or April, so relatively recent.
Totally, this is why I take the right-tool-for-the-job approach. I would not start up a Spark cluster to process 100MB of data, and I would not start writing a CLI tool in Clojure either (even though it is one of my favorite languages).
I would write a CLI in Clojure if it were a reasonable option. Something like Gambit or Chicken but for Clojure would be a dream. I guess there’s Ferret, so maybe I should give that a try.
You got me thinking about that word, CLI. I know that usually means a Unix shell, but when I just run my bespoke file conversion tool from CIDER REPL, that interface is also a command line one, what else is it?
> Sure... and it means that Swift and Clojure are just as useless as Julia for numerous use cases where start-up times matter, like piping CLI commands together, in Julia's case you're looking at pathetic 3 execs per second.
The REPL is a tad slow but compiled Swift binaries have virtually no startup time overhead.
Eh? Swift surely doesn't fall in this category, as it's meant to be used compiled, not from a REPL, and doesn't have a massive heavy runtime like Clojure. I refuse to believe that a Swift CLI compiled to native code as intended has a startup time problem.
I’ve used it a few times in the past experimentally. It definitely improved startup time. It was hard to set up though, so I generally didn't find it useful.
I’ve become disenchanted with clojure over the last couple years, so I haven’t tried anything recently.
> If you ignore startup time, Julia might have good performance for simple array/matrix operations and loops, but we already know how to make them fast in Python and other languages.
> And it’s not just scripts, Julia’s REPL which should ideally be optimized for responsiveness takes long to start and has noticeable JIT (?) lags. What’s even more worrying is that there doesn’t seem to be much progress there. The REPL was a pain to use a year ago and it still is.
> In addition to that, Julia programs have excessive memory consumption. The above hello world example in Julia uses 18x more memory than Python and 92x more memory than the C version.
> The above hello world example in Julia uses 18x more memory than Python and 92x more memory than the C version.
Again, I'm left wondering if this is proportional, or simply an X megabyte overhead of the runtime. Maybe it's even shared among all running programs, as some runtimes are. Or it looks like maybe stdio could be just particularly bad in Julia, and in practice I've never written a program for any environment where that was my performance bottleneck.
Performance is not easy to measure, and not easy to report. A single number makes a good headline but it's really not enough information.
Which, by the way, only applies to Julia programs under JIT. The AoT compiler will handle nearly 100% of what the JIT or REPL will and vice versa, with a goal of 100%.
You can't compare starting up a statically-typed, compiled Swift repl to those other dynamic, interpreted languages -- it's doing a lot of things that running a normal, compiled Swift program wouldn't do, and that dynamically-typed Clojure repl doesn't do (even though it's slower).
The Swift repl is essentially a debugger running a compile-run cycle on each entered expression. Some statically-compiled languages have slow compilers but extremely fast runtime execution (Rust, and to a lesser extent C++, come to mind).
Hardware is so complex and there is so much variation in it that this is "obviously" not true.
As a simple example, if two computers are comparable in speed but one has a bigger cpu cache or faster ram, they might run programs with low memory footprint at the same speed, but one will be much faster when running a program that uses a lot of memory.
But people who like the language will invariably use it beyond its core competency.
Hence it is important to ensure that julia is "kinda mediocre" for CLI scripting (big step up from "absolutely terrible"). Personally, my familiarity with julia and its features outweighs the fact that python/perl/bash would be the better tool for many CLI scripts.
I think it's disappointing that Julia is not meant for CLI scripting. IMO it has failed to live up to its manifesto (https://julialang.org/blog/2012/02/why-we-created-julia) of being a general-purpose language that can "have it all".
My own bioinformatics work involves parsing massive amounts of text data, and Julia is excellent for this -- it is very fast, yet high-level and easy to write. However, bash pipelines are also a huge part of bioinformatics, and Julia scripts are not very good for this due to the long startup time, which is a shame.
> I think it's disappointing that Julia is not meant for CLI scripting. IMO it has failed to live up to its manifesto (https://julialang.org/blog/2012/02/why-we-created-julia) of being a general-purpose language that can "have it all".
I think it’s still early in the language’s development to say that. There aren’t any fundamental reasons Julia couldn’t be tuned to be better at CLI scripting. Currently it’s what, 400ms to start up, which isn’t too terrible but could be made faster. I could see using the Julia debugger as an interpreter for CLI scripts. Or if you run a script a lot, it’s possible to have Julia compile an executable. Personally I use it mainly via notebooks or a REPL.
I helped work on getting that number from ~400ms to ~150ms maybe 18 months ago for the v1.0 release. FWIW, a big part of it was shedding a couple of excess C libraries with bad load times (usually now lazy loading them when required instead). The next big jump will take more internal effort, but I don’t think there’s any serious showstopper. The bigger short-term interest though has been towards reducing the latency of loading external (user) code/libraries.
For an example of what I mean by the latter, python seems to be pretty fast initially, but then seems to take a huge hit from just trying to get numpy loaded.
$ time python -c '0'
real    0m0.024s
user    0m0.016s
sys     0m0.008s
$ time python -c 'import numpy'
real    0m0.165s
user    0m1.508s
sys     0m2.312s
$ time ./julia -e 0
real    0m0.215s
user    0m0.240s
sys     0m0.144s
I believe you compared first (uncached) startup of Python against second startup of Julia:
$ time python -c '0'
python -c '0' 0,03s user 0,01s system 7% cpu 0,467 total
$ time python -c '0'
python -c '0' 0,03s user 0,00s system 98% cpu 0,030 total
$ time python -c 'import numpy'
python -c 'import numpy' 0,17s user 0,05s system 9% cpu 2,401 total
$ time python -c 'import numpy'
python -c 'import numpy' 0,11s user 0,01s system 99% cpu 0,118 total
$ time julia -e 0
julia -e 0 0,25s user 0,27s system 17% cpu 2,868 total
$ time julia -e 0
julia -e 0 0,09s user 0,05s system 92% cpu 0,155 total
The impact of actually calling something from numpy is also negligible in Python but not in Julia:
$ time python -c 'import numpy; numpy.random.rand(10,10)'
python -c 'import numpy; numpy.random.rand(10,10)' 0,10s user 0,01s system 99% cpu 0,116 total
$ time julia -e 'rand(10,10)'
julia -e 'rand(10,10)' 0,35s user 0,23s system 209% cpu 0,277 total
$ time julia -e 'rand(10,10)'
julia -e 'rand(10,10)' 0,36s user 0,22s system 209% cpu 0,278 total
We got approximately the same numbers (your clock speed is likely to be much higher). User/system time is quasi-bogus, since it’s a high core count system (although still a bit concerning). I accounted for possible cache effect by running each a number of times and reporting the last. I’m not really trying to make an absolute time comparison here, just pointing out that if 100ms is unacceptable, numpy would just miss that bar too. Once you’re past the bar of “this needs to be kept running”, I don’t think a constant factor of 100ms vs 1s makes much difference in QoL, and now we’re just comparing apples and oranges. A constant factor gain on the rest of the time can make a huge difference on the rate of results per second. But I actually hope both will improve!
Ah. I was trained by zsh's time reporting to focus on the last figure, and only after posting mine did I notice that "real" is at the top in your comment. And then I was still left scratching my head looking at "user" and "system" times an order of magnitude higher than "real".
Excellent! I was mainly going off the numbers given by another comment. Well, ~150ms seems within usable CLI times for me. One note: I’d reckon that for a fair comparison to the NumPy timings, the Julia example would need to use an array type to do something.
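Since the disagreement above was mostly about cold versus warm runs, here is a small sketch of a harness that keeps the two apart instead of reporting a single number. The function names are made up for illustration, and it only measures wall-clock ("real") time, not user/system.

```python
import subprocess
import sys
import time


def startup_times(cmd, runs=5):
    """Wall-clock seconds for each of `runs` launches of `cmd`, in order."""
    times = []
    for _ in range(runs):
        start = time.perf_counter()
        subprocess.run(cmd, check=True,
                       stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL)
        times.append(time.perf_counter() - start)
    return times


def summarize(times):
    """Report the first (possibly cold-cache) launch separately from the best warm one."""
    if len(times) < 2:
        raise ValueError("need at least two runs to separate cold from warm")
    return {"first": times[0], "best_warm": min(times[1:])}


if __name__ == "__main__":
    # sys.executable is whichever Python runs this script; other commands
    # (e.g. ["julia", "-e", "0"]) assume that binary is on PATH.
    print(summarize(startup_times([sys.executable, "-c", "0"])))
```

The "first" figure is only a true cold start if nothing had touched the binary beforehand, so it is best read as "possibly cold", not as a guaranteed uncached measurement.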
By "slow startup time" I also include the time to import libraries which is really the bigger problem. For example loading the DataFrames library takes quite a long time, and I've heard the libraries for plotting take even longer to import.
That’s totally unrelated to startup or compilation time. It’s due to a bug in the crufty old Windows CMD.exe terminal—if you use a real terminal it doesn’t happen.
I’m now actually curious - what’s the bug in cmd.exe and why doesn’t that happen with other REPLs? I’m well aware that cmd.exe has its fair share of weird bugs, but on Windows it’s sometimes the only thing available. Plus, every third party terminal basically has to interface with the Windows GUI command-line program via screen-scraping, and so they inherit many of the same bugs. (The Windows Console architecture is fascinating - see https://devblogs.microsoft.com/commandline/windows-command-l... for a rundown!)
If we knew that it would be fixed by now. Julia's REPL only supports VT100 terminals and its descendants. My guess is that CMD.exe on older Windowses fails to emulate VT100 correctly somehow.
I want to know what business anything running on a modern computer has taking that much time to start up. Aside from the fact that 5 seconds is an insane amount of time, this is before any user code has been loaded, so presumably your Clojure repl is executing the same 5 seconds of work every time it starts. Can't that be cached?
Developing tools for devs sucks, as they will use "hello world" as a benchmark. And the tech stack will be decided by the non-technical founder. Devs are more concerned about what tools others use rather than the pro et contra. /rant
I didn't take it as a benchmark, but rather as a smell: "they did not bother to optimize this?" Computers are so fast today that instant startup time is low-hanging fruit for a language.
But it's an entirely 100% meaningless point of discussion though.
My peers and I use Julia to run numerical heavy code where the large majority of the time is spent inverting a large matrix to solve the equation Ax = b for x. It's a lot more complicated than this, but essentially it can be boiled down to this main equation. When my code takes hours, pushing onto days, to run, why should I care about 300 ms of startup time at all?
I stopped reading the article because their first point (the hello world benchmark) is entirely meaningless and is just noise in the wind. Their second point is a matter of personal opinion. After these two I didn't bother reading the rest.
I think a lot of people who set out to design languages don't start out with fast compile times, fast startup, and robust hooks for tooling as their top three design goals. The problem is that if you don't start out with those as primary design goals, you're going to be totally hosed once they become a problem later.
I think the language benchmarks game or the techempower benchmarks are fairly widely known and are a step or two above comparing hello world performance.