Joker is not a dialect of Clojure hosted on the Go runtime. Joker is a dialect of Clojure using its own interpreted runtime, that happens to be implemented in Go.
That means you cannot use Joker to build statically linked Go binaries and you cannot interop with Go from your own code.
And if you're interested in the idea of having an interpreted implementation of Clojure, for scripting use cases and fast startup, I would recommend you look into Babashka instead of Joker: https://babashka.org/
Joker was the best interpreted Clojure prior to Babashka, but nowadays I would recommend Babashka over Joker in every case.
The reason Babashka is better than Joker for this is that it is implemented in Java, so it can directly reuse the existing Clojure implementation and libraries as-is as part of its interpreter. That allowed it to quickly surpass Joker in the number of Clojure features it supports and libraries it exposes, and to keep up with new Clojure releases quickly. I'd also say the main developer behind Babashka is more active.
Even though Babashka is implemented in Java, it is still a single statically linked binary runtime that starts extremely fast and has a small memory footprint.
Kudos to Joker though for pioneering the idea and leading to Babashka.
Edit: And if you were excited because you thought this would let you write compiled statically linked binaries in Clojure, well for that you want to look at GraalVM native compilation which lets you compile real Clojure programs as low memory, fast startup, statically compiled single binaries: https://github.com/lread/clj-graal-docs
A small correction: Babashka is implemented in Clojure, not Java, but, as such, it can also make use of Java libraries as part of its implementation. So Babashka can bundle both Clojure and Java libraries inside itself.
This project looks awesome, though I'm a bit sad that "performance" is a non-goal as stated in the README. Clojure being able to compile to a single shippable binary is just what I need in my life. Kudos to the author.
I explored this briefly, but it ended up taking about 25 seconds to compile hello-world in Clojure. That struck me as making for a painful development loop.
That, combined with how you'd lose access to a lot of the Java library ecosystem, makes it less appealing to me to develop on.
True, but you don't develop that way using Graal. You develop Java apps the usual way (Eclipse, Emacs/Vim/Maven, whatever) and create a fat jar (or uberjar). Once you're sure the application works, you compile it with Graal, creating a static binary.
The Clojure cycle is even faster because you keep a REPL open and evaluate code directly while the application is running. After that, you build an uberjar, then compile it with Graal.
Would there not be a risk that you may get something working nicely in openjdk and then you go to do a native compilation, and it turns out that it doesn't work due to some new method call which uses reflection or something?
Reflection is supported by GraalVM native compilation, but you need to declare what classes you need reflection support for at compilation.
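For the curious, those declarations live in a JSON configuration file passed to the native-image build. A minimal sketch (the class name here is hypothetical, and many more flags exist than shown):

```json
[
  {
    "name": "com.example.MyHandler",
    "allDeclaredConstructors": true,
    "allDeclaredMethods": true
  }
]
```

The file is typically registered via `-H:ReflectionConfigurationFiles=reflect-config.json`, or picked up automatically when placed under `META-INF/native-image/`.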
What you must understand, though, is that you're asking for something contradictory: you cannot get runtime eval and code linking and also deliver a statically linked, compiled machine-code binary.
That's why languages that compile to statically linked machine-code binaries don't offer dynamic code generation, linking, or reflection that can't be declared or performed at compile time.
This is true for GraalVM. It means when you develop Clojure for GraalVM native compilation you can't leverage such dynamic behavior as well.
Though you can embed the SCI Clojure interpreter inside your application and do dynamic code evaluation with it at runtime.
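As a sketch of what embedding SCI looks like (assuming the `sci.core` namespace from the SCI library; `eval-string` is its main entry point):

```clojure
;; SCI: the Small Clojure Interpreter, embeddable as a library
(require '[sci.core :as sci])

;; evaluate a string of Clojure code at runtime, even inside a
;; GraalVM native image where the regular `eval` is unavailable
(sci/eval-string "(reduce + [1 2 3])") ;=> 6
```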
GraalVM issues an error if it can't resolve the proper method call, or if another thread is involved in the computation, so I find it pretty safe to use.
Of course, GraalVM doesn't exempt you from doing some testing.
It’s difficult to write a fast interpreter in Go. It doesn’t optimize interpreter inner loops particularly well, you can’t do the data structure tricks you can do in C, and there is no JIT compiler available like in Java.
The best way forward might be an option to compile some Joker code by generating Go code for it.
(This is based on writing an interpreter a few years ago. Perhaps Go’s compiler has improved?)
Just speculating on what GP specifically meant, but Go lacks the ability to pack structs, and probably lacks the equivalent of tricks like computed gotos [0] to increase bytecode interpreter speeds. In general, Go seems to (intentionally) lack a lot of low-level control of code generation, preferring a "there's one way to do things, and it either works, or it's a bug we'll fix" approach.
Which is probably for the best in "most" software, but interpreters typically use weird hacks to squeeze more performance out of the rock. Wren is a good example [1] of some of these optimizations, and has splendid comments/documentation.
For one thing you have C unions. You can store different types in the same place and use information stored elsewhere to know what the bits really mean. It’s horribly unsafe compared to a Go interface or the tagged unions in other languages, but uses less space.
Also, there is the NaN boxing trick [1], which allows you to store a value that is essentially a tagged union that is either a double, a pointer, or an integer, in the same space as a double. To get these in a safe language, they essentially have to be built in already.
Go’s interface type is very high overhead compared to the union types used in other languages because it uses an entire pointer as a tag. Though, it only matters in special cases like interpreters. Most performance-intensive code isn’t nearly so dynamic.
Yes, the problem is that you still need pointers, and furthermore, pointers to different kinds of objects, of different sizes. Also, Go’s garbage collector should know about the pointers.
You could represent the heap as an array, use array offsets instead of pointers, and do your own garbage collection, but that’s pretty low level and you might as well compile to WebAssembly.
I have long had the ambition to be the Nth person to try Clojure on C++, for similar reasons. I have a pretty good idea how to do it; I already got HAMT and some macroexpanded code to work. But then you remember how even ClojureCLR struggled at times, and that you'd be light-years behind that, not to speak of CLJ/CLJS, and alone. Maybe one day? :)
I've recently started implementing a VM for Scheme [1] and was already envisioning maybe adding Clojure at some point. I'm writing it in plain C, since I don't see much reason to use C++ for the VM itself, and I like the simplicity and transparency of C. Also, I'm hoping to prove security properties at some point, and C may make that easier (support from tooling like proof assistants). I don't think C++ would help enough to avoid security pitfalls by itself, unless one accepts the result being very slow. But the VM does compile as C++, and complex data structures like HAMT may really be better written in C++, so I don't think I mind that. I think the approach to take for all speed-sensitive objects would be to integrate them directly with the GC (allocated from the GC heap, and traced via an object-specific method).
I think recreating Clojure point by point is not optimal. There is some middle ground between full-on Clojure and the straitjacket of the STL data structures.
- Clojure is kinda difficult to reason about performance-wise: the laziness, some weird corner cases (like how `(first ..)` is slower than `(nth .. 0)`)
- The STL is very strict on the "zero cost abstraction" front.
Maybe something like Immer, but with some syntactic sugar to make it more lightweight, like Clojure?
By hooking into GDB you could probably make a REPL as well. GDB has live code reload and introspection - you even get the state of a crash and sane stack traces :)
> some weird corner cases (like how `(first ..)` is slower than `(nth .. 0)`)
I would not consider this a corner case; it is stated in the (first) docstring. (first) will create a seq object [1], then fetch the first element. (nth), on the other hand, will not create a seq object.
> The STL is very strict on the "zero cost abstraction" front.
There are many edge cases lost in the myriad of classes. For example, a vector will allocate more memory than there are actual elements in it, and you never know when it will resize (that's an implementation detail). I would not call that a "zero cost abstraction".
I mean, I didn't say it's undocumented behavior :) but when the default performance of a basic function like `first` is not O(1) then that strongly suggests that performance is secondary to composability/flexibility
(I'm actually not clear why it doesn't alias to `(nth .. 0)` .. I'm sure there is a good reason)
In Clojure you don't really reason about vector's run time behavior at all like in C++. You can't even ensure cache locality b/c different items can have any size. You can try to be clever and fall back to Java arrays, but the documentation says "It is recommended that you limit use of arrays to interop with Java libraries that require them as arguments or use them as return values." and the language doesn't provide the syntactic sugar for their easy use with the other data structures. So it kinda looks like the language is actively trying to discourage you from using an efficient data structure.
My previous post was trying to say that maybe you could elevate arrays to be more first class, make `first` and company O(1), maybe lose some flexibility/composability in the process but find some performant middle ground between Clojure and C++. I have no idea if this is actually achievable.
btw, STL vector resizing is a total non-issue ... Most of the time you'd just allocate sufficient space beforehand (then it's truly zero cost). If you need to do something dynamic and care about when it resizes, then you can just resize manually as you go (that's also zero cost, because this resizing is unavoidable and part of your algorithm). The default auto-resizing behavior just handles the resizing for you in a sane, amortized-O(1) way.
Well, "incredibly" is a weasel word. 0.05 nanoseconds is fast enough for most general use cases; you can consider that incredibly slow or fast, I guess, depending on what you compare it against or what your use case is.
The reason it is slower than `nth` is that `first` is a sequence function: it first coerces its argument into a sequence, and then returns the first element of that sequence. The creation of the sequence object wrapping the vector is what slows it down compared to `nth`, but it is still O(1).
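A quick sketch of the difference (both calls are O(1) on a vector; only the intermediate allocation differs):

```clojure
(def v [10 20 30])

;; `first` calls `seq` on its argument and takes the head of the
;; resulting sequence, allocating a small seq wrapper along the way
(first v)   ;=> 10

;; `nth` does a direct indexed lookup on the vector, no wrapper
(nth v 0)   ;=> 10
```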
Yeah, I just tested it out and you're right, it doesn't depend on the size of the container. Sorry about that.
But the difference is more than 5x. That's kinda nuts.. In the context of C++ you don't just shrug your shoulders at 5x.
There is a really long discussion on Google Groups about why this was designed this way, and I'm not sure I entirely understood the rationale, so I'm wary of saying anyone is wrong. I definitely want to give them the benefit of the doubt and say it's probably like that for a good reason. But people were surprisingly dismissive of this being a problem. To me it looks kinda obviously ugly (and maybe that's a tradeoff worth making). Furthermore, you have `peek` for the tail position, but no equivalent for the first position? To me it looks strange.
I actually don't mind that Clojure doesn't emphasize performance. To me it's just part of the tradeoff. But if you point this out, passionate people come out of the woodwork and get really defensive about it :S
> I actually don't mind that Clojure doesn't emphasize performance
I think this is the statement that people get defensive about. Clojure doesn't try to be the fastest language in the world (beating Fortran is a non-goal, hopefully that's obvious), but it takes performance very seriously, often to the detriment of usability in fact. Chunked sequences are one such example: they exist purely for performance reasons, but they cause people a lot of headaches by creating quite a few gotchas around lazy sequences.
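To illustrate the chunking gotcha mentioned above: `range` produces a chunked seq, so realizing one element actually realizes a whole chunk, side effects included (a sketch; the chunk size of 32 is an implementation detail, not a guarantee):

```clojure
(def realized (atom 0))

;; ask for just the first element of a lazy, side-effecting map...
(first (map (fn [x] (swap! realized inc) x) (range 100)))

;; ...and a whole chunk of 32 elements got realized anyway
@realized ;=> 32
```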
First is another example, but I agree this one depends on your own philosophy and Clojure has an unconventional philosophy here.
The philosophy is that the choice of data-structure matters, and when performance is required, the programmer should make the right data-structure choices and use the appropriate data-structure optimized functions.
You could say it goes even further: really, the choice of constructs and the design of how to implement something matter if your use case is performance sensitive, so small performance improvements, like `first` having a special case on vectors, don't matter for performance-sensitive use cases anyway.
What that means, from my understanding, is that Clojure core devs aren't opposed to `first` and `last` having a faster path in some special cases, but they see that as an unimportant feature, way lower on the list of priorities, and they aren't convinced it's even worth the added complexity it would bring to the language implementation.
Effectively, where some languages say: this abstraction is so cool because you can swap out the collection for a different one and get a free speed boost, Clojure says: this abstraction is so cool because there is no collection you can switch to that would make the performance degrade.
So when it comes to performance trade-offs, Clojure focuses on making the default and most straightforward way to implement something "fast enough" and scalable to larger inputs. You can think of it as: for most use cases, the time an O(1) operation takes will be sufficient for the user, and your program won't choke as your input size grows.
And in the more exceptional cases, where O(1) isn't fast enough or you have ridiculously large inputs beyond even the common "at scale" use cases, Clojure wants to quickly make you realize you should not try to use the default and most straightforward way to implement this, but instead you need to immediately design for that performance.
And this isn't black or white either: there are cases where something was deemed worth a faster internal implementation or special case because it would have a substantial overall impact, for example chunked sequences and transducers, how small maps are made into array maps, transients, etc.
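The small-map optimization is easy to observe from the REPL (the 8-entry threshold is an implementation detail, not a guarantee):

```clojure
;; small literal maps are backed by a flat array...
(type {:a 1, :b 2})
;=> clojure.lang.PersistentArrayMap

;; ...but past the threshold they promote to a hash-map trie
(type (into {} (map (fn [i] [i i]) (range 20))))
;=> clojure.lang.PersistentHashMap
```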
Thanks for the perspective. You're a lot more knowledgeable about this than me and I appreciate you taking the time to explain it. I think I see your point about why people are defensive. If you're coming from Python or something, then performance does look like a priority. That said, Clojure is not only not trying to be Fortran, it's also kinda not trying to match Java in terms of performance granularity. You can in effect write Java in Clojure, but the language seems to be discouraging you. Things start to look ugly and hard to parse.
> The philosophy is that the choice of data-structure matters, and when performance is required, the programmer should make the right data-structure choices and use the appropriate data-structure optimized functions.
Yeah, that's very sensible. When you know you're not just getting some seq, and you know you're going to get a vector - then you should use vector specific functionality. But I sort of don't get the impression the core library is compartmentalized this way. Maybe I've somehow overlooked it. For instance, I don't think there are a lot of functions that take a vec as input, so in the vector case, uhh, what is the vector-specific way to access the first element quickly? (not being snarky, genuinely asking)
> Clojure wants to quickly make you realize you should not try to use the default and most straightforward way to implement this, but instead you need to immediately design for that performance.
Yeah, things like VisualVM quickly make this apparent :) I guess my mild annoyance is at how ugly the code starts to get when it's optimized.
Tangentially, you might be a good person to ask: Do you happen to know why Java arrays aren't first class with their own syntax sugar like [],{},#{} etc ? Why are the docs discouraging their use? Is it because the other datatypes can't be coerced to a array? I often want to use them, but I feel I'm working against the grain of the language and doing something wrong. But I don't really appreciate the tradeoff involved
> You can in effect write Java in Clojure, but the language seems to be discouraging you. Things start to look ugly and hard to parse
Clojure is really a functional language at heart, so when you need to go down to an imperative model for performance, I agree that's when things start to feel not quite right and you begin to fight the language a little; the lack of standard imperative loops becomes an obvious annoyance. But Clojure's stance is that it's hosted for a reason: just write those imperative pieces in Java, which is designed from the ground up for imperative programming. Clojure doesn't try to pretend it can do imperative better than Java.
I'd say on the performance scale, Clojure tries to be a fast functional dynamic language. A lot of the dynamic behavior has a runtime cost, so there's a limit here in terms of raw performance, where the best it can hope for is to be as fast as statically typed Java.
I think as a fast functional dynamic language it succeeds, and this is where most of the effort has gone in terms of optimization.
> When you know you're not just getting some seq, and you know you're going to get a vector - then you should use vector specific functionality. But I sort of don't get the impression the core library is compartmentalized this way
> For instance, I don't think there are a lot of functions that take a vec as input, so in the vector case, uhh, what is the vector-specific way to access the first element quickly? (not being snarky, genuinely asking)
Like the cheatsheet shows, there's a few different vector functions for getting elements:
([1 2 3] 0) ;> 1
(get [1 2 3] 0) ;> 1
There's not a function dedicated to getting the first element like there is for sequences though. So you'd just get the element at index zero.
For manipulating data in vectors, there's mapv and filterv and reduce, and you can leverage all of the transducer functions as well.
Basically the trick for performance with vectors is to use transient when creating them, use reduce to iterate over them (or a function based on reduce like mapv, filterv and all transducers), and use their index for getting elements from them.
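Put together, a sketch of that recipe (a transient for construction, reduce for iteration, persistent! to freeze the result):

```clojure
(defn squares [n]
  ;; build in a transient vector, then freeze it back to persistent
  (persistent!
   (reduce (fn [acc i] (conj! acc (* i i)))
           (transient [])
           (range n))))

(squares 5) ;=> [0 1 4 9 16]
```

For this particular shape, `(mapv #(* % %) (range n))` does the same thing, and mapv already uses transients internally.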
> Tangentially, you might be a good person to ask: Do you happen to know why Java arrays aren't first class with their own syntax sugar like [],{},#{} etc ?
Clojure is very opinionated towards functional programming and immutability. Arrays are an imperative construct with mutation and specific memory address lookups as their basic premise. Clojure is also opinionated towards data driven/value driven modeling. Arrays aren't very good for that, they are homogeneous and they don't have value semantics.
Basically, from the point of view of Clojure, arrays are a specialized tool that you should only need in special cases, because of some special non-functional requirement. And because they don't follow the standard Clojure data expectations (immutable and heterogeneous, with value semantics), Clojure wants array use to be very explicit and obvious, so it uses different functions for arrays, which also happen to be better suited to them.
> I often want to use them, but I feel I'm working against the grain of the language and doing something wrong. But I don't really appreciate the tradeoff involved
I mean, you shouldn't use Clojure if you don't favour the functional programming style. One of the tradeoffs is slightly worse performance, because computers are imperative machines. People who use Clojure are happy with it because they believe they get better modularity, reuse, safety and productivity from functional programming, and lose so little efficiency and performance that it's a good trade-off: users won't notice the difference, but the programmer gains a lot of benefits.
Clojure cares about making sure that it delivers a fast and efficient implementation of a dynamic functional programming language. That's why it provides a good implementation of persistent collections, which are non-trivial to write, that's why it batches lazy-sequences, that's why it uses the JVM JIT runtime which does a great job at dealing with lots of small temporary objects and optimizing away extra indirections. That's why it goes out of its way to allow type hints for faster dispatch, of providing records over maps, array-maps for small maps, transients for local mutation over immutable collections, transducers and reducing fast iteration for data manipulation, primitive handling in loop/recur, etc.
I think you'd describe Clojure's performance target as: sufficiently performant. The claim is that it can give you dynamic and immutable data semantics that is sufficiently efficient to be used for practical applications, with paths towards selective optimizations where needed, like allowing full mutability in deftypes, and such.
The one thing it almost never does, though, is give you unmanaged mutability; it's pretty resilient to that, which is why you won't find anything faster than a volatile! (except for arrays). It doesn't want you to start having shared-memory bugs in concurrent contexts. Though libraries exist for those.
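For reference, the volatile API looks like this; note that vswap! is not atomic, so a volatile is only safe when writes aren't contended across threads:

```clojure
(def counter (volatile! 0))

(vswap! counter inc)   ; fast, uncoordinated read-modify-write
(vreset! counter 42)   ; plain write

@counter ;=> 42
```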
Finally, I wouldn't say it discourages the use of arrays; when arrays are what you need, it gives you the tools to use them. But when arrays are what you want because you have performance OCD, but you don't actually need that level of performance, it would rather you used functional, immutable, heterogeneous data structures with value semantics and their functions instead, and it will nudge you towards those.
The cheatsheet is very handy! Thank you. I'd completely forgotten about it. I usually end up poking around Clojuredocs, but it's very unstructured and you're just hunting for a function that fits the bill.
And yeah, your analysis of functional programming paradigms in Clojure mirrors how I think of it as well. I see that a lot, how the language is really pushing you to program functionally and almost intentionally making imperative/mutable programming painful so that you don't do it :)
Now that you've explained it it does seems obvious that arrays don't work too well with immutable structures. You're right about the OCD. C++ kinda instills that in you a bit. It feels "wrong" to have sequences of the same datatype (like pixels in an image) use the same container as sequences of varying data types (like a hiccup vector). For instance cache misses are something one is used to worrying about in C++ while in Clojure it's not even on the radar
I really appreciate you taking the time to explain things - especially so deep down in a thread
Fluokitten and Neanderthal are great for high performance numerics. Can't really get any faster than Neanderthal (which uses MKL and the GPU for compute). And fluokitten can make working with arrays a bit nicer, in that it'll look a bit more like working with sequences, though you still need to be careful about the functions you use with it to avoid boxing.
oh woah very cool! sorry for the late response - I wanted to take the time to look at it more closely. Thanks for that info. I'd never really looked into Fluokitten. It looks awesome :D
I've played with Neanderthal but the MKL dependency is a bit annoying (doesn't work on my Pinebook Pro and I can't ship binaries with it) - so I'd overlooked Fluokitten entirely. But this is very complementary to the Clojury way of doing stuff and has no dependencies. I'm def going to try to work this into my standard toolbox :)) I've only done the normal java-interop stuff in the past but this looks like a definite step up
You see, I'd like to write more of my software in Clojure (or any functionally inclined Lisp). If there was a really good interop story with a non-gc:d native lang I could do more of it.
Clojure is a hosted language, so hosting it on C++ (or Rust etc.) would be sweet for me personally. Also, I like compiler projects.
I've never had to use it myself, but what part of the JNI/JNA is problematic?
One thing I'd sorta fantasized about is (in a very hand-wavy way) using Clojure + ClojureCL to run OpenCL kernels. In theory, if you were to bundle your JAR with POCL, you could selectively run your tight loops/kernels on the CPU or GPU. So it'd effectively be like launching little C snippets. Furthermore, from what I remember, kernels are compiled and executed at run time - so it'd really fit with the whole REPL workflow, and they would effectively just be inline strings of C code that get executed.
I think the main catch at the moment is that ClojureCL uses MKL matrices to talk to the OpenCL driver (driver? dispatcher? I might be mixing up my terms) - but that could be changed to a JVM container.
Probably my own bias, but the only time I've really been let down by Clojure and missed native code was when I needed to optimize a tight loop. Once you optimize Clojure past a certain point (to eliminate reflection, boxing issues, destructuring, etc.) it starts to get ugly.
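A small sketch of the kind of optimization that starts making code uglier: type hints to eliminate reflective interop calls:

```clojure
;; make the compiler warn about reflective call sites
(set! *warn-on-reflection* true)

;; without a hint, .length must be resolved by reflection at runtime
(defn len-slow [s] (.length s))

;; a ^String hint lets Clojure compile a direct method call
(defn len-fast [^String s] (.length s))

(len-fast "hello") ;=> 5
```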
You could check out s7 Scheme. I'm using it for music apps, and it shares a lot of Clojure's design decisions, like being fundamentally a lisp-1 but having many CL inspired features. It's very minimal, and super easy to embed in a C/C++ host. And liberally licensed.
It would seem to me the most significant challenge would be having functional data structures in C++; that would be a non-trivial effort, I would think, even if you were to use other libraries that experiment with this. Without those, the language just doesn't make any sense.
Yeah, that was the problem with most attempts I saw, unordered_map won't work. I did write a persistent hash map and vector [1], but ended up mostly using them in C++ projects. I wouldn't advise anyone else to use them at this level of testing/maturity though. Writing them was super interesting.
My Park language does this
https://github.com/toymachine/park-lang
Currently the syntax is mostly JS, but the semantics are all Clojure. I recreated the map and vector datatypes in C++ by adapting Clojure's Java code. It needs a GC though, which is another big part of the implementation of Park.
Depending who you ask, Clojure provides some benefits over other Lisps unrelated to its ability to leverage the JVM.
Some of those are:
- Default efficient immutable collections
- Good support for safe concurrency and multi-core programming
- Extended homoiconicity to include maps, vectors and sets as well as lists
- Convenient short lambda syntax
- Extensible data representation
- Distinguishes between nil and empty list
- Distinguishes between keywords and symbols
- Real boolean types
- Function arity overloading
- Well thought out abstractions and set of standard functions: collections are abstractions, lazy sequences, transducers, relational set manipulation, etc.
- Only two kinds of equality: by value or by reference, with the former being the default
- Is a lisp-1 like Scheme, but also with more batteries included like Common Lisp
- Slightly different visual appearance in its syntax, due to support for literal maps and vectors as well as lists, which can make things easier to visually parse.
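A few of the points above, concretely (a sketch, not an exhaustive tour):

```clojure
;; short lambda syntax
(map #(* % %) [1 2 3]) ;=> (1 4 9)

;; literal vector, map, and set syntax
[1 2 3]  {:a 1}  #{:x :y}

;; nil and the empty list are distinct
(nil? ()) ;=> false
(= nil ()) ;=> false
```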
Some people like Clojure's data structures, which add a couple of things to lists. Also, there's ClojureScript, which is huge. And, ideas such as software transactional memory.
You can just use GraalVM for that. Also, Joker does not let you do what you want, it is not Clojure hosted on the Go runtime, it is a Clojure interpreter runtime implemented in Go, so with Joker you always need the joker binary.
If you want to produce native shippable statically linked binaries with Clojure look at GraalVM native compilation:
The performance non-goal is there to lower expectations compared to compiled Clojure. Babashka's author (me) does try to make it as fast as possible within the constraints it's implemented in.
For me the most interesting compilation target for Clojure is Dart. Writing Flutter apps in Clojure would be the dog's spherical objects. David Nolen thinks likewise and has commented on it in a recent defn podcast. Christophe Grand and Baptiste Dupuch are the brains behind it: https://twitter.com/cgrand/status/1350063059864346624?lang=e...
Has Flutter really become that popular and awesome? I feel I'm still seeing a lot more of React Native or Electron and other JS based front-ends in the wild, for which ClojureScript is best suited. I'm not sure I'm convinced Flutter will be able to break through those even with Google behind it.
I've written a few things in it and found fewer footguns than React Native.
Granted, it comes with its own set of idiosyncrasies, breaking version changes, half-baked community libraries and developer ecosystem, but that's par for the course.
I realize the project is no longer being developed by the original group, but is there a recent fork by others or version of cyclone for 64-bit computers?
I've given up on Rust for me personally, but I think it is a great PL and effort. I am still playing with Zig, but I would love to just leverage my C background, and see if Cyclone eases me away from always coding in C.
If you had issues with Rust, Cyclone would be even more complex, given the way its type system works; in fact, Rust is much more ergonomic, from what I can gather from the surviving Cyclone papers.
I am also learning SPARK 2014, a subset of Ada, that is complex, but for me not too complicated. I find Rust complex and complicated, maybe others don't. I will probably stick with SPARK 2014 for my high integrity experiments with hardware and controls. I use C to do a lot of other prototyping and quick apps. My thought is that I will wait for Rust to mature more, but I think it is always going to be complicated and suit the applications it was created for, and I may never have a need for it, or to learn it sufficiently to be able to switch to it.
To many, interactive development and debugging is the main point of the Clojure REPL. (Also true for many dynamic languages.) I take it you don't code and debug using such techniques. Some feel they are a revelation.