From this post, Crystal appears to have some of the things many people have been lusting after in Rust: sophisticated metaprogramming, fewer sigils, a bigger standard library, fibers/coroutines/whatever-they're-called-now.
But it still has a GC :(. Rust has completely spoiled me with making it easy to minimize dynamic memory allocation and copies, and to know (almost always) deterministically when something will go away.
EDIT: I should also say that if you want to bash on Rust's lack of these things, 3 out of the 4 items I cited have solutions being actively worked on (either at planning, RFC, or implementation phase). I don't think Rust's sigils are going away any time soon, but I have no idea how you'd do that and preserve semantics anyway.
Having worked in environments where GC was an absolute "no go", I'm always amazed that so many people have problems with a GC. Yes, there are types of software where using a GC'd language would probably be a bad thing. If you're talking about huge projects with heavy performance constraints (OS kernels, AAA games, and browsers come to mind), I would probably try to avoid it.
But most likely - you simply do not need a language without a GC. If you look at the sheer number of applications written in interpreted languages, anything compiled straight to machine code is a win, even with a GC. The interpreter and runtime overhead is so much bigger that a GC does not really matter in them, unless you're talking about highly tuned precompiled bytecode that is JIT'ed like Java and .NET, or natively compiled languages like Crystal and Go. So yes, when compiling to native code, the GC can become the "next" bottleneck - but only after you have removed/avoided the biggest one. And that "next" bottleneck is something most applications will never encounter. I initially thought of mentioning database engines in the above list of "huge projects with heavy performance constraints", but then I realized a good number of specialized databases actually use runtimes with a GC. The Hadoop stack, especially Cassandra and Elasticsearch? Java. Prometheus and InfluxDB? Go.
Just face it: there is a need for something intermediate to fill the gap of a script-like, natively compiled, low-overhead, modern language, and a GC is part of this. The popularity of Go (and the "I want to be cool so I hate it" backlash against it) proves this, and the devops space is getting useful new toys at breakneck speed, written almost exclusively in Go.
So I really don't get the whole GC hate. If you don't want a GC, there are already many options out there, with Rust being the latest cool kid in town. But in reality there are huge opportunities and fields of application for languages like Crystal and Go. And most likely - you could use such a language; you only think you can't because you have an "oh no, a GC!" knee-jerk reaction.
> But most likely - you simply do not need a language without a GC.
Absolutely. That doesn't mean I can't want predictable performance or deterministic destruction. I also think it's a shame that we waste so much electricity and rare earth minerals on keeping ourselves from screwing up (i.e. on the overhead of managed runtimes and GCs). Before, I'd have argued that it was just necessary. Having spent a bunch of time with Rust, I don't think so any more, and I'm really excited to see non-GC languages build on Rust's ideas in the future.
> The Hadoop stack, especially Cassandra and Elasticsearch? Java. Prometheus and InfluxDB? Go.
Cassandra has a drop-in-ish C++ replacement (Scylla, IIRC?) which supposedly blows the Java implementation away in performance. A magic JIT (and HotSpot is really magic) doesn't make everything better all of a sudden.
In a somewhat recent panel (https://www.infoq.com/presentations/c-rust-go), the CEO of InfluxDB basically admitted that if Rust had been more stable when they started they would have been able to use it instead of Go and would have had to do far fewer shenanigans to avoid the GC penalty.
> Just face it: there is a need for something intermediate to fill the gap of a script-like, natively compiled, low-overhead, modern language, and a GC is part of this.
Indeed. I'm not in denial of this. I made an offhand remark about my personal preferences and what I'd like to see from future languages. I still write a ton of Python for things where speed really doesn't matter.
> "oh no, a GC!" knee-jerk reaction
I don't think having a refreshing experience without a GC counts as a "knee-jerk reaction." I've thoroughly enjoyed not having to tune that aspect of performance, and I remarked on it. I think Crystal shows great promise, and certainly has the potential to offer easier ergonomics than Rust.
> That doesn't mean I can't want predictable performance or deterministic destruction.
Exactly. To add to your point: with Rust, there is no longer a reason to settle for GC pauses. It does require more thought while writing the code, but what you gain is a far more consistent runtime. If your memory allocation is slow, you can create your own allocator or slab, use it for your hot paths, and optimize the allocation away.
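As a minimal sketch of that idea (a toy recycling pool, not a production allocator, purely illustrative):

    // Toy sketch of a recycling pool: the hot path reuses buffers that were
    // already allocated instead of hitting the global allocator every time.
    struct BufferPool {
        free: Vec<Vec<u8>>,
        buf_size: usize,
    }

    impl BufferPool {
        fn new(buf_size: usize) -> Self {
            BufferPool { free: Vec::new(), buf_size }
        }

        fn get(&mut self) -> Vec<u8> {
            // Pop a recycled buffer; only allocate when the pool is empty.
            match self.free.pop() {
                Some(buf) => buf,
                None => vec![0u8; self.buf_size],
            }
        }

        fn put(&mut self, mut buf: Vec<u8>) {
            buf.clear();
            buf.resize(self.buf_size, 0);
            self.free.push(buf);
        }
    }

    fn main() {
        let mut pool = BufferPool::new(4096);
        let buf = pool.get(); // allocates the first time
        pool.put(buf);
        let buf = pool.get(); // reuses the same buffer, no allocation
        pool.put(buf);
    }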
As a longtime Java geek who never understood the argument against GC, I've found this to be a mind-altering experience. I was a big C++ person before, but after one too many memory leaks and segfaults, I could never imagine not wanting a GC. Then Rust came along and taught me better.
1) Rust appears to be significantly less productive than a true GC'd language. I see a lot of people talking about "fighting the borrow checker" with Rust, and I see a lot of articles describing basic patterns that would be simple in any other language, but are complex in Rust.
2) If you want to invoke code that assumes a GC you need to have one.
You can do manual memory allocation in Java, by the way, and a few high-performance libraries do. It's just not common.
It's interesting to note that the Chrome guys have gone in the direction of deploying a GC into C++ whereas the Mozilla guys have gone in the direction of moving manual memory management into the type system. I've got nothing against Rust but I'm a Chrome user, personally.
My experience is that this is an initial hurdle to clear. As an example, I've almost exclusively worked in GC'd languages for a while, and after learning Rust for a few months I very very rarely have borrow check errors.
The fact that it occasionally requires a complex pattern to do right should get better with time (non-lexical lifetimes would help), and there's also discussion around GC integration so that you could interact with a scripting language GC when writing a plugin for it, or you could farm out GC'd objects when you need to have cycles (i.e. in graph algorithms).
> the Chrome guys have gone in the direction of deploying a GC into C++
Interesting. I'm curious how much of the browser relies on it. I'm also curious whether it's an attempt to paper over C++ with a little memory safety, or whether it actually offers performance improvements. My original point was not that GC is bad, per se, but that I quite like being able to avoid it when it's reliable to do so, which is not the case in C++, IMO.
Mozilla has also put significant effort into improving their C++ GC in Firefox, e.g. switching from a non-generational one to a generational GC[0] and then to a compacting one[1]. Just like Google can both improve Chrome and work on Go, Mozilla both improves Firefox and works on Rust.
I wasn't giving examples of Mozilla doing exactly what Chrome is doing, just counterexamples to your implication that working on Rust means no work on improving memory management in Firefox.
> I see a lot of people talking about "fighting the borrow checker" with Rust
It's also usually described as "at first, I fought the borrow checker, but then I internalized its rules and it's now second nature." You're not wrong that it's a hump to get over, but once you do, it's not a big deal.
> that would be simple in any other language,
Any other _GC'd_ language. You still fight the same kinds of complexity when you don't have GC.
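A tiny, contrived example of the kind of pattern behind those initial fights (hypothetical code, just for illustration):

    fn main() {
        let mut scores = vec![10, 20, 30];

        // In a GC'd language you'd happily keep a reference to an element
        // while growing the collection. Rust rejects the commented-out
        // version because `first` borrows `scores` while `push` mutates it:
        //
        //     let first = &scores[0];
        //     scores.push(40);
        //     println!("{}", first);
        //
        // The usual fix is to hold an index (or restructure the code)
        // instead of holding a live reference across the mutation.
        let first_idx = 0;
        scores.push(40);
        println!("{}", scores[first_idx]);
    }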
The C++ advantage will not be as large after Java 10 comes out and finally has the value types and reified generics the language should have had from the beginning.
Also, I have yet to see any large-scale production deployment of those Hadoop alternatives.
But it might still be like 5 years from now, so who knows how it will evolve.
Re: deployments, I expect that would take some time for a transition to occur. The first post date on the ScyllaDB blog is from February 2015 (http://www.scylladb.com/2015/02/20/seastar/), and it looks like it wasn't until September 2015 that they specifically started publishing benchmarks of the database itself as opposed to the network I/O library they built for it (http://www.scylladb.com/2015/09/22/watching_scylla_serve_1m/).
I look forward to those changes coming to Java, and I think that stack-based value types could do a lot for the language. That said, the Scylla folks seem to have gotten a lot of their performance gains from CPU/thread affinity and async I/O (http://www.scylladb.com/2016/03/18/generalist-engineer-cassa...). NIO is pretty great in Java-land, IIRC, but CPU/thread affinity is, I imagine, hard to pull off with a garbage collector.
Another thing I'm curious about w.r.t. value types in Java -- hasn't C# had those for a while? If so, and if your claim that value types will provide large performance benefits is true, why isn't C# always blowing Java away in benchmarks? Perhaps it is and I'm just not seeing them. Perhaps Java's escape analysis is already pretty good and solves the 60/70/80% case? Perhaps I'm not well versed enough in the subject to understand the interactions here.
Regarding C#, Microsoft hasn't invested much in their JIT/AOT compilers' optimization algorithms.
NGEN was just good enough for allowing quick application startup.
Also they didn't invest too much in optimizations in the old JIT.
Especially since .NET always had good interop to native code via C++/CLI, P/Invoke and RCW.
There were some improvements like multicore JIT in .NET 4.0 and PGO support in .NET 4.5, but not much in terms of optimization algorithms.
Hence .NET 4.6 got a revamped JIT called RyuJIT, with SIMD support and lots of nice optimizations.
But this is only for the desktop.
.NET for the Windows Store is AOT compiled with the same backend that Visual C++ uses. In the Windows 8 and 8.1 era they came up with MDIL from Singularity/Midori, but with 10 they improved the workflow into what is nowadays known as .NET Native.
With the ongoing refactorings they plan to make C2 (Visual C++ backend) a kind of LLVM for their languages, similar to the Phoenix MSR project they did a few years ago.
If you watch the Build 2015 and 2016 talks, most of them are making use of C# with the new WinRT (COM based) APIs, leaving C++ just for the DX related ones.
So they are quite serious about taking their learnings from Project Midori and improving overall .NET performance.
From what I've seen, stack based value types are not necessarily the big performance win they're touted to be. The rule of thumb I've noticed is that, if the struct is much bigger than the size of a pointer, you start seeing a pattern where it's quicker to allocate in the first place but slower to pass around.
I think this is because, on a platform like Java or .NET that uses generational garbage collection, the heap starts to behave like a stack in a lot of ways. Allocations are fast, since you just put objects at the top of the heap. And then, since they're at the top of the heap, they tend to stay in the cache where access is fast, so pointer chasing doesn't end up being such a big deal. On the other hand, if you use a struct, every time you pass or return it you end up creating a shallow copy of the data structure instead of just passing a single pointer.
(Disclaimer: preceding comment is very speculative.)
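(To make the copy-versus-pointer tradeoff concrete, here is a rough sketch in Rust rather than Java/C#, since the mechanics are the same for any flat value type; illustrative only.)

    #[derive(Copy, Clone)]
    struct Big {
        data: [u64; 8], // 64 bytes, much bigger than a pointer
    }

    // Every call copies all 64 bytes into the callee's frame.
    fn sum_by_value(b: Big) -> u64 {
        b.data.iter().sum()
    }

    // Only an 8-byte pointer is passed; the data itself stays put.
    fn sum_by_ref(b: &Big) -> u64 {
        b.data.iter().sum()
    }

    fn main() {
        let b = Big { data: [1; 8] };
        println!("{} {}", sum_by_value(b), sum_by_ref(&b));
    }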
I think you mean specialized generics (i.e. no autoboxing of primitives when used in generics)? Reified generics implies carrying around all generic type information at runtime, which will not be the case and also has nothing to do with performance. Non-value generics will still be erased I thought.
They will change the constant pool to have some kind of template information that gets specialized (what they call type species) into a specific set of types.
The plan is even if Java cannot fully take advantage of all possibilities due to backwards compatibility with existing libraries in binary format, the JVM will support it for other languages not tied to Java semantics and backwards compatibility.
GCs are complex and require lots of end-user tuning -- just look at the performance articles on Java.
Beyond that, however, there are many uses for ownership beyond controlling memory resources. Closing a TCP connection, releasing an OpenGL texture... there are lots of applications of having life cycles built into the code rather than the runtime.
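In Rust terms that tie-in is just ownership plus Drop; a small sketch (the address is made up, purely for illustration):

    use std::io::Write;
    use std::net::TcpStream;

    // The connection is closed deterministically when `conn` goes out of
    // scope, not whenever a collector notices the object is unreachable.
    fn send_ping(addr: &str) -> std::io::Result<()> {
        let mut conn = TcpStream::connect(addr)?;
        conn.write_all(b"PING\r\n")?;
        Ok(())
    } // <- `conn` is dropped here and the socket is closed, every time

    fn main() {
        // Hypothetical address, just for illustration.
        let _ = send_ping("127.0.0.1:6379");
    }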
And you can do it because someone found that it was necessary to do.
It's crazy I can't tell Java and NodeJS "use the memory you need". Instead I have to specify max memory sizes (and then watch as they inevitably consume all of it).
Of course they will consume all of it. That's how GCs typically work. They won't invoke a collection until there's no space left to allocate. Just because they "use all of it" doesn't mean all of that memory is actually live. It just hasn't done a collection yet.
Tracing garbage collectors have to have some metric for when to trigger a stop-the-world collection. IIRC, many GCs track heap usage and trigger stop-the-world when it gets above a threshold. GC'd language runtimes which don't have a tracing/compacting collector (e.g. CPython with refcounting) don't need to configure a heap size because the behavior doesn't vary as you use up your heap.
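A toy sketch of that trigger logic (not any real collector's code, just to illustrate why the process appears to use everything it was given):

    // Nothing is reclaimed until a usage threshold is crossed, so memory
    // usage climbs to the cap even though most of it may already be garbage.
    struct ToyHeap {
        used: usize,
        capacity: usize,
    }

    impl ToyHeap {
        fn alloc(&mut self, size: usize) {
            if self.used + size > self.capacity {
                self.collect(); // the stop-the-world pause happens here
            }
            self.used += size;
        }

        fn collect(&mut self) {
            // Trace the live objects, then reclaim everything unreachable at
            // once. For the sake of the toy, pretend nothing survived.
            self.used = 0;
        }
    }

    fn main() {
        let mut heap = ToyHeap { used: 0, capacity: 1024 };
        for _ in 0..10_000 {
            heap.alloc(64); // usage climbs to the cap, then snaps back
        }
    }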
I wonder, could we just tell each JVM instance that it may use all of the memory on the system, and then let the OS kill the first VM that allocates more than the system has to offer? Would this get us the same semantics as those of a native application? Or does the JVM preallocate all of the memory that it is allowed to use?
The JVM allocates a portion at startup, takes more whenever it needs it up to the max, but never releases memory back to the OS. So at a given moment a 6GB VM might be a 5GB process that is internally using just 3GB.
As of one of the most recent Java releases, I believe the G1 GC does this on Windows. G1 is not the default but can be selected with a single command line flag.
Most of the articles I see on C performance tuning are generalizable to any programming language, GC or no. Stuff like maintaining cache locality, avoiding false sharing for threaded code, etc.
I don't write Java, but my impression is that the articles being talked about are much more Java-specific than the C ones are C-specific.
Sometimes GCs require a lot of tuning, but I'd bet that 90% of software written in GC'd languages works just fine, without touching the GC settings.
In my limited experience writing performance-critical Python code the improvements always came from choosing better libraries (eg for faster serializations) or improving our own code. The GC never showed up in profiling as an issue for us.
This is all true for some languages, but I think it's far from universal. It seems to me that in many of the most frequently used and taught modern languages, the nearest most devs will come to being concerned with GC is an awareness of why object allocation should be minimized. For their purposes, that is a much better use of time than giving any consideration to GC tuning.
And yet it performs great, including predictable STW latencies, all with a relatively simple and straightforward algorithm. With that in mind, the Java GC's manifold tuning knobs, which yield minimal performance improvements, sound more like something purposely built so that people can make a livelihood providing consulting services, rather than something built for the best possible performance for everyone.
"GCs are complex and require lots of end-user tuning --"
And they tend to be very memory hungry. Often, the memory overhead is the difference between running a program and having a bunch of browser tabs open.
The GCs used in Go or Crystal tend not to use more than about 10% beyond the peak memory of a C implementation. The perception that GC == hundreds of MBs of memory usage comes from Java, where the GC aggressively preallocates and carries the memory baggage of a whole VM too. I rarely see Crystal or Go programs use excessive memory.
I can't talk about Crystal and Go because I lack experience with them. But the described issue is not exclusive to Java (I have seen it in Node too).
Since memory deallocation is not deterministic, there has to be a tradeoff between lazy scheduling (which increases memory consumption) and frequent scheduling (which has a performance overhead).
You can fine-tune those variables, but it means that a high-performance, low-memory-footprint system is a very challenging thing to build using tracing garbage collection (the kind in Java and Node).
Java needs a sophisticated GC particularly because its OO model requires that it allocate an extraordinary amount of small objects, especially as things are boxed and unboxed. Ruby, one of the few true "everything is an object" languages, also suffers from an explosion of tiny objects. Node.js/V8 seems to suffer from this to a lesser extent, although it also ends up being a very memory-hungry language.
Go has been able to perform well with a simple GC because it doesn't suffer from this problem.
"GCs are complex and require lots of end-user tuning -- just look at the performance articles on Java."
These articles are bullshit. Most settings are either obsolete or forcing the default. The rest is just useless.
I spent months doing performance tuning of applications stacks which were using Java (for app, database or both). Most of the settings are useless and barely change +-1% in performance.
The JVM has had good defaults for a while. The only things one MUST configure are the -Xms and -Xmx options, which set the amount of memory allocated to the Java process (both settings to the same value).
> If you're talking about huge projects with heavy performance constraints (OS kernels, AAA games, and browsers come to mind)
Actually, two of your three examples are no longer correct: game engines often use a core GC'd heap because that's how Unreal Engine has worked since v3, and Chrome has switched to using garbage collection in the core Blink renderer as well. The GC project is called Oilpan.
The benefits of GC are so huge, that they're used even for very latency and resource sensitive apps like browsers and AAA games.
In the era of cloud computing, memory usage and performance are as important as they have ever been. If you can rent a smaller instance to do the same job, that is real money saved.
> Just face it: there is a need for something intermediate to fill the gap of a script-like, natively compiled, low-overhead, modern language, and a GC is part of this. The popularity of Go (and the "I want to be cool so I hate it" backlash against it) proves this, and the devops space is getting useful new toys at breakneck speed, written almost exclusively in Go.
Could the answer be lbstanza when it gets there? Lbstanza.org
Manual or deterministic memory management might be a must-have for certain usage domains, but for any domain in which one would be using ruby, this seems unlikely, and presumably one could FFI into C when this is the case. There are hardly any languages commonly used in industry which don't have GC (essentially just C/C++). And many of these garbage-collected languages are capable of blazingly fast code with a small memory footprint.
Regardless, for a language which is meant to operate in the same domain as ruby and be as easy and declarative, not having a GC would be a puzzling decision.
As a side note, I'm curious what areas you are programming in where the presence of a GC is such a downside. Having written almost exclusively in garbage-collected languages over the last few years, it's something I almost never think about (and happy not to). Of course I don't deny that stricter memory control is sometimes necessary.
Crystal seems to be targeted at a domain where ruby is not fast enough. That includes domains where GC is a problem.
A tracing GC means that you either have to deal with potentially long GC pauses or you need a lot of extra free memory at all times to give the GC time to catch up before running out of memory [1].
Go says it can achieve 10ms max pause time using 20% of your CPU cores provided you give it 100% extra memory. In other words, memory utilisation must be kept below 50%.
Cloud/VPS prices scale roughly linearly with memory usage. So using a tracing GC doubles your hardware costs. Whether or not that is cheap depends entirely on what share of your costs is hardware cost and how much productivity gain you expect from using a tracing GC.
I would be very interested in learning how much CPU and memory overhead Swift's reference counting has, because in terms of productivity Swift is certainly competitive compared to languages using a tracing GC.
[1] Azul can do pauseless, but I don't know exactly what tradeoffs their approach makes. Their price is too high for me to even care.
Note though that a lot of the problems with GC in Crystal can be worked around by replacing classes with structs. The latter are passed by value and allocated on the stack. There is also access to pointers and manual allocation, should that be needed to optimize a hot spot (though that ends up with roughly the same lack of memory safety as C).
For the JVM, the Shenandoah GC [1] can do so as well (or at least achieve very low, consistent pauses), and is available via EA builds or the OpenJDK in Fedora 24 [2].
This is with pointer-happy Java code, not with special effort to keep the data pointer-free.
> The key to performing concurrent evacuation is having the Java Threads and the GC threads agree on the location of objects. This is accomplished in Shenandoah by the use of a Brooks forwarding pointer. All reads by the Java Threads indirect through this forwarding pointer. All writes to objects in targeted regions must first copy the object and then write to the object in its new location.
I'm a bit surprised that indirection is efficient enough to be worth the trouble (since you need reads and writes to branch for the indirected-object case), but I can't argue with results.
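For intuition, here is a conceptual sketch (not Shenandoah's actual code) of what a Brooks-style forwarding pointer amounts to; the cost on reads is one extra dependent load rather than a branch:

    // Each object carries a pointer that normally refers to itself and, once
    // the object has been evacuated, to the relocated copy.
    struct Obj {
        forward: *mut Obj, // self, or the new copy after evacuation
        value: u64,
    }

    // Read barrier: always follow the forwarding pointer before touching fields.
    unsafe fn resolve(obj: *mut Obj) -> *mut Obj {
        (*obj).forward
    }

    unsafe fn load_value(obj: *mut Obj) -> u64 {
        (*resolve(obj)).value
    }

    fn main() {
        // Allocate one object on the heap and make it forward to itself.
        let obj = Box::into_raw(Box::new(Obj { forward: std::ptr::null_mut(), value: 42 }));
        unsafe {
            (*obj).forward = obj;
            println!("{}", load_value(obj));
            drop(Box::from_raw(obj)); // clean up the toy allocation
        }
    }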
If you are on a server, do you need a 10ms max pause time? For most applications running Go on a remote machine, 25ms should be within the realm of the acceptable.
People want a non-GC language because everyone already has a GC language that they are comfortable with.
So basically a C/C++ replacement is the only niche left to fill. It would be even better if a new language could replace even GC languages, so I could write fast low-level libraries or websites in a single language, without sacrificing productivity. That would be the Holy Grail, I guess.
I agree, classes are silly in JavaScript. They just mask the prototype and create ambiguity; using the prototype effectively is part of being a good JavaScript developer.
I understand the hate and everything but honestly I think it presents a fun and refreshing way of solving problems.
Also npm is pretty awesome, aside from how massive the node_modules folder gets.
One of the largest areas of concern is for real-time systems (systems which fail if they do not respond within some small time threshold). Most GC involves stopping the world to perform the GC which can pause your program's execution for some number of milliseconds. If GC pauses exceed your real-time requirements, you're out of luck.
Some languages, like Erlang, do slightly better by garbage collecting Erlang processes individually, so other Erlang processes can continue running during GC.
For hard realtime systems it actually doesn't matter anymore. These are mostly implemented with a "don't allocate at all" strategy, since every allocation is non-deterministic. Therefore things are mostly statically allocated and bounded, and maybe some object pools are around. You can do this in Go the same way as in C. The only question is whether the GC will still run in the respective language if no allocations happen through the user (e.g. because the runtime could do allocations in the background for its housekeeping).
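A minimal sketch of that "allocate everything up front" style (purely illustrative, not from any particular RTOS or codebase):

    // A fixed-capacity pool handed out at startup; the steady state performs
    // no dynamic allocation at all, so allocation time is bounded.
    #[derive(Copy, Clone)]
    struct Message {
        len: usize,
        buf: [u8; 256],
    }

    struct Pool {
        slots: [Message; 32],
        in_use: [bool; 32],
    }

    impl Pool {
        fn new() -> Pool {
            Pool {
                slots: [Message { len: 0, buf: [0; 256] }; 32],
                in_use: [false; 32],
            }
        }

        // Hand out a slot index; the caller must handle exhaustion explicitly.
        fn acquire(&mut self) -> Option<usize> {
            let i = self.in_use.iter().position(|&used| !used)?;
            self.in_use[i] = true;
            Some(i)
        }

        fn get_mut(&mut self, i: usize) -> &mut Message {
            &mut self.slots[i]
        }

        fn release(&mut self, i: usize) {
            self.in_use[i] = false;
        }
    }

    fn main() {
        let mut pool = Pool::new();
        if let Some(i) = pool.acquire() {
            pool.get_mut(i).len = 5;
            pool.release(i);
        }
    }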
Yeah, the idea that malloc/free "don't pause" can only be based on not understanding how mallocs actually work. Advanced mallocs are often even multi-threaded and do some work on background threads.
Hard-realtime is always "allocate everything up front". It has to be. Allocation of dynamic sizes is not a problem you can make fully deterministic.
This is known, but it's always possible to overcome these situations, and it's part of the language maturing. Golang has already had its run at optimizing its GC for real-time systems. Twitch uses Golang for their IRC chat, and they've taken the Golang GC on a journey which you can read about here: https://blog.twitch.tv/gos-march-to-low-latency-gc-a6fa96f06...
Crystal will at some point also be forced to optimize its GC for these cases, although it currently uses an off-the-shelf GC, the Boehm-Demers-Weiser conservative garbage collector (http://www.hboehm.info/gc/), which they have acknowledged they need to replace sooner or later.
The parent comment is referring to hard real-time systems (where not responding within a certain timeframe would lead to catastrophe). We're talking things like pacemakers, anti-lock brakes, industrial control systems.
Regardless of how good GC is you would never use it in a hard real-time system because it is non-deterministic. IRC chat is only soft real-time.
Those kinds of systems won't get compilers for something other than C/C++ or maybe Ada for a long time. Usually you're stuck with a compiler from the chip vendor that kinda-sorta supports C.
Indeed, and you'll probably also need a special-purpose RT OS stack (or have to write your own/go without).
EDIT: I'd also add that this is such a niche[1] area of programming that expecting any mainstream language to meaningfully support it is... optimistic, and that mainstream languages shouldn't try to support it. (Soft real-time may be reasonable, but I believe that can be achieved with GC, as demonstrated by the Azul JVM.)
[1] Niche, but obviously important, but perhaps not lucrative enough for anything to displace C or perhaps Ada -- given that these industries tend to be extremely conservative. (I wonder if ATS is used, though. Can't claim it's pretty, but proof seems like it would be a good thing for these systems?)
Just as a point of interest, I believe there are special forms of garbage collector that are suitable for hard real time systems.
The principle is to regularly use a bounded amount of time for collecting - in line with the latency requirements for the whole system. I think the relevant term is 'tick tock', as in tick - compute, tock - collect.
The thing about hard real time systems is that they must be predictable, which is quite a broad term: predictable memory utilization, predictable computation cost, predictable response time. To even fit into these requirements the GC must be "passive", i.e. on-demand, callable from code. Even with bounded collection times, the number of collections must be predictable/controlled to predict the computational cost/time of code paths. And that becomes not much different from manual memory management.
All of these can happen in non-GC'd languages through heap fragmentation (i.e. even when correctly allocating and deallocating memory, you can still end up with a fragmented heap). The only way to avoid this is to avoid all dynamic allocation (which is indeed done in a lot of systems) or exclusively use memory pools instead of a traditional heap.
If there is no need for predicting memory utilization, then doesn't real time GC fit the bill?
Say you know that whatever you want to execute runs in time T. A real-time GC makes sure it always executes within time 2T, for any sequence of operations.
Any program running on any non-realtime OS can stop for any number of milliseconds. There are a billion reasons for such interruptions like other processes wanting to run, cleanup phases internal to the OS, memory paging...
If your program stops for 50 ms, do you really care if it was because of a GC cycle or something else? If you really do care, then you are not allowed to target Linux, Windows or OS X all of which are decidedly not real time operating systems.
To answer your last question, hard real-time embedded systems are everywhere in robotics, aerospace, telecommunications, automotive, medical devices...
The real-time capabilities are not always done in pure SW (there are some FPGAs), but when you do rely on SW, you often cannot afford to spend even a few milliseconds in GC. In some cases, that would mean killing or maiming someone.
And you are often tied to the HW vendor toolchain for a specific DSP, MCU, etc. that only supports C or C++. This is a domain that is moving very slowly; currently my most optimistic timetable would be vendor support for a Rust toolchain in 10 or 15 years, but I don't foresee any GC language coming to replace the critical parts written today in C or C++.
Even soft real time systems like games or real time networking solutions suffer from non-deterministic GC pauses. Audio processing is another example.
You can get by in a GC'd system if you're careful not to allocate while being in the "hot path", but it's much more difficult than manual memory management (you need to know the internals of the GC algorithm) and interference from other threads may spoil your hard work.
Minecraft is a prime example of annoying (Java) GC pauses causing annoying interruptions. Another one is Kerbal Space Program's choppy audio (from C#/Mono GC). Although these games made millions or billions of dollars regardless, so you might argue it's a non-issue.
> currently my most optimistic timetable would be vendor support for a Rust toolchain in 10 or 15 years
Not sure how much you'd need changes for the Rust compiler to be able to use it on MCUs and DSPs, but LLVM is more and more common and it might be (almost) enough to have the LLVM backend ported to the target arch. LLVM is moving fast, so for some targets it might be viable much sooner than your estimate.
It was not clear from my post, but what I would like to have is a RTOS with an implementation of the Rust std modules, to be able to develop an application on top of it. As far as I know, no one is working on that.
If you write programs with a GC, there is always a level you can't get your hands on to change. If you want to create a piece of a puzzle that is tiny, does one thing, and does it right - like a shared library, a command, a linkable object, or any tiny standalone binary - the GC is always a thorn.
Anything where memory or interactivity needs to be tightly controlled is problematic with a GC. Not only that, but a GC doesn't scale as well with lots of threads. Ultimately you need thread-local allocation, since you will eventually be bottlenecked by the fact that typical allocation (with malloc, VirtualAlloc, mmap, etc.) is protected by a mutex, and deallocation suffers the same fate.
Except that garbage collected languages generally use per-thread nurseries, so the fast path is a small number of instructions. Having a GC also makes lock-free programming easier.
Whether reference counting is GC or not is arguing about semantics.
But correctly implemented reference counting is essentially pause-free. It's consistently "slow", which is better for some cases than unpredictably "fast".
IMO it's not just semantics, rather, GC is too general of a term if it includes ARC. In terms of performance analysis the two are vastly different. One has basically an unbounded worst case but a good average case, the other is the opposite.
Reference counting also has an unbounded worst case: what if you drop the last reference to a very large graph of objects? Then you free the whole thing, which can take an arbitrarily large amount of time.
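A small Rust sketch to make that concrete; the last decrement is the one that pays for the whole graph:

    use std::rc::Rc;

    struct Graph {
        nodes: Vec<Vec<u8>>, // many heap allocations owned by one object
    }

    fn main() {
        let graph = Rc::new(Graph {
            nodes: (0..1_000_000).map(|_| vec![0u8; 64]).collect(),
        });
        let alias = Rc::clone(&graph);

        drop(graph); // cheap: refcount 2 -> 1, nothing is freed
        drop(alias); // expensive: frees a million buffers right here, synchronously
    }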
The main difference there is: you can control the timing of when to pay that penalty. With a full-blown GC, if you run into performance issues because of it, you basically have to rearchitect the whole app (with something like memory pools, which you pay for by needing more RAM than strictly necessary and with vastly more complex code). With ARC I can track down a slow memory operation to a single line of code and deal with it there (e.g. by moving the complex object into a singleton). IMO a full GC is just the wrong level of abstraction for anything that's timing relevant, which includes all UI threads.
> IMO a full GC is just the wrong level of abstraction for anything that's timing relevant, which includes all UI threads.
Hard real-time GC systems exist. In these systems, you can prove that pauses last no longer than a certain number of milliseconds. They're definitely applicable to programs with UI.
Can you prove that dropping a reference doesn't free an arbitrarily large number of objects? You can probably convince yourself in specific cases for specific programs that you don't see arbitrarily large refcount-release times, but any change you make to the code might invalidate this analysis.
I keep hearing about these, but which of the popular GC languages (read: lots of library support) have real time GC? Aren't we talking about industrial RT applications rather than GUIs?
I agree that to compete with Ruby ergonomics one probably needs a GC. I think part of what I'm getting at is that there are other ways to approach these problems, and aping Ruby isn't necessarily one I prefer. Not to knock Crystal, it seems very cool.
Re: application domains, I've recently been doing some work in CPU/memory constrained applications (not embedded, running big >500GB jobs on HPC clusters), and a GC is unfortunately a non-starter for this kind of data processing.
I have also been watching with great anticipation the work being done on "big data" processing with Rust (https://github.com/frankmcsherry/timely-dataflow) and how that might obviate the need for a GC with the various JVM RAM-hogs which dominate that field.
There are also many areas where people work (many of whom provide the tools that programmers of GC'd languages use for their jobs) which can't admit a garbage collector.
For example, I currently deploy Django code (running on an interpreter that needs to implement, not run on top of, a GC) to a machine with a Linux kernel, running nginx, backed by another machine running PostgreSQL, with caching in Redis. None of those very important tools could reasonably offer the performance needed if written in a garbage-collected language.
For another example, I'm typing this (quite lengthy) response in a low-latency application (a browser) which would also be difficult to implement in a garbage-collected language.
About GC: it would be nice if there were some kind of standard-ish implementation framework for a GC in an LLVM language.
LLVM has been enabling fantastic new programming languages, and while it has support for a GC, I have not found a GC library that would be easy to embed in a new compiler/runtime environment.
Now there are dozens of LLVM-based languages (or language prototypes) that have different, incompatible implementations of GC with varying degrees of quality. If there was a relatively simple but efficient GC available, it would be much easier to implement a new language on LLVM.
At one point there was a project called HLVM, but it was targeted at implementing JVM and .NET-style virtual machines. This is not what I'm looking for, and I think the project is dead now.
If anyone knows about a GC implementation for LLVM, I'd really like to take a look. If it's a part of a programming language project but would be relatively easy to rip out of the rest of the compiler/runtime, it's not a problem.
It is very hard to have a general purpose GC library, because the best GC algorithms require a tight cooperation between compiler, GC and language semantics.
For me, the only viable alternative to GC are substructural type systems like in Rust's case.
You're definitely right and it's not an easy task.
However, I think there's a sweet spot where you could implement a fairly nice boilerplate/framework that would be an 80% solution to the problem which would be a vast improvement over the current state.
The missing 20% would be language specifics and that would be either solved by forking the boilerplate code or writing some kind of callbacks for discovering references given a root object.
edit: Additionally, there's no simple example of using a GC with LLVM. It would be very helpful if there was, for example, a GC'd version of the Kaleidoscope language used in the LLVM tutorials. Even a trivial Lisp-style cons/car/cdr object system coupled with the simplest possible mark'n'sweep GC would be good.
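To sketch the callback idea mentioned above (all names here are invented for illustration, not from any real GC project):

    // The language frontend registers one descriptor per managed type; a
    // generic mark phase only needs these callbacks to walk the heap.
    pub struct TypeDescriptor {
        pub size: usize,
        // Called by the collector for each object of this type; the callback
        // reports every embedded reference via `visit`.
        pub trace: fn(object: *const u8, visit: &mut dyn FnMut(*const u8)),
    }

    // Example descriptor for a hypothetical cons cell with two reference fields.
    fn trace_cons(object: *const u8, visit: &mut dyn FnMut(*const u8)) {
        unsafe {
            let fields = object as *const *const u8;
            visit(*fields);        // car
            visit(*fields.add(1)); // cdr
        }
    }

    pub fn cons_descriptor() -> TypeDescriptor {
        TypeDescriptor {
            size: 2 * std::mem::size_of::<*const u8>(),
            trace: trace_cons,
        }
    }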
The Boehm collector (http://www.hboehm.info/gc/) appears to offer what you're describing. In fact, it looks like it's what Crystal added in 2013 to implement its original garbage collection.
"About GC: it would be nice if there were some kind of standard-ish implementation framework for a GC in an LLVM language."
Not quite LLVM, but take a look at the Eclipse OMR project.
OMR intends to provide a set of reusable components - a GC, a port library, and, given more effort, a JIT - that can be reused in existing language runtimes or used to build a whole new language.
Perhaps that's not the right way to phrase it. I guess I'm mostly thinking of compiler plug-ins, which are very unstable right now - which means that most users probably never write procedural macros in their own code. (I certainly don't.)
There's also syntex (https://crates.io/crates/syntex), which basically provides compiler plugins for stable rust. It does so via code generation though.
I have no problem with GC, but I want to see reasonably complicated benchmarks that actually show that it's "fast as C". Because I don't believe that at all.
Are you saying the GC is as fast as C? I bet it's not.
That being said, the programs I've built in Crystal "feel" very fast. Here are a few random performance tests, if you're asking about overall performance:
The biggest problem with GC, it seems, is the non-determinism it introduces into the program's behavior. Otherwise, garbage collection, being 'lazy', is in fact a more efficient way of releasing unused memory compared to how it is usually done in C and, especially, C++, where memory is released 'eagerly' (e.g. as part of the destructor), thus wasting precious machine cycles on something that may not be even necessary at all.
> wasting precious machine cycles on something that may not be even necessary at all.
I'm not familiar with very many scenarios where one has a garbage collector but doesn't need to free some piece of memory when it's no longer used. Could you clarify what you mean here?
For most GC algorithms (but not ref-counting), the time complexity is O(N) where N is the number of surviving objects (or it could be the number of surviving edges/references, I forgot!). For manual/deterministic/eager memory management, the complexity is O(N) where N is the number of allocated/freed objects.
So if the number of survivors << the number of allocated objects, which it always is in many functional languages, then GC can be faster than manual memory management. Especially if you use a copying GC algorithm which makes allocation extremely cheap.
This is true for those compute job types which are mostly 'CPU bound' and which usually create most of the objects that they need at the very beginning; these objects would not be released until the job is finished anyway. I admit that in this case it may take some thought and deliberate effort on the programmer's part to avoid creating many short-lived objects.
Nim[0] takes an interesting approach. It uses "deferred reference counting", effectively allowing GC cleanup to be spread out over a period of time. This at least helps with GC pauses.
It also seems to allow tweaking for soft realtime systems, e.g. games.
I've never understood people's abhorrence of sigils. They are useful shorthand for very specific concepts. It's like hating apostrophes in English. You can do away with them entirely, but I don't think most people would consider that an improvement.
Now, the greater density of concepts that shorthand notation allows can be abused, and too much of that often shifts the cost-benefit ratio further to the cost side for all but the most expert in the language - but that's a problem of excess, not one inherent to their use at all.
I don't personally think any of them should go away. I just note that many newcomers to the language feel they're opaque. Having written a good bit of Rust now, I do sometimes find the sigils impact readability (especially in macro_rules).
I think its target is more along the space where Go is: as little code as possible (like a scripting language), but still speedy. So in true scripting-language form, you don't have to worry about collecting your objects, ever. I'm not aware of any benchmarks on the cost/hit of this, but I do know it uses the BDW GC, which is hopefully pretty battle-tested...
There seems to be a lot of GC hate in this thread. I wonder how many people are aware that Unreal Engine, the dominant multiplatform game engine used to create the majority of AAA games, uses a GC.
Yeah, it's weird how Rust has suddenly made me look for no-GC languages everywhere I look. It opened up a whole new desire to not accept no for an answer in that regard. I have this burning thought in the back of my head that there just has to be a simpler way to offer it than how Rust does it, too.
Memory management is difficult, extremely difficult, to get correct in the way Rust does. I don't think I've seen a leak or bad dereference in years. The only way it really manages this is by tying references into what amounts to a proof assistant. Every simpler method I can think of either sacrifices capability (e.g. no references at all; only RAII + copy on write) or becomes a GC with all its wonderful trade-offs.
GC isn't terrible, though. Azul has struck an amazing balance between latency and eagerness; even if you can't afford it, the technology does exist. If you don't have latency, memory restrictions, or embedding requirements, Rust may be overkill.
> Memory management is difficult, extremely difficult, to get correct in the way Rust does
Memory management isn't hard --- you just need to pay attention to detail and not say "YOLO, let's abort on OOM" like the Rust stdlib does. Rust is an unacceptable language for anyone who cares about robustly responding to heap exhaustion.
"Memory management isn't hard --- you just need to pay attention to detail"
You're quite right. The problem is that every bit of attention you spend on that detail is attention that you're not spending on details that are actually solving your problem.
I programmed in C for decades. I do not miss malloc() and free() in the least.
(I still do use C when the situation warrants, but the situations where it is warranted are becoming rarer and rarer with each passing year).
- deregister threads so that they are not stopped by GC
- eventually avoid the runtime altogether
There really is no realtime system that D can't do.
The whole anti-GC thing is a giant strawman that treats all GCs as stop-the-world, unavoidable, and overarching. Academia decided in favor of GC decades ago, and industry has been following suit for good reasons: the mental overhead associated with finding owners for everything.
It is not. Languages that are designed to run a GC get crippled when run without: depending on the language you might lose access to closures (if dynamically allocated), rich data structures (standard dictionaries, lists, etc), sometimes even aggregates (if boxed by default) and you have to resort to using arrays of primitive types.
As they say, you can write FORTRAN in any language, but you don't necessarily want to (this is unfair to modern fortran which I hear is actually a decent language).
You speak as if a GC is unconditionally better than the alternatives and is a solved problem, but using a GC has issues as well.
On the theoretical side, not reasoning about ownership means sharing data between threads is done with copies (slower) or locking (slower and error-prone); if you know about ownership, you can share references to data for free while it can't be mutated.
Ownership is also important for any non-memory resource (file handles, mutexes, etc). GCs release those "whenever", maybe never, unless you close manually.
And even though manual memory management has some small non-deterministic overhead for heap coalescing (which one can usually work around with pools), most GCs I've worked with add measurable overhead. This equates to more cost per server, more load, more battery drained, higher response times...
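As a small illustration of the "share immutably for free" point, here is a Rust sketch using scoped threads; no copies, locks, or GC are involved because the compiler can see the data is only borrowed immutably while the threads run:

    use std::thread;

    fn main() {
        let data: Vec<u64> = (0..1_000_000).collect();

        // Each worker holds only a shared borrow of `data`, so nothing can
        // mutate it until the scope ends.
        let total: u64 = thread::scope(|s| {
            let handles: Vec<_> = data
                .chunks(250_000)
                .map(|chunk| s.spawn(move || chunk.iter().sum::<u64>()))
                .collect();
            handles.into_iter().map(|h| h.join().unwrap()).sum()
        });

        println!("total = {total}, still usable: {}", data.len());
    }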
> On the theoretical side, not reasoning about ownership means sharing data between threads is done with copies (slower) or locking (slower and error-prone); if you know about ownership, you can share references to data for free while it can't be mutated.
I don't think it follows; it's rather the reverse: if what I share has a global owner (i.e. the GC), I don't have to lock or copy by definition: once it stops being reachable it will be collected. That's why some lock-free algorithms are enabled by a GC.
With ownership you would have to have a unique owner, or reference counts.
GC does require write barriers or stop-the-world though so let's say it's a draw :)
> Ownership is also important for any non-memory resource (file handles, mutexes, etc). GCs release those "whenever", maybe never, unless you close manually.
Yeah, it's a big problem that the GC even attempts (poorly) to close them. But D has scope guards and RAII built in, so for the 50% of resources that aren't memory you do still have to think about ownership. That's more complicated than the C++ situation. But it does not prevent realtime use; you may well find yourself having more time to optimize :)
> if what I share has a global owner (i.e. the GC), I don't have to lock or copy by definition
Then how do you avoid data races? Two shared references which can mutate your shared data require either a copy, a lock, immutability, or a single writer.
I use "parallel foreach", sometimes worker queues, implicit single writer... like in C++.
It sounds like you think only Rust-style ownership can avoid data-races. Sure, if you want the type system to do it. For me discipline is enough and I've seen it work in teams too. Not seeing such a problem really.
Nothing specific to Rust, although Rust encodes in the type system what I usually have to keep track of mentally, which is nice.
Trivially parallel algorithms do benefit from constructs like "parallel foreach" and implicit single writers, but in general one either has to stick to those models (where the cognitive overhead is low and manageable) or, if one ventures into more complex territory, accept higher mental complexity (ownership or locks), performance degradation (copying), or immutable data (if it fits your problem and doesn't decrease performance, win-win).
My argument is simply that GC doesn't fix everything, and the mental overhead of tracking ownership of memory (to me) isn't a huge burden, especially since I have to do it for non-memory resources and memory resources shared between threads already.
I'm not against a GC - I like languages that mix GC and non-GC side-by-side - because sometimes I do want to just forget about my memory, but only if it fits my problem domain. But I don't think GC beats non-GC hands-down for-all-cases.
Yes. I'm currently trying to think of a good way to add it to Myrddin, which currently makes it more or less manual. Doing it in a simple way is a tough problem.
The simplest solution is to add the moral equivalent of 'null' -- objects that transition to an idempotently destructible state, which solves a lot of complexity with the data flow and analysis (yay!) at the cost of some safety (boo), and nulls (louder boo).
My language (Lily) handles the problem by trying to avoid the gc where it can.
Lily is statically-typed, built-in classes can't be inherited from, and there's no C-like casting.
With those rules in mind, most objects can't become cyclical. It's impossible for a list of strings to loop back onto itself, for example. It helps that the value classes backing enums (like Option and Either) are immutable, which I so far suspect prevents a cycle.
That at least allows you to group classes into three groups:
One of the goals I have is to keep the required runtime absolutely minimal, as well -- I'm ok with the compiler inserting some user-defined code in the appropriate places to initialize or release values, but I'd like to avoid growing the required code in https://github.com/oridb/mc/tree/master/rt unless it's absolutely necessary.
And yes, that ~60 lines per platform is really all that's needed. (And actually, I should be able to merge more of it for SysV platforms.)
So, I've thought about a GC, but I'd really prefer not to have it.
The claim is "fast as C", so I was surprised that the performance comparison was with Ruby, not with C. On my machine, the Ruby Fibonacci program executes in 47.5s, while a corresponding C program executes in 0.88s - that's a factor 54 difference, while the article reports a factor 35 for Crystal. That's good, but what causes the difference? This benchmark is pretty much all function call overhead, so I doubt it's representative of real performance-sensitive code.
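(Presumably the benchmark is the classic doubly recursive Fibonacci; a sketch in Rust, purely to show why it measures little beyond call overhead:)

    // Almost all of the runtime here is function-call overhead and integer
    // adds: no allocation, no data structures, nothing representative of
    // real performance-sensitive workloads.
    fn fib(n: u64) -> u64 {
        if n < 2 { n } else { fib(n - 1) + fib(n - 2) }
    }

    fn main() {
        println!("{}", fib(40));
    }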
The Crystal website itself makes a more modest claim than "fast as C" under its language goals: "Compile to efficient native code", which it clearly does.
All of these "fast as C" claims about modern, high-level, Python-like languages (even when they are statically typed and natively compiled) miss the point. It is mostly the minimalistic and terse programming style that C encourages that makes C programs performant. You avoid allocations wherever possible, you write your own custom allocators and memory pools for frequently allocated objects, you avoid copying stuff as much as possible. You craft your own data structures suited to the problem at hand, rather than using the standard "one size fits all" ones. Compare that to the "new this, new that" style of programming that's prevalent today.
While Nim is my favorite language, I can understand that it has a small userbase, for these reasons:
1. No major backer like Google for Go or Mozilla for Rust
2. No killer feature like "memory safety and performance without GC" for Rust, instead a mix of all the reasonable down-to-earth features I want in a programming language
3. Some unique decisions instead of what you're used to from other languages, for example partial case sensitivity
I am a strong proponent of Nim but this is probably the worst idea I have ever encountered in language development. Honestly!
Partial case sensitivity and the special underscore case are features I can live with. Unfortunately this has actually become a stumbling block for a wider adoption of Nim.
All strange special features should be optional, not default.
What makes you think this is default? It most certainly is not and will be removed completely in the future.
Edit: here is a source: http://nim-lang.org/docs/manual.html#syntax-strong-spaces ("... if the experimental parser directive #?strongSpaces is used..."). The last time this was discussed I said that it would be removed completely, and I still believe it will be. It's simply not a priority for us right now.
Some of those features are default in Nim, some (strongspaces) are not. I say that all such weird features should be optional in general so that newcomers don't get scared off.
Also, case and underscores should work like in C by default, since Nim interoperates with C seamlessly anyway. Case insensitivity and ignoring underscores are OK if optional.
If you're consistent about how you space your infix operators, this will have no impact on your code.
If you put space around some operators and not around others, in a way that doesn't correspond to precedence, you're going to confuse anyone who reads your code, in any language.
I like it for distinguishing homonym operators, but not for the precedence stuff they seem to have there. I'd like something like this though:
let a = 10;
a / 5
output> 2
let b = pwd();
b/temp
output> Directory<"~/temp">
b / 2
error> b:Directory does not implement method "divide(:number)"
a/temp
error> a:int does not implement method "get(:string)"
I used both nim (back when it was still nimrod) and rust for a while, before eventually settling on rust. I tried to give nim a chance, and was told that "they will grow on you" ("they" being the things you mentioned that were "unique decisions instead of what you're used to from other languages"). They never did, and though I got used to avoiding the problems I initially had with them, the language just never "felt good" to me.
wow... have never heard of that partial case sensitivity before.
I think this goes beyond syntactic sugar. Holding the hand of the developer too much?
Personally, as a Python programmer I like interfacing with C++ code like Qt via PyQt. If I see a camelCase method I know where it came from, but if I see a PEP-8 style name or method I know it's our own code, not from Qt.
Shameless plug: is it considered totally uncool in 2016 for one to be developing a memory-unsafe, manual MM, non-OO, thread-denying language that preserves most of the C semantics?
I'd be very interested in a language that is roughly as low level as C, but has some obvious warts "fixed" while still being able to run on bare metal or with a minimal runtime system. I also don't care about a standard lib as long as I can call open(), close(), read(), write(), socket(), etc.
Native threads are another requirement for me.
Things I'd like to see in a language:
- compile to native executable
- type inference
- module system without header files
- easy to call into native C code, and export functions so they can be called from C or any other language
- first class SIMD structures (this is missing from Rust!), so that you don't have to duplicate code for sin4f and sin8f (which would be line-by-line equal, except types)
- perhaps some kind of modern polymorphism (ie. not class based OOP)
- can target GPUs via LLVM or SPIR-V
- memory safety is optional, but nice to have. I'd be mostly interested in using this kind of language for GPU kernels and tight inner loops, where you wouldn't be allocating anyways
I have a bunch of design ideas and prototypes in my drawer waiting for a lot of free time and inspiration appearing.
I like my tools sharp, even if it means there's going to be blood occasionally.
My next big endeavour with Quaint will be to create a clean module and linking system (without header files or any textual inclusions). Each source file will be transformed to a corresponding unit which contains code, data and exported type definitions. The linker would then merge these units and produce a native executable that runs your program in the self-hosted VM which will be a part of that executable. Pure native compilation or LLVM integration is too much of a hassle for me at this point.
One of the virtues of the language would also be the direct correspondence between the HLL code and the emitted VM instructions, without any optimisation passes. This makes it much easier to reason about code performance and to write code which performs consistently and predictably (albeit a bit slower).
Nim fits everything you ask, except for "can target GPUs via LLVM or SPIR-V". Even that may eventually be fixed by having OpenCL C as a compilation target.
Also, I am not sure what you mean by "first class SIMD structures", but you can definitely have a single definition for sin4f and sin8f if they are line by line equal except types, by using union types.
Nim is definitely on my short list of languages to learn, however...
Targeting GPUs is a deal-breaker. I'm sure the Nim compiler would be pretty easy to retarget to GPUs via SPIR-V (the new binary IR for Vulkan/OpenCL shaders and kernels) or OpenCL/CUDA C. But I don't think that would work for Nim's runtime system or existing Nim libraries (including any standard libs it has).
Also, Nim's pauseless, low-latency automatic memory management (I guess you can call it a "GC") is very interesting, but it's not what I'm after.
> Also, I am not sure what you mean by "first class SIMD structures",
I mean this:
def multiply_and_add(a : <n x f32>, b : <n x f32>, c : <n x f32>) : <n x f32> {
    return (a*b) + c;
    // TODO: figure out how to use "madd" from FMA4 or NEON instruction set
}
The trivial piece of code above should be "generic" so that it can be called with any width of vector.
Now, the example above is very trivial, but more complex examples might pose challenges for a correct implementation of the type checker. In particular, vector shuffles (i.e. the equivalent of __builtin_shufflevector in the GCC/Clang vector extensions) would need a strange type. Shader languages typically use a syntax like `myvector.wxzy`, which might work.
This might be possible with an ungodly mess of C++ templates and explicit template specialization for each vector type (while hoping that the compiler is aggressive enough about inlining). But I'm not really a fan of template-heavy C++.
In fact, the kind of solution I've been thinking about would be semantically similar to what I'd do with C++ templates.
> but you can definitely have a single definition for sin4f and sin8f if they are line by line equal except types, by using union types.
I'm not familiar enough with Nim's union types to be sure, but my guess is that this would not compile to efficient low level code apart from the most trivial of circumstances. This is my (not very) educated guess based on other high level languages with some concept of union types.
Anyway, Nim is a very cool language that I will check out sometime in the near future. It just isn't what I'm looking for in my very specific use case.
A union type in Nim can only be used in function arguments, and it does the obvious thing: when you actually call the function, it specializes to the type you are calling with. Think of templates in C++, where the type parameter can only assume one of two (or more) values. Hence it generates exactly what you would write by hand, but the syntax is much less messy than C++ templates.
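For what it's worth, Crystal (the language this thread is about) behaves similarly when you use a union restriction or leave types off entirely: the method is instantiated per concrete argument type, much like an implicit template. A tiny made-up sketch, not taken from the Nim or Crystal docs:

    # One definition; the compiler specializes it for each concrete type used.
    def scale(v : Float32 | Float64, factor)
      v * factor
    end

    puts scale(2.0_f32, 3) # instance specialized for Float32 -> 6.0
    puts scale(2.0, 3)     # instance specialized for Float64 -> 6.0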
You might also be interested in Jai [0] which has many of those things but is not a 'real language' yet or possibly ever. Lots of interesting ideas though.
Thanks, I've read about it before, but haven't spent too much time looking at it.
However, this "single program, multiple data" isn't exactly what I'm looking for (it would solve the sin4f vs. sin8f issue mentioned above, though). I need explicit, low level access to SIMD, coupled with genericity over vector widths. This means doing almost assembly-style SIMD code with explicit shuffles, blending, etc as well as access to intrinsics where needed.
I also need portability (ispc is from Intel, it probably doesn't support ARM NEON) and targetting GPUs.
I'm very well aware that my needs are very specific. I need to do math stuff for 3d graphics and physics applications.
All I need is for a lot of free time to appear from out of nowhere and I can write a prototype compiler for this myself :)
See example above in this thread. In C + GCC vector extensions, I just use normal arithmetic operations (+, -, *, /).
However, specific intrinsics are tied to a specific vector width. It might take some "library code" to take advantage of instructions like dot products, etc.
The only thing I would add would be that compared to Ruby, Nim still takes you quite a while to put something together, so defaulting to Ruby isn't necessarily a great idea.
The first benchmark ("Nim vs Rust") I looked at says
> Rust regex! runs faster than Regex
which is a very old claim - Regex should now be much faster than regex! ever was. Any pre-1.0 Rust benchmarks are probably wrong (to be fair, most benchmarks are probably wrong anyway).
A lot of them did show the other way, then came to the Rust people's attention and improvements were submitted; both languages' implementations have also changed over time.
If you don't have to write cutting-edge games or embedded software for tiny systems, why do you have to care about allocations at all? Today's systems and RAM are so fast that garbage collection doesn't really matter in most cases. Consider SBCL (compiled Common Lisp), which is almost as performant as Java and C++.
I used to develop software in C and C++ for many years, and a garbage collector was the thing I wanted the most. GC-free programming is unnecessarily tough in most cases, except when you desperately need it, for games and embedded systems.
> If you don't have to write cutting-edge games or embedded software for tiny systems, why do you have to care about allocations at all? Today's systems and RAM are so fast
What? RAM is not fast at all; latencies have barely improved in 20 years (compared to the improvement of other subsystems like the CPU, of course).
Because allocating something on the stack just means adding an offset to a pointer and puts the object somewhere that is likely to be in cache for the duration of the function, whereas allocating something on the heap usually involves traversing the allocator's data structures and puts the object far away from the other data you're using.
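To put that in this thread's terms: Crystal draws exactly this line between structs (value types, typically stack-allocated) and classes (references allocated on the GC heap). A minimal, made-up sketch assuming only the standard library:

    # Value type: instances are stored inline / on the stack.
    struct PointValue
      getter x, y

      def initialize(@x : Int32, @y : Int32)
      end
    end

    # Reference type: every `new` is a heap allocation the GC has to track.
    class PointRef
      getter x, y

      def initialize(@x : Int32, @y : Int32)
      end
    end

    def sum(n)
      total = 0_i64
      n.times do |i|
        a = PointValue.new(i, i) # no heap allocation, no GC pressure
        b = PointRef.new(i, i)   # heap allocation on every iteration
        total += a.x + b.y
      end
      total
    end

    puts sum(1_000_000)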
Also, using numbers from a benchmark game is not representative of the performance of real world applications. If you look at the code, you'll find that it's written in a style that avoids heap objects and GC wherever possible. Forcing heap allocation is what makes Java slightly slower than C in many cases.
Nitpick: everything you say is probably correct, but such performant C programming is also the very opposite of a "minimalistic and terse style".
Which one is more minimalistic, 'new Foo' or a collection of various custom-tuned allocation methods? Which one is more terse, 'myList.Where(foo).Select(bar).Aggregate(baz)' or an explicit for loop?
It is minimalistic in the sense that the language provides a narrow set of primitives and a skilled programmer combines these primitives in the most sensible way to solve the problem at hand. Higher level stuff in most other languages is much more generic.
Indeed, it may not be minimalistic in terms of the code size.
> All of these "fast as C" claims about modern, high-level Python-like languages (be they statically typed and natively compiled) are missing the point. It is mostly the minimalistic and terse programming style that C encourages that makes C programs performant. You avoid allocations wherever possible, you write your own custom allocators and memory pools for frequently allocated objects, you avoid copying stuff as much as possible. You craft your own data structures suited for the problem at hand, rather than using the standard "one size fits all" ones. Compare that to the "new this, new that" style of programming that's prevalent today.
Exactly! I cannot agree more.
I have a small test program I port to different languages to test the length of the code and the speed of the program. Of course it only represents a single use case.
* C is first, of course.
* twice as slow, come Pascal, D and... Crystal!
* x3 to x5, come Nim, Go, C++ (and Unicon).
* x6 to x9, come Tcl, Perl, BASIC (and Awk).
* x15 to x30, come Little, Falcon, Ruby and Python.
* x60 to x90, come Pike, C#, Bash.
* x600 to x1000, come Perl6 and Julia.
This list looks byzantine, I know :-) The trends I can get out of it:
* the last 2 are languages with JIT compilation, and that's horrid for short programs.
* the "old" interpreted (or whatever you name it nowadays) languages (Tcl, Perl) are not so bad compared to compiled languages, and much faster than "modern" one (Ruby, Python). (Again, this is only valid for my specific use.)
* compiled languages should all end up in the same ballpark, shouldn't they? Well, they don't. The more nice data structures they offer, the more you use them. The more they encourage a functional style (I mean the tendency to create new values all the time instead of modifying existing ones), the more you allocate, create and copy loads of data. In the end, being readable and idiomatic in those languages means being lazy and inefficient, but what's the point of using those languages if you don't use what they offer? C forces you to build data structures suited to the problem rather than reusing generic ones; it comes naturally. What is unnatural in C is copying the data again and again: it is simpler to modify it in place and work on the right parts of it, rather than passing around whole chunks every time you need a single bit. In more evolved languages, compilation won't save you with some hypothetical magic tricks; it cannot remove the heavy, continuous data copying and moving you instructed your program to do. And that is what made the difference in speed between C on one side, and D, C++ and Go on the other.
I'm curious where Rust fits in your hierarchy, with its emphasis on "zero-cost abstractions". Of course that's more a lofty ideal than reality, but it means that at least in some cases it does much better than C++. Is it a long program? Something that could be posted in a GitHub Gist, maybe?
I don't think your list is quite right. Crystal is probably more on the tier of Go. Julia is also much faster than that, at least up there with Tcl. Of course both vary a great deal depending on what you are using them for.
> I doubt it's representative of real performance-sensitive code
Two data points (one-off timings of a few lines of code doing the same workload) just don't make for a comparison we should spend time bothering about.
Whatever you think of the benchmarks game, I don't see why we need to waste time with comparisons that don't meet that low standard:
From this quote it sounds like the more CPU intensive it is, the more you can expect when comparing to Ruby.
> Remember: The cake is a lie, and so are benchmarks. You won’t have 35x increase in performance all the time, but you can expect 5x or more in complex applications, more if it’s CPU intensive.
For now, if you want a fast language with the beauty and productivity of Ruby, check out Elixir [0] and its web framework, Phoenix [1]. I've been using Phoenix for a year, and it's the first framework that I've actually liked more over time. And I've been a web developer for a decade. With its recent 1.0 release, Phoenix is gaining a lot of momentum.
If you want some idea of the performance differences between Phoenix and Rails, see [2] and [3].
Elixir is not a fast language. Not even close. Yes, it handles concurrency and parallelism beautifully which in turn enable distributed applications to perform quite well. But the language itself is significantly slower than Crystal / Rust / Go / Swift. It's not in the same category at all.
That said, it's a great language worth recommending.
Depends what we mean by fast. I have seen the Erlang VM handle 100k requests per second on a distributed cluster. That's plenty fast. Moreover, fault tolerance means the ability to have better uptime, with fewer people on call. "Fast" can also be measured to include that: if a system does 200k requests per second but crashes at midnight and stays down for a few hours, the average "speed" can be quite low. In a laptop demo that's not visible, but in practice that's money and customers lost.
But if fast means "let's multiply some matrices", then yeah, you can probably use Rust or C for that. It all depends on the problem domain.
Phoenix is slightly faster than Gin (Go web framework) so depending on your use case it can be fast :). Obviously one of the primary selling points is OTP/BEAM VM with all the concurrency, HA, soft realtime etc. but with simpler syntax than Erlang.
It's only faster at IO. Once you start adding CPU-bound work to the equation, Phoenix's performance will begin to drop. It's similar to Node in that regard.
True, any generalisation has flaws :) Even "faster" is relative: do you care about average or worst-case performance? Do you care more about throughput or latency? And on and on :)
Well it does, but it still doesn't do it efficiently. That was the only point I was making. But in general, yes, Elixir/Erlang will handle CPU bound tasks better than node whenever they can be effectively parallelized.
Well what I meant to say is that both JS and Elixir/Erlang aren't very efficient at using CPU resources. Elixir/Erlang does a better job at compensating for this by being able to run many operations in parallel and across different machines, which isn't something you can easily achieve (or should even attempt) with Node.
> I have seen Erlang VM handle 100k requests per second on a distributed cluster
A single JVM server can handle that load; scaling and providing fault tolerance for a server that just accepts requests is trivial these days. Also, if your requests do computationally intensive work, you are going to have a very bad time with Erlang.
> It all depends on the problem domain.
Exactly, and the domain for Elixir/Erlang is way more niche and specialized than applicable domains of other languages.
I don't have anything against Elixir but part of its crowd just advertises it as the best thing for everything.
LFE (Lisp Flavored Erlang) is a great alternative to Elixir if you prefer Lisp over Ruby syntax.
Pony lang is looking to enter the BEAM/OTP arena with its own implementation of supervisors and such. It is actor-based, OOP and is supposedly very fast in the distributed niche.
Okay, so for the given example: on my machine Elixir does the Fibonacci calculation in 12.25s, but if you enable HiPE and compile that module to native code, it comes in much closer to Crystal than you might expect, at 3.34s.
Not bad really for a language that's meant to be slow at computational stuff :^)
Hm, what about HiPE and the ability to natively compile Erlang modules? That's pretty solid technology by now (I think?) and it promised some good results last time I checked. Honestly asking - I did quite a few things with Erlang and a little less with Elixir, but never even tried to use them for raw speed, instead delegating number crunching to other processes via ports.
Oops, you're right that Elixir is not a computationally fast language. If you're looking for fast, raw number-crunching, look elsewhere. That said, real-world performance tends to be extremely good for real-time and networked applications (read: web apps). It's common for requests to be handled in microseconds, even in development.
Is Elixir really fast for web apps though? According to the web framework benchmark [1] the performance of Elixir is pretty bad - it's consistently slower than python and ruby frameworks.
Chris McCord said the following on reddit after the results came out:
"We don't know what caused the errors and unfortunately we didn't have a chance to collaborate with them on a true run. A few months ago they added Phoenix in a preview, but it was a very poor implementation. They were testing JSON benchmarks through the :browser pipeline, complete with crsf token generation. They had a dev DB pool size of 10, where other frameworks were given of pool size of 100. And they also had heavy IO logging, where other frameworks did no logging. We sent a PR to address these issues, and I was hoping to see true results in the latest runs, but no preview was provided this time and we weren't able to work with them on the errors. tldr; these results are not representative of the framework."
From my personal experience going from Rails/Sinatra to Phoenix it feels a lot faster but I haven't done any benchmarks so take that with a grain of salt.
Erlang/Elixir is much, much faster than either Python or Ruby especially when it comes to workloads like your typical web app.
In the Techempower benchmarks the Phoenix tests had a ton of errors and there was no preview run so whoever submitted them wasn't able to fix them. Look at the error column. I assume they'll be fixed in the next run.
Honestly that site's benchmarks have never been accurate for me in production. But that's how benchmarks work I guess - my bad for bringing up benchmarks! My experience with using Elixir/Phoenix for web applications has been extremely positive performance-wise however, and has surpassed or matched Go, Ruby, PHP, and Python consistently in all production scenarios. For the latter three, the difference has been order-of-magnitude.
It depends on the problem domain as always. If you're doing a lot of naïve single-threaded number-crunching, have fun. But Elixir/Phoenix haven't failed me for web applications, even in very intensive situations. It's the first time I've barely had to do any tuning beyond external factors such as network and database queries (which, by the way, Phoenix's Ecto handles very gracefully and explicitly).
This is my experience with every Elixir/Phoenix app I've worked on thus far. I apologize if I made it sound like some sort of universal truth.
It is hard to say how they measure and what they measure. I guess their "multiple queries" benchmark is the closest to real-world use? (Unless everyone expects all customers to line up and send their requests one after another.)
as-in FAQ 1.4 "What sort of problems is Erlang not particularly suitable for? … The most common class of 'less suitable' problems is characterised by performance being a prime requirement and constant-factors having a large effect on performance. Typical examples are image processing, signal processing, sorting large volumes of data and low-level protocol termination."
I'm not sure why you're grouping those four languages, as they overlap very little. Go focuses on latency and network-bound services (and, randomly, Docker for some reason - I'm sure a good one); Rust is a C replacement; Swift is for writing iOS/Mac apps; and Crystal is a newborn. I'd actually call it quite close to Go: a high emphasis on concurrency and services. Single-thread performance matters much less when scaling horizontally is mandatory to smooth latency spikes, and I'm betting that on IO-bound work Erlang would be competitive with Go.
I've never used elixir but I assume it has a similar performance profile to erlang as it shares the vm.
I actually had assumed it was cross platform but there was no reason to use it.
I certainly wouldn't invest my code anywhere near Apple unless that was also my market. Who knows what direction it's moving, aside from in Apple's interest. I'll stick with Rust and Go: between the two I get everything but easy Objective-C interop.
Plus, their design decisions with respect to nullability are... interesting. It's gonna feel gimped by legacy needs for a long time.
I listed a few examples of emerging languages to clarify that Elixir is not in the same class when it comes to computational performance on a single machine. The performance profile is indeed Erlang-like.
Keep also in mind that my reply was within the context of a thread on Crystal. OP sort of sold Elixir as a fast language that we can use now while we wait for Crystal to mature.
My point is not that Elixir is useless. My point is that we must not oversell Elixir as a fast language. Generally speaking, it isn't. It excels at horizontal scaling, which is great, but I wouldn't call it "fast" without proper qualifiers.
Elixir gets plugged so often in other-language threads - whether it's Julia or Ruby or, like here, Crystal - that if it wasn't FOSS I'd have decided it's being astroturfed.
I guess it's a good thing that people like it so much, but it's really starting to feel marketing-y by now.
> that if it wasn't FOSS I'd have decided it's being astroturfed.
That's a good sign!
You know why? Because it has a great community and is very friendly to newcomers. Jose, Eric and the rest of the team made that a priority and it shows. It doesn't just mean being nice on IRC; it also means putting usability first, putting more effort into how examples look, how the documentation looks, and so on.
If Google invents a language and then proceeds to push and sponsor it (paying authors to work on it, organizing marketing, hackathons, etc.), then it is hard to say whether it is popular because of Google's backing or on its own merits.
"Friendly for newcomers"? I tried to write something the other day, and had an Erlang developer friend to help me. The Elixir docs section just says "buy one of these books to get started", which is completely unacceptable, and we (mostly my friend, as I had no idea what anything is) spent an hour trying to figure out how to run a node, with Google not providing any useful answers.
That's as hostile to newcomers as it gets. Contrast this with the Rust book, that gets you from "I have no idea how anything works" to "hey I just wrote a small program!" in a few minutes.
Elixir has a very good getting started guide[0]. I don't know if you have seen it or not. Even in the learning section, it recommends going through the getting started guide; it mentions other books under other resources. I don't know why you think it's hostile to newcomers.
I learned Elixir entirely from the getting started guide and then the documentation. Then, for OTP, I read an Erlang book to understand it well. Elixir's documentation is really awesome and one of the best I have seen.
Jesus... I saw that page. What I didn't see is the "next" link at the very bottom, right above the footer :( There's a whole bunch of stuff I haven't read! That page should be better designed... I think I also skipped the sentence below "running scripts", the one that says "chapter 2", because I skimmed later and it was just "here's where you can ask questions".
I have no involvement in Erlang, but I just looked at that page, and as an outsider I would agree that the navigation could certainly be improved. I think a lot of the problem for me is that there is no distinction between the numbers for "Chapters" and for "Sections", and this confused me as to where I was in the progression.
Maybe using numbers for chapters, letters for sections, and a 1.A notation for the headers? At the least, adding the chapter numbers to the header, so it says "1. Introduction"? Putting the chapter numbers in the URL would help too. So would adding some highlighting in the right column index to indicate the location of the current page.
It seems like fantastic introductory material, but only if people can find it. Usually, the first thing I do when I encounter a paginated manual like that is to search for a "single page" or "print" or "PDF" link. Is there one there that I couldn't find? If not, adding one might be a simple (partial) fix.
Try their IRC channel; they were pretty helpful to me. I'm also sure they'd want to hear about your experience so they can improve their docs and tutorials.
Thanks, I seem to have missed the fact that there are multiple pages on that guide. Weird, because I saw that page again just now and thought "how is two paragraphs a good guide?" before noticing the contents on the side...
I have similar feelings about it, but I have chalked them up to the web-dev crowd, where large numbers of Ruby/RoR devs are flocking to it. It has the same uptake that Ruby/RoR had at certain points.
Case in point: LFE (Lisp Flavored Erlang) was created by one of the original designers of Erlang, Robert Virding, has great support for a small FOSS project, and has true macros, but the popularity of Ruby has rocketed Elixir way ahead in terms of repositories and users. Erlang Solutions has it on the site, but it is not as touted as Elixir. People go with what they know, and let's admit it: Lisp is a great language, but not as popular in the web-dev crowd, sans Clojure (which I don't see as so Lispy).
From the early looks of it, coming out of both industry and academia, Pony looks poised to muscle in on Erlang/BEAM/OTP, Elixir and LFE anyway. I personally don't like the syntax, but syntax is not semantics, and you get over it.
Popularity doesn't always win the day; if you do something a bit further off the main road, there's the potential to earn more researching what you love. Look at kdb+/q devs and jobs, and Haskell has started seeing increased uptake in fintech. Go with what you like, or as Joseph Campbell said, 'Follow your bliss' and the rest will fall into place.
But don't listen to me. I spend many waking moments fiddling with J (jsoftware.com). Not actually the most loved or known PL out there. I think the array languages J/APL/K/Q will have their day due to where software and hardware are heading: Multicores, array processing (GPU/FPGA hybrids, custom computers).
> Case in point, LFE (Lisp Flavored Erlang) was created by one of the original designers of Erlang, Robert Virding, has great support for a small FOSS project, true macros
LFE had no documentation, no tools, no learning resources for a really long time. Compare that to Elixir that focused on those aspects since day one. Furthermore, LFE has reached 1.0 only recently, almost 1.5 years after Elixir, and that has an impact on industry adoption.
LFE also was, for a long time, literally Lisp-flavored Erlang while Elixir attempted from the beginning to bring its own abstractions such as protocols, collections, Unicode support, UTF-8 strings, and they are still pushing it forward: http://elixir-lang.org/blog/2016/07/14/announcing-genstage/
So I think you are selling both languages short. There is much more happening in Elixir besides the "popularity of Ruby" and there is a lot of potential in LFE now that they are focusing on being more than a "lispy" Erlang.
Not sure why people would discuss it in Julia threads, but for Ruby and Crystal: more than 50% of the Elixir community came from Ruby, so basically it's Ruby devs discussing newer alternatives to Ruby.
While the language may not be as computationally performant as some of others mentioned, all the things above lower the barrier to entry for adoption and make Elixir a more attractive language than some of the counterparts. And it's amazing that a language this young has nailed it on these fronts.
Dialyzer uses success types, which don't always let you know when you have a soundness problem. The only thing you can count on is that if it does cry out, there certainly is something you need to fix. It also lacks parametric polymorphism, which means you often lose a great deal of useful type information. The theory behind Dialyzer is impressive, but I was pretty disappointed with it in practice.
Try using Wunderspecs, Woverspecs, and Wspecdiffs. You'll find that Dialyzer catches things that are more the shape of what you'd expect a more typical type-checking system to catch.
That's what Type Variables are explicitly for. They present the necessary semantics for bounded parametric polymorphism. In fact that's really the only reason they exist at all. The rest of the sub-type system works without them, but parametric polymorphism wouldn't work without them.
I don't know if the Erlang/Dialyzer docs cover that specifically, but I know the original paper on Dialyzer and Dialyzer type specs does.
I don't think it aims to replace ruby (they don't even claim any kind of "compatibility" level with ruby, more like "inspired by ruby" I would think), but...I wish it would replace it :)
I don't know if Elixir is mainstream or not, but there are a lot of companies using Elixir in production (including the company I work for). You can find the list of companies at https://github.com/doomspork/elixir-companies
Looks like Play! took #2. I can attest to its efficiency. Just yesterday one of our mobile apps went nuts with repeating requests, and rps shot to 53K for a few minutes, but no one noticed, as our API max response time didn't break 15ms.
I'd like to put out there that Crystal is absolutely awesome. The language itself is Crystal clear, but the language documentation and API documentation - oh my!
I had never worked with compiled languages before I tried Crystal, but had always had a huge interest in getting into that. When I wanted to learn the compiled ecosystem I looked at languages like Go and Rust, but the learning curve for those was a bit overwhelming for a newbie. A while later I found Crystal, and much thanks to the simple syntax of the language I learned a ton of new things about compiled languages very quickly.
The absolutely best part of the language is that it is written in plain Crystal, and I've been looking at their own implementations for various things a lot - something I've never done before, having worked mostly with Node, Lua and PHP before.
Nowadays I can delve into Go documentation, packages are clear to me and I just understand how things should and should not be implemented to achieve a good efficiency and performance level. The Little Go Book makes sense, the I/O package is simple and this is probably all thanks to the syntax of Crystal, the amazing language & standard library documentation but most importantly the source of Crystal being written in Crystal.
I'm currently working on building a business using Go, because I absolutely need Windows target support - something which Crystal does not yet have. But the second it gets that, I'm moving back. Don't get me wrong, Go is really great and nice to work with - but Crystal is my mentor. Please note that I have not worked with Ruby before, so the whole language was new to me.
To summarize: even if you only wish to learn, Crystal is in my personal opinion the best choice to go with.
Maybe the makers of Crystal need to take a leaf out of Go's book. Despite the Go creators not being windows users (AFAIK), they support Windows as a primary target, to help adoption, I assume.
Of course the Crystal people probably don't have the same number of developers working on it as Go did even early on.
The plan is to support windows before 1.0, although the core devs are reluctant to "open up" to the large amount of developers windows support would bring before they are finished making breaking changes.
Crystal supported Windows (albeit unofficially, and it was an epic adventure to compile it, and I've never tried running any serious code on it), however all support practically ceased when it switched from libpcl to inline ASM for coroutines.
In my experience, Golang's Windows support is actually fairly poor. I think it is largely to do with the clunky interface that is CGo. I found Rust to be less "warty" by a large margin with good library support in a lot of areas. Your mileage may vary, of course.
This is not what I experienced so far. I use Golang on both Linux and Windows (50/50) and I didn't have any problems. What are you referring to exactly?
As I said, YMMV but the biggest standout that I remember was trying to play sounds using the windows API. There weren't any packages our team could find that really did it correctly, the one that everyone pointed to leaked memory like crazy and it was hard for us to track down since it relied on bouncing back and forth between C and Go so frequently (making malloc/free lifetimes difficult to follow because C can't know what things have been GC'd by the Go runtime)
I meant the packages of the stdlib, not community-made packages. A great example is what I learned about the IO module and how it plays with the other modules in Crystal's stdlib: I could see similar behavior and methods in the Go implementation, but I wasn't able to fully understand them before I had a go with Crystal.
The way they interact with different types, fibers and allocation on the stack vs heap, etc. Makes sense?
Edit: To give you an example of how friendly the Crystal lang & API documentation is to developers unfamiliar with the language, let's look at the Iterator: https://crystal-lang.org/api/0.18.7/Iterator.html
It comes with a great "introduction" to what it is, what it does and gives an example of the advantages it has over the Enumerable. It also explains how you can implement your own Iterator.
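For anyone who doesn't click through, the flavour of that page is roughly this (method names as I remember them from the 0.18-era API, so treat the details as approximate):

    # A lazy pipeline over a huge range: only the elements actually
    # consumed are computed, unlike the eager Enumerable versions.
    iter = (1..Int32::MAX).each.select(&.even?).map { |i| i * i }
    puts iter.first(3).to_a # => [4, 16, 36]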
The Fibonacci comparison is a poor example of performance gains because Ruby is using large number data types to ensure the correct result. This is according to the crystal language website itself.
"However, Crystal might give incorrect results, while Ruby makes sure to always give the correct result."
I actually think it's a good example for just that reason. Ruby will automatically use those big-number types in production applications just as it does in the Fibonacci example, so it's demonstrating that difference and the effect that it has.
The first thing I look for whenever I run across another "C competitor" is which language it is implemented in. Usually that's C or C++, but this time it does look like Crystal is implemented in Crystal, which is, to me, a very good indication that this is a "real" systems language.
I believe Rust is also implemented in Rust, and Go, after a few years of being implemented in C, now has a compiler written in Go.
The rationale is that once you have a programming language that is implemented in itself (usually meaning that the compiler/interpreter is written in that language), it means your language has tackled and can deal with many of the issues that "systems languages" deal with - mainly dealing with the OS at a lower level, dealing with the CPU/RAM, etc.
This is a large problem space that you can gloss over by using C as the layer of interaction between your language and the underlying machine, but that (a) makes your language not truly a "systems language", and (b) ties you to C's philosophy, API/ABI, calling conventions and so on.
I've done some coding in Crystal when I have Ruby scripts I really need to run faster. I generally see about a 5x performance improvement over Ruby.
It certainly is great to be able to jump right into Crystal coming from Ruby. It isn't very hard to convert most Ruby code to Crystal -- you just have to go through and "typify" everything. A few methods have different names and of course some don't exist but most of it is there.
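To illustrate what "typify" ends up meaning in practice, here's a made-up example (not from any real codebase): the body is the same code you'd write in Ruby, and the only Crystal-specific parts are the annotations on the signature.

    # A word counter, written as you would in Ruby, plus type annotations.
    def word_counts(words : Array(String)) : Hash(String, Int32)
      counts = Hash(String, Int32).new(0)
      words.each { |w| counts[w] += 1 }
      counts
    end

    puts word_counts(["to", "be", "or", "not", "to", "be"])
    # => {"to" => 2, "be" => 2, "or" => 1, "not" => 1}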
My one gripe with Crystal, however, and why I haven't adopted it more generally, is that many of the "Lisp-like" features of Ruby are all but lost. Crystal makes up for some of this with macros, but it doesn't quite cut it. For example, you can't splat an array into a lambda in Crystal. Arguments have to be tuples, which are compile-time bound. Little things like this feel very limiting to an experienced Ruby developer.
Scala has type inference with strong static typing, and a useful REPL. A Crystal REPL should be doable, unless Crystal's creators made some bad choices in the design of the type system that Scala's creators did not.
My intuition is also that a Crystal REPL is technically possible. But FTR, the two languages implement subtly different type restrictions.
Scala has type inference, Crystal has optional typing. In Scala, there are certain situations when the type is discernible by the compiler, and can be omitted. For example
val x = 1 + 2 + 3
the compiler infers that x is an Int. However, omitting type information in Scala is the exception, not the rule. Method and function signatures, for example, must have type annotations.
In practice, Crystal also infers types. But in Crystal you can omit almost any type annotation, including on method and function definitions. This probably poses a different challenge for the compiler authors. The Type Restrictions section provides some more examples: https://crystal-lang.org/docs/syntax_and_semantics/type_rest...
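A small sketch of that difference (made-up code, not from the Crystal docs): nothing below carries an annotation, and the compiler still works out a concrete type for every call.

    # No annotations anywhere; the compiler infers types per call site.
    def add(a, b)
      a + b
    end

    x = 1 + 2 + 3         # x is inferred as Int32
    y = add(1.5, 2.5)     # this instantiation of `add` works on Float64
    z = add("foo", "bar") # and this one on String (String#+ concatenates)

    puts x, y, z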
I'm the author of Kemal (kemalcr.com), a simple, fast and modern web framework for Crystal.
We've been using Crystal in production (at Protel) for more than 6 months for some heavy-load APIs (100-200 req/s). We've replaced our Rails API running 64 Unicorn workers with just 1 Kemal process, and it's not even breaking a sweat while consuming nearly 100x fewer resources and 30x less CPU.
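For context, an endpoint in Kemal is about as small as it sounds. This is roughly the hello-world from the Kemal docs (from memory, so treat the details as approximate), not the production API described above:

    require "kemal"

    # A single route; `crystal run` this and Kemal serves it on its default port.
    get "/" do
      "Hello from Kemal!"
    end

    Kemal.run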
My biggest concern if I were to try to use Crystal in production would be the lack of "googleability". I'm guessing your team is very familiar with the language, so it's not as much of a problem that you can't Google a problem when it happens. Do you feel that this is a major roadblock for people new to Crystal, or are the majority of errors very intuitive? The second thing is that, according to the site, Crystal is in alpha and sometimes makes breaking changes. How often do you have to refactor after an update, and how difficult has it been to spot and fix something broken by an upgrade? I'm very interested in using it myself but can't bring myself to commit an app to it.
- Firstly, we are a Ruby shop, and most of the time the errors are descriptive enough that we can easily solve them. Also, the IRC/Gitter room is full of helpful people, so you can get instant feedback about any issue (even a compiler bug). Note that we're not afraid of writing code :)
- I've updated our application once in the last 6 months. It only needed a minor change to add type annotations, which took like 10 minutes.
- Actually, I think Crystal is already at beta quality (for us). Calling it alpha is really an understatement.
I feel like Crystal is one of the best projects out there but it's not getting much attention. What do you think it will take to get people to start writing projects and contributing?
IMO the number one thing will be to start getting as many articles like this out as possible, and go beyond that. TodoMVC/Yelp Clone tutorials would go a LONG way I think, especially since a lot of people will just fork those projects to mess around in them.
Chiming in re. "googleability": Go has the same problem (despite coming from Google :D). What you do is always search for 'crystal lang xxx'. It will get better as adoption increases.
Just to add on that breaking changes are always compile errors, with a nice description of what you need to do to fix it, which makes the breaking changes a lot more manageable.
1. Has the lack of multithreading been a problem in any way?
2. Do you do the standard deployment of putting nginx/haproxy in front of a load of processes?
3. Have you been able to reuse any Ruby code (including gems), or does it have to be Crystal-specific?
1. Most Ruby code can't be used in Crystal, but it is trivial to port.
2. Crystal doesn't use gems (because of 1); it uses shards.
Lack of multithreading is an issue for _ME_ because I'm working on an app that could really benefit from it. Just because Ruby is used almost exclusively for web apps doesn't mean Crystal is. Also, the "load another copy of the entire app into memory" approach that Rails people use is a really crappy way of doing "multithreading".
1. It hasn't been a problem for us.
2. We put Nginx in front of Kemal and it's working like a charm :)
3. Well we haven't tried but it's pretty simple with sidekiq.cr https://github.com/mperham/sidekiq.cr
The lack of scoped includes is a deal-breaker for me when looking at new programming languages.
ex: non-scoped (everything in foo is added to the global scope)

    import foo
    {
        bar.do()
    }

ex: scoped (everything in foo is added to the local scope, and assigned a namespace)

    {
        bar = import foo
        bar.do()
    }
I find it much easier to manage programs where there are no "hidden" global variables. It's especially hard when the included files can themselves include files, which all adds to the global scope.
I don't know why you are getting down voted - the example of Crystal running an existing Ruby file makes it looks like it could be a compiler for unmodified Ruby files. It isn't - it's a separate language - but at first glance that might not be clear.
Cython looks like typed Python. Crystal looks like typed Ruby. Both are compiled, and both are not quite as fast as C, but much faster than the language they're based off of. The point of both is to be able to write something that looks like a clean scripting language while getting close to the speed of C. Was that unclear, or are you just trolling?
Can you make it faster than C, though, please? (Seriously.) I think it might even happen by accident in some cases already. The places where C can be beaten on performance come, in my experience, from design choices in the C standards, from users not understanding or leveraging those choices, and from the architecture of the compilation-unit/link process.
Things like the struct layout rules: instead of the compiler organising fields to be optimal, it follows those rules for memory layout. Or the calling conventions: you often have to use funky extensions to get efficient function calls.
Other things are the lack of ability to hint to the compiler that, e.g., mathematical structures underlie types and can be leveraged for optimisation, or that const or functional purity can be trusted... etc.
Not the answer you would expect but still: it can also happen on large projects where the refactoring or the paradigm change is just too costly in C. The pure, raw ability to make C faster is nice, but if it takes you months of development you don't have, it's pretty useless.
One typical example of this was a few years ago (if I'm not mistaken) in the monitoring world, when Shinken released a Nagios-compatible engine in Python, and basically the reaction in the Nagios community was that the modifications required in Nagios (C) were just too big to be worth it.
The really important questions in any modern language:
(0) Does Crystal have a lot of undefined behavior like C?
(1) Does understanding Crystal programs require a lot of trial and error just like in Ruby?
(2) How good a job does Crystal do at preventing me from shooting myself in the foot?
A language isn't to be judged just by the amazing programs you can write in it. (Turing-completeness and I/O facilities have that covered.) Far more important are the totally stupid programs that you can't write in it.
You could've answered those questions yourself if you actually looked into the language, but you didn't. You instead offered facile and superficial judgement. If you have complaints, state them. Speaking in generalizations just sounds like whining.
To be clear, I have no strong opinions about Crystal and will probably never use it. But comments like yours are simply grandstanding and it's annoying that they are confused for contribution.
(2) wasn't particularly aimed at Ruby, although some “creative” uses of metaprogramming can be very hard to debug. But languages designed for systems programming typically have features that, when used wrong, result in totally catastrophic modes of failure.
Well, you don't have runtime eval or send in Crystal, so some variants of metaprogramming can be ruled out.
However, there are compile time versions of method_missing, delegates and instance_exec available (usually with slightly different naming due to not having the exact same semantics as the ruby counterparts), so it is still possible to do some magic.
Is undefined behavior a thing outside of C? Is that really something you should question, rather than just assume that modern languages have no undefined behavior?
What happens if you try to mutate a shared object from two threads, without using mutexes and locks, in languages without either Rust's compiler-enforced ownership or a global interpreter lock?
Java's collections don't (and can't, without high cost) make any guarantees that they will throw such an exception whenever they are modified concurrently... for one, data races can have weird consequences that aren't easy to detect. On that point, data races in Java are not undefined behaviour, they just have very weak guarantees about what happens (basically isolated loads and stores do reasonable things, and no fancier operations are available) which is another alternative to the Rust approach and the GIL approach.
That's hardly any better than undefined behavior. It's not the program you want to write, under any possible circumstance, and the language should tell you so.
It is significantly better than undefined behaviour, as it results in a noisy failure rather than silently incorrect results. Compile-time failures are best, sure, but I'll always take an exception over what is essentially data corruption.
> It is significantly better than undefined behaviour, as it results in a noisy failure rather than silently incorrect results.
This might be true in a deterministic setting, since the likelihood that a test suite will find the error is very high. But in a non-deterministic concurrent setting, throwing an exception that might be only caught once in a blue moon is just as bad as not doing anything about errors.
An occasional exception and silent corruption the rest of the time is still a lot better than silent corruption all the time - it's much easier to notice in the first place, and as long as you have one instance of the exception you have a stack trace to start from. And in practice the JVM/standard library throws ConcurrentModificationExecption pretty reliably.
The stack trace tells you where one of the concurrent accesses happened. As opposed to data corruption where (if you even manage to witness it in the first place) all you know is that one of the things that accesses that memory got it wrong.
You can look at the code, see why you can't prove it correct, or work backward to see how the code could have gotten to such a state. Like I said, debugging skills.
Of course I can look at the code. Trust me, the reason I introduced the bug wasn't that I was looking at something else at the moment.
> see why you can't prove it correct,
Realistically, this is because the language and the program's design conspire to make proving anything about the program an uphill battle. If the language could perform basic sanity checks (e.g., no attempting to use objects after ownership has been transferred to someone else), then at least I would have a fighting chance to manually prove more interesting properties.
> or work backward to see how the code could have gotten to such a state.
Doing this on a per case basis is an incredibly mind-numbing task.
Pedantically, it's defined behavior, and doesn't use an interpreter lock.
Granted, I'm completely head over heels for Rust, and I agree completely that ConcurrentModificationException is a crappy answer, but it is defined behavior (AFAIK).
It's not defined. The library/VM is very good at throwing ConcurrentModificationException in practice, but it's specified as a best-effort thing, not to be relied upon.
It's true that throwing exceptions is defined behavior, but it's not defined in a way you'd actually want to use. For your actual purposes, it's undefined.
??? Ask any C programmer how much better their life would be if they could wave a magic wand and turn every single undefined behavior in their programs into a segfault. It's not a small improvement; it's a huge improvement.
> UB is simply the latest stick to hit C with. In day-to-day working nobody worries about UB at all as you generally don't notice it.
I agree that in day-to-day work nobody worries about it, and I see odd behaviors all the time because of it. In particular, as fewer applications are written in C, a much higher fraction of C code is systems and embedded code, where you are more likely to accidentally run afoul of choices that were made to compete with FORTRAN on numerical performance.
A read from NULL will crash on most unixen, but will not crash on some targets without an MPU and when running in kernel mode, so may be left lurking (see the linux kernel).
The C89 aliasing rules in particular are completely at odds with a lot of kernel and device driver code. In addition, where int was previously 2 bytes but is now 4, you can get signed overflow where the behavior used to be well defined:
UINT2 x;  // 16-bit unsigned integer
...
x *= x;   // wraps mod 2**16 when int is 16 bits; with 32-bit int, x is promoted to signed int and the multiplication can overflow: undefined behavior.
These are some real-world bugs I've dealt with.
> (and yes, I do have plenty of experience in it, I've been using it for the past 20 years, have you?).
I've been using it professionally for only about 15 years, but I started using C at home in '92.
[edit]
> Same with lack of a GC; this is a plus point for C for most applications, not a negative.
This is a bit of a non-sequitur, as I didn't mention memory management at all. C doesn't need a GC. It could use more memory safety though. There's been plenty of academic research on improving C's memory safety without significant runtime overhead; a lot of those techniques were used in rust. There are plenty of tools that can catch a large fraction of memory errors at compile time, which is a good thing.
The vast majority of all code you write will not invoke UB, most people tend to stick to an 'easy' subset of syntax, unlike say C++ where everyone uses a different subset of features making it in effect multiple languages.
A combination of testing the known edge cases, wraparound issues, size issues, static analysis and tooling means running into an example of UB is extremely rare in most cases.
It used to be that people used dynamic memory allocation to beat C with, but that is just a resource management issue. TBH, this is not rocket science. If you need dynamic memory allocation, you had damn well better know how to use it properly.
It's an example of laziness and people ignoring the machine.
Another example is performance; saying that a language comes within a factor of 2 of C's performance and therefore is fast is absolutely ridiculous. a factor of 2 is huge.
You have to remember that people who write C are dealing with machine specifics day in, day out. We're bit-fiddling and writing MPU code and drivers, etc.
Basically, we're much more aware of the machine than higher-level softies, so what would normally be UB is actually DB in most cases; it's defined by the compiler and hardware that we're intimately familiar with.
...and that isn't to say that you can't write high level abstracted code in C, the simplicity of the language lends itself to (properly) efficient implementation, not efficient in the sense of Java or Ruby ;o)
> The vast majority of all code you write will not invoke UB
That's a bit like saying “the vast majority of the haystack doesn't contain any needles”.
> most people tend to stick to an 'easy' subset of syntax
I'm not sure I understand what you mean. Undefined behavior has nothing to do with syntax. It's strictly a semantic issue.
> It used to be that people used dynamic memory allocation to beat C with, but that is just a resource management issue. TBH, this is not rocket science.
If I understand correctly, the objection isn't that it's rocket science, but rather that you get little help from your tools if you do it wrong. Memory debuggers will only tell you about memory management bugs that manifest themselves in a particular program run. If a bug will only manifest itself under conditions that are hard to replicate, you're out of luck.
Of course, none of this is an indictment of manual memory management per se, nor does it suggest that garbage collection is a universally good solution. But manual memory management has usability issues, which are fortunately being addressed in more modern language designs like Rust.
> If you need dynamic memory allocation, you had damn well better know how to use it properly.
Sure, but are better compile-time diagnostics too much to ask for? Notice that compile time diagnostics don't introduce any runtime performance penalty.
> Another example is performance; saying that a language comes within a factor of 2 of C's performance and therefore is fast is absolutely ridiculous. a factor of 2 is huge.
No disagreement here.
> so what would normally be UB is actually DB in most cases,
As far as I can tell, the trend among C and C++ compiler writers is to optimize programs very aggressively under the assumption that UB simply will never happen, rather than to turn UB into DB.
> it's defined by the compiler and hardware that we're intimately familiar with.
Well, “works on this machine” isn't good enough for most of us.
You're correct... the vast majority of the haystack doesn't contain any needles; that's the point. And you also know where the needles tend to be, and stay away from that area.
I'm not implying that UB doesn't exist, simply saying that using C is a different mindset.
If you use C, you don't just use the language; you use the language, the toolchain and the machine. You're familiar with the whole stack, quite often down to the metal.
The point about memory management is that memory management is just a case of the general problem, i.e. resource management. Resource management is a skill you need to have if you're a softie, and making it easier in one specific case (RAM) is not a generic solution. Better to learn how to do it properly and then apply that knowledge in all situations (files, RAM, power, etc.).
E.g., where is the GC equivalent for power management? Or file handles? It's the same problem in a different domain.
> The point about memory management is that memory management is just a case of the general problem, i.e. resource management.
Wholeheartedly agree. I'm aware that GC is no solution for this problem. But I'm not arguing in favor of GC - I'm arguing in favor of making manual resource management safer, for example like Rust does. Resource management is every bit as manual as in C - the only difference is that the compiler yells at you if you do it wrong.
> The vast majority of all code you write will not invoke UB, most people tend to stick to an 'easy' subset of syntax, unlike say C++ where everyone uses a different subset of features making it in effect multiple languages.
Even integer addition very easily leads to undefined behaviour.
> It used to be that people used dynamic memory allocation to beat C with, but that is just a resource management issue. TBH, this is not rocket science. If you need dynamic memory allocation, you had damn well better know how to use it properly.
If you're going to solve a quadratic equation you should damn well know how to do it properly, by completing the square. But once you know that you should use the formula, because it makes it a lot easier. If you complete the square every time out of pride, you're just wasting everyone's time.
> A combination of testing the known edge cases, wraparound issues, size issues, static analysis and tooling means running into an example of UB is extremely rare in most cases.
Sure. You can do enough work to eliminate it. Or you can use a language where you don't need to.
> It's an example of laziness and people ignoring the machine.
Laziness is one of the cardinal virtues of a programmer
> Another example is performance; saying that a language comes within a factor of 2 of C's performance and therefore is fast is absolutely ridiculous. a factor of 2 is huge.
A factor of 2 is irrelevant most of the time. If you're growing exponentially, a factor of 2 will let you put off the point where you have to start scaling out by maybe a few months. If you're not growing exponentially, you probably won't hit performance limits at all.
> Basically, we're much more aware of the machine than higher-level softies, so what would normally be UB is actually DB in most cases; it's defined by the compiler and hardware that we're intimately familiar with.
Until the compiler adds new optimizations. Sure, if you're never going to upgrade the compiler maybe you can get away with C.
> ...and that isn't to say that you can't write high level abstracted code in C, the simplicity of the language lends itself to (properly) efficient implementation, not efficient in the sense of Java or Ruby ;o)
Without native tagged unions you won't get far up the abstraction ladder. You can write your own with macros sure, but they won't interoperate with anyone else's or any libraries you'd want to use.
(0) It does not. It has a GC, and all the features you'll likely use are memory-safe. You can of course use unsafe methods and raw pointers, but those are documented as unsafe and are usually only used for writing bindings to a C library. Macros, for example, are evaluated and expanded at compile time, so no undefined behaviour can come from those. Methods with different failure behaviour get different names: one variant may throw an exception, another return an error, and another simply return nil. You know whether you are calling `hello!`, `hello?` or `hello`, so no undefined behaviour there.
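A quick illustration of that naming convention, using Hash from the stdlib (the values are made up, the methods are real):

    h = {"name" => "Crystal"}

    p h["name"]     # => "Crystal"
    p h["missing"]? # => nil -- the `?` variant returns nil instead of raising

    begin
      h["missing"]  # the plain variant raises on a missing key
    rescue ex
      p ex.message
    end

    # By convention, `!` variants either mutate in place or raise,
    # e.g. `not_nil!` raises if its receiver is nil.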
How am I supposed to learn the language's semantics from a lexer?
> (2) Not entirely sure what you mean since that is such a broad case, but as stated, Crystal stdlib is mostly safe.
Consider this use case: I spawn five fibers. Can I send the same mutable object to all five? If so, can they attempt to mutate the object without properly taking turns? (e.g., using a mutex)
> How am I supposed to learn the language's semantics from a lexer?
That's not what I wanted to achieve; the lexer just contains some frequently used methods and is fairly simple and straightforward. But if you want to learn the semantics, why not just go to the docs? https://crystal-lang.org/docs/syntax_and_semantics/index.htm...
> Consider this use case: I spawn five fibers. Can I send the same mutable object to all five? If so, can they attempt to mutate the object without properly taking turns? (e.g., using a mutex)
Taken from the docs:
Crystal has Channels inspired by CSP[1]. They allow communicating data between fibers without sharing memory and without having to worry about locks, semaphores or other special structures.
...
Because at this moment there's only a single thread executing your code, accessing and modifying a global variable in different fibers will work just fine. However, once multiple threads (parallelism) is introduced in the language, it might break. That's why the recommended mechanism to communicate data is using channels and sending messages between them. Internally, a channel implements all the locking mechanisms to avoid data races, but from the outside you use them as communication primitives, so you (the user) don't have to use locks.
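For the five-fibers question specifically, a minimal sketch of that channel style (assuming the current `spawn`/`Channel` API), where each fiber sends its result instead of mutating a shared object:

    channel = Channel(Int32).new

    5.times do |i|
      spawn do
        # each fiber does its own work and reports the result over the channel
        channel.send(i * 2)
      end
    end

    5.times do
      # receive blocks, which lets the scheduler run the spawned fibers
      puts channel.receive
    end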
Apparently, Crystal can infer the methods of an abstract class from the methods of its subclasses. In the Animal/Dog/Cat example, what happens if, in a separate module, I define a Snake class that doesn't have a `talk` method? There are several possibilities, sadly all pretty bad:
(0) Does the type checker retroactively decide that not all Animals can talk?
(1) Does the type checker decide that Animal subclasses can't be defined in a separate module?
(2) Does the type checker decide that, if an Animal subclass is defined in a separate module, it must have all the common methods to all Animal subclasses defined in the same module as Animal?
(3) Is type checking not modular?
> That's why the recommended mechanism to communicate data is using channels and sending messages between them.
I'm asking about the errors that the language prevents, not the community's conventions.
> I'm asking about the errors that the language prevents, not the community's conventions.
My understanding is that the language doesn't have native threads yet. The design of multi-threading behaviour, and of which errors the language will prevent, is still being worked out.
> In the Animal/Dog/Cat example, what happens if, in a separate module, I define a Snake class that doesn't have a `talk` method? There are several possibilities, sadly all pretty bad
The abstract class declares, via abstract methods, which methods every class that inherits from it must define. If you write a class that inherits from an abstract class and you don't define a method the abstract class says you absolutely must define, the compiler will raise an error. You can inherit from classes across modules.
> I'm asking about the errors that the language prevents, not the community's conventions.
I haven't actually tried it, definitely something I'll check out another time. I understand your concern though!
> The abstract class defines which methods every class that inherits from it must define by using abstract methods.
In their example, the type checker can infer that Animal has a `talk` method, even if it's never explicitly defined:
    abstract class Animal
      # no talk method here!
    end

    class Dog < Animal
      def talk
        "Woof!"
      end
    end

    class Cat < Animal
      def talk
        "Miau"
      end
    end

    class Person
      getter pet

      def initialize(@name : String, @pet : Animal)
      end
    end

    john = Person.new "John", Dog.new
In their own words, “Now the code compiles:”
    john.pet.talk #=> "Woof!"
Now, what happens if, in a separate module, I define a Snake subclass of Animal, without a talk method? What happens in this corner case isn't documented anywhere.
Type system design is very serious business, and can't be done by mindless trial and error. When a type system has a safety hole, patching it is pretty much guaranteed to break other people's code.
I can't seem to reply to your reply to this, but the error message has nothing to do with where a class is defined.
You can't compile just part of your application in Crystal; there are no linked Crystal libraries. You compile your whole app, with all class definitions. The compiler then has all the type information it needs to know if a method is missing in a subclass when that method is called through a parent class.
Well, that's even worse. Whole-program compilation has positive aspects to it, like enabling more aggressive optimizations, but it should be strictly opt-in.
For future reference: The easiest way to reply to a message that doesn't have a “reply” link below, is to click on the “X minutes ago” link above.
The compiler does some incremental compilation of the generated object files, so compile times stay relatively small. Other than that, I don't think it makes a big difference for a developer. On the other hand, this way you can't get a stale compiled-library conflict where you have to "clean and rebuild".
For example, in Ruby there's no such thing as pre-compiling a gem: every time you run an app, the whole app is "analyzed" (parsed and converted to VM instructions). You can think of Crystal as working the same way, except that it's compiled to native code.
Incremental compilation isn't a good substitute for actual separate compilation. If I implement a module Foo that depends on another module Bar, the only information I need is the interface Bar exposes, so I shouldn't have to wait until Bar is actually implemented to begin implementing Foo.
It will simply stop compiling and say "undefined method 'talk' for Snake". You will get a similar error if `talk` is declared as an abstract method in the base abstract class, so abstract methods are just a standard way to document this and to improve the error messages. There's no safety hole here.
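A minimal sketch of both variants (the error messages are paraphrased, so treat the exact wording as approximate):

    abstract class Animal
      # declaring the method abstract moves the error to the class definition,
      # roughly: abstract `def Animal#talk()` must be implemented by Snake
      abstract def talk
    end

    class Snake < Animal
      # no talk method here
    end

    # without the abstract def, the error only shows up where talk is called,
    # roughly: undefined method 'talk' for Snake
    # Snake.new.talk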
Plus of Nim:
* powerful compiler that can produce C, C++, JS and Objective-C code
* the GC can be completely removed to adapt to the program
* supports parallelism via threading
Plus of Crystal:
* union types let you mix types in almost every data structure, which can leave you wondering whether the language is really strongly typed (see the sketch below)
* so similar to Ruby that porting a 100-line library (with no fancy metaprogramming) to Crystal is often a matter of a few hours
* green threads suit the HTTP request/response cycle very nicely (as in Go and Erlang), where OS threads/processes would consume more memory and CPU
What Crystal still lacks is parallelism, but the core team is working on that.
That said, both are modern, fast, elegant languages with a good standard library and a vibrant community.
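A minimal sketch of the union-type point from the list above (the type annotations in the comments reflect my understanding of the inference, so treat them as approximate):

    # mixing element types gives a compile-time union type, not dynamic typing
    values = [1, "two", 3.14]   # Array(Int32 | String | Float64)

    values.each do |v|
      case v
      when Int32   then puts "integer: #{v}"
      when String  then puts "string: #{v}"
      when Float64 then puts "float: #{v}"
      end
    end

    # values.first.upcase would not compile: the union includes types
    # that don't define upcase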
I think the macro system looks really interesting. The main ideas seem to be that metaprogramming can be done anywhere and that everything is quoted by default in macros. What are these kinds of macros called, and where do they come from?
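For reference, a minimal sketch along the lines of the `define_method` example in the Crystal docs: the macro body is quoted by default, and `{{ }}` splices the arguments back in at compile time.

    macro define_method(name, content)
      def {{name}}
        {{content}}
      end
    end

    # expands at compile time into: def foo; 1; end
    define_method foo, 1

    puts foo  # => 1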
Stick with Ruby then: Rails is great for building monolithic applications easily; Crystal suits a different use (microservices?).
Also, the lack of Ruby's runtime metaprogramming facilities (`send` above all) will make rewriting Rails in Crystal really challenging (pointless?).
Great stuff. I love Ruby. Any idea when this will be production worthy, or at least stable enough to trust on internal projects? Will the language change much going forward?
Actually it's been stable for the last 6 months. No more major syntax changes are expected going forward (apart from parallelism and the GC, which will be handled at the compiler level).
They should really mention that on the Crystal website. I think a lot of people are scared off by seeing the words "alpha" and "breaking changes". Changing those to "beta" and "stable" on the site would help a lot IMO. At the very least, calling it stable would be huge.
Off the top of my head, I would say its similarity to Ruby means that Ruby devs would have an easier time learning the language than learning an entirely new one. Crystal also seems to have an extensive standard library, which Lua does not. (That's not a criticism of Lua, just an observation. I know LuaRocks has made the lack of a large stdlib much less of a pain point for Lua.)
Crystal isn't fit to carry Nim's jockstrap as far as I'm concerned. Ruby's syntax is verbose and illogical compared to Python's. Plus Crystal has a much more restrictive license.
The syntax of a language is about the least interesting thing about it. I don't care about curly braces, end keywords or semicolons as long as the syntax is consistent (and fairly easy to parse, so there can be good tooling). The interesting parts are the semantics, the type system and the runtime features.
Choosing a programming language based on the syntax is like choosing your significant other based on looks alone. You're going to be spending a lot of time together, what's inside is what counts.
I've heard this said numerous times, so I'm going to disagree on the record. I think syntax of a programming language is a very important characteristic.
A language with a nice syntax is easier to learn, easier to read and understand, and delightful to write.
Crystal's syntax is a great differentiator between it and its statically typed, garbage-collecting competition.
Exactly! I guess people used to the Ruby syntax are just desensitized or don't know how much of a joy programming can be without having to type 'end' everywhere.
The end keyword is just a minor detail and can be inserted by a smart editor anyway.
Things like list comprehensions and do notation are examples of nice syntactic sugar.
Significant whitespace like Python's can be nice to write and read, but it's hard(er) to write a parser for, and that's why Python's tooling hasn't been great in the past.
That said, Ruby is even harder to parse and I don't like it aesthetically either.
I seldom whine about blog posts, but this one clearly qualifies as click-bait. Not because of its otherwise good content, but because it takes me a Google search, instead of a link to the obvious target within the very first sentences.
If you require the use of a GC, you can never be as "fast as C" in the general case, since there will always be specific cases where manual memory management will do something more efficient than what an automatic GC would have done --- just like a car with an automatic transmission can never be quite as efficient as a car with a manual transmission used competently.
If you want to talk about efficient GCed languages, you have many choices, most of which have tooling and mindshare advantages that outweigh whatever tooling advantages your language has.
Really, GCed languages are commodities these days. A lot of people have put a lot of work into the fundamental building blocks, and now people are just combining them in various ways.