Ask HN: How dangerous is Clojure's immutability assumption?

cemerick · on March 6, 2010

After using Clojure as my primary environment for ~18 months, and leaning heavily on existing Java libraries for a lot of foundational stuff, I'd say that this is a non-issue. Of course, you can get yourself into a lot of trouble in any environment, but (at least for me and other experienced Clojure programmers I've worked with and whose results I've looked at) it's rarely nonobvious where unrestrained (i.e. Java-related) mutability is in the mix -- and in those areas, you take all the usual precautions that you would if you were using those mutable libraries in Java.

The upshot of this is that if you're using a Java library, you'll generally want to either:

(a) build a Clojure wrapper API so as to enforce some sane semantics on it (see clojure.contrib.http.agent for a good example of this, where HTTP interactions are wrapped in Clojure agents and a good set of convenience functions that make working with the JDK's HttpURLConnection and friends way more pleasant than usual), or

(b) confine the usage of key Java libraries in such a way that there's a clear line of demarcation between Clojure's mutability and concurrency semantics and the free-for-all in the rest of Java. This is where the big win is in programming Swing interfaces, for example, where your core data model would ideally be implemented using persistent data structures and Clojure's reference objects to ensure sane concurrency semantics, and you take all the usual precautions when touching the Swing APIs.
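As a rough Java analogue of option (b), and not how Clojure's reference types are actually implemented, the sketch below keeps the model as an unmodifiable list behind a single AtomicReference, so every update swaps in a new snapshot. The class name ModelHolder and its methods are made up for illustration, and List.of / List.copyOf assume Java 10 or later.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.atomic.AtomicReference;

// Hypothetical holder: the only mutable thing is the reference itself.
class ModelHolder {
    private final AtomicReference<List<String>> state =
            new AtomicReference<>(List.of());

    // Readers get an unmodifiable snapshot that never changes under them.
    List<String> current() {
        return state.get();
    }

    // Writers build a new list from the old one and swap it in atomically.
    List<String> addItem(String item) {
        return state.updateAndGet(old -> {
            List<String> next = new ArrayList<>(old);
            next.add(item);
            return List.copyOf(next);
        });
    }

    public static void main(String[] args) {
        ModelHolder model = new ModelHolder();
        List<String> before = model.current();
        model.addItem("row-1");
        System.out.println(before);          // prints []
        System.out.println(model.current()); // prints [row-1]
    }
}
```

Anything that touches Swing (or any other mutable API) would read from current() and stay on its own side of the line.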

Estragon · on March 7, 2010

Thanks for the advice. Your approach sounds simple enough to follow, if you pursue it from the start.

dons · on March 6, 2010

In 10 years I've had maybe 5 bugs in Haskell caused by foreign language code mutating objects under the hood, which broke the referential transparency guarantees of the Haskell code.

I would not consider this "dangerous". It's a side condition you'll need to check. The language can make this more or less easy to establish.

Typically it looks like your value is changing under the hood. It's relatively easy to debug -- since the result is so unexpected.

It's rare in Haskell. I imagine it is a bit more common in Clojure, where they rely more on Java code than Haskell does on C code.

I don't believe Clojure is an optimizing compiler -- it's not doing any optimizations based on static guarantees of referential transparency -- so that simplifies the issue. If the compiler can't guarantee purity, there's less it can do to your code to take advantage of that, so there are fewer unusual semantics.

swannodette · on March 6, 2010

Clojure isn't an optimizing compiler because the JVM is the optimizing compiler. Clojure datastructures are all declared static final which allows the JVM to work a lot of magic.

I've been using Clojure for a year and a half and have never run into such a bug. You learn quickly that you lose all of the benefits of persistent datastructures if you're putting mutable objects into them.

As a side note, Rich Hickey has been working on a project called "cells" which allows you to use even unsafe mutable Java objects with the same safe concurrency guarantees as Clojure persistent data structures.

Confusion · on March 6, 2010

If I understand the problem correctly, then it's much like a problem that plagues most hashmap/associative array implementations: when an object that is used as a key is modified (without removing and re-adding it around the modification), you'll often not be able to find it again.

It takes some debugging to find this is happening, but in five years of writing Java and Python, I've only had it happen once in either language.
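A minimal Java sketch of that failure mode, assuming a java.util.HashMap keyed by a mutable ArrayList (the class and method names are illustrative only):

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class MutableKeyDemo {
    // Puts a value under a mutable key, mutates the key in place,
    // then tries to look the value up again with the same key object.
    static String lookupAfterMutation() {
        Map<List<String>, String> map = new HashMap<>();
        List<String> key = new ArrayList<>();
        key.add("a");
        map.put(key, "value");

        key.add("b"); // ArrayList.hashCode() depends on contents, so it changes here

        return map.get(key);
    }

    public static void main(String[] args) {
        // The entry is still in the map, but it was filed under the old hash.
        System.out.println(lookupAfterMutation()); // prints null
    }
}
```

The entry has not gone anywhere; it just cannot be reached through get() any more, which is why the symptom is confusing the first time you see it.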

mark_l_watson · on March 6, 2010

I use a lot of my (sometimes ancient) Java code with both Clojure and Scala. So far I have always written wrappers that copy Java data into Clojure or Scala 'native' data types. If you do this then there is little chance of having problems like those you are concerned about. It might be more efficient to use Java types, but not worth the hassle in most cases. (I also use (J)Ruby a lot, and I have found the secret to happy use of (J)Ruby is to give up the desire for good run time performance :-)
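The same copy-at-the-boundary idea can be sketched in plain Java. This is only an illustration: snapshot is a hypothetical wrapper name, List.copyOf assumes Java 10 or later, and the copy is shallow, so it only helps if the elements themselves are immutable.

```java
import java.util.ArrayList;
import java.util.List;

class SnapshotDemo {
    // Hypothetical wrapper: copy the legacy list into an unmodifiable snapshot
    // so later changes to the original cannot leak through.
    static List<String> snapshot(List<String> legacy) {
        return List.copyOf(legacy);
    }

    public static void main(String[] args) {
        List<String> legacy = new ArrayList<>(List.of("x", "y"));
        List<String> safe = snapshot(legacy);

        legacy.add("z"); // the old code keeps mutating its own list

        System.out.println(safe); // prints [x, y]
    }
}
```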

jrockway · on March 6, 2010

> I don't want to start a language war, here.

Then it's probably not a good idea to use the word "horrendous". Your commentary is not what would start a language war; your tone is.

barrkel · on March 6, 2010

Hash functions should generally hash based on a value's identity. Mutable objects passed around by reference have an identity independent of their value; mutable objects passed around by value, on the other hand, change their identity when they're modified.

For example, one list isn't equal to another list, even if it has the same elements, if modifying one list doesn't modify the other. If they're not the same, then they shouldn't compare as equal.

This is one reason I think Java's implementation of hashCode() on collection classes isn't very smart. I think .NET gets it right, having GetHashCode() return a consistent value for mutable collections. (Similar comments apply to the corresponding equality operation.)

But mutable objects passed around by value are bad for other reasons, such as the risk of modifying copies when you think you're modifying an underlying value.
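To make the contrast concrete, here is a small Java sketch (names are illustrative). It shows that two distinct ArrayLists with the same contents are equal and share a hashCode, and that java.util.IdentityHashMap, which keys on reference identity, still finds an entry after the key has been mutated.

```java
import java.util.ArrayList;
import java.util.IdentityHashMap;
import java.util.List;
import java.util.Map;

class IdentityKeyDemo {
    // Java's List contract: equality and hashCode are based on contents.
    static boolean equalContentsCollide() {
        List<String> first = new ArrayList<>(List.of("a"));
        List<String> second = new ArrayList<>(List.of("a"));
        return first != second
                && first.equals(second)
                && first.hashCode() == second.hashCode();
    }

    // IdentityHashMap uses == and System.identityHashCode, so mutating
    // the key's contents does not affect the lookup.
    static String identityLookupAfterMutation() {
        Map<List<String>, String> byIdentity = new IdentityHashMap<>();
        List<String> key = new ArrayList<>();
        key.add("a");
        byIdentity.put(key, "value");
        key.add("b");
        return byIdentity.get(key);
    }

    public static void main(String[] args) {
        System.out.println(equalContentsCollide());        // prints true
        System.out.println(identityLookupAfterMutation()); // prints value
    }
}
```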

jacquesm · on March 6, 2010

It sounds like it should be relatively easy to write a function that checks, in case of doubt, whether the contents of any of the objects have changed since their hashes were taken.

You could enable something like that during the debugging phase of your development to get 'peace of mind' that such behaviour is not the source of any bugs.
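One way such a debugging check might look in Java is sketched below. HashDriftChecker and its methods are hypothetical names, not an existing library. It assumes the tracked objects have a content-based hashCode(), and it will miss any change that happens to leave the hash unchanged. It also holds strong references to everything it tracks, so it is meant for debug builds only.

```java
import java.util.ArrayList;
import java.util.IdentityHashMap;
import java.util.List;
import java.util.Map;

class HashDriftChecker {
    // Tracked object (compared by identity) -> hashCode at registration time.
    private final Map<Object, Integer> recorded = new IdentityHashMap<>();

    // Remember the object's current hash.
    void track(Object o) {
        recorded.put(o, o.hashCode());
    }

    // Return every tracked object whose hashCode no longer matches.
    List<Object> findMutated() {
        List<Object> mutated = new ArrayList<>();
        for (Map.Entry<Object, Integer> e : recorded.entrySet()) {
            if (e.getKey().hashCode() != e.getValue()) {
                mutated.add(e.getKey());
            }
        }
        return mutated;
    }

    public static void main(String[] args) {
        HashDriftChecker checker = new HashDriftChecker();
        List<String> stable = new ArrayList<>(List.of("a"));
        List<String> changed = new ArrayList<>(List.of("a"));
        checker.track(stable);
        checker.track(changed);

        changed.add("b"); // mutate one of the two after tracking

        System.out.println(checker.findMutated().size()); // prints 1
    }
}
```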

jganetsk · on March 7, 2010

This is a problem in Java too. Any object can mutate, effectively breaking any ordered collections that hold it.
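A small Java illustration of that, assuming a TreeSet ordered by a mutable field. The Item class and its rank field are made up for the example, and the exact outcome depends on the insertion order used here.

```java
import java.util.Comparator;
import java.util.TreeSet;

class MutatedOrderDemo {
    // Hypothetical element type whose sort key can be changed in place.
    static final class Item {
        int rank;
        Item(int rank) { this.rank = rank; }
    }

    // Adds three items, then changes the rank of one that is already in the set.
    static boolean containsAfterMutation() {
        TreeSet<Item> set = new TreeSet<>(Comparator.comparingInt((Item i) -> i.rank));
        Item low = new Item(1);
        set.add(new Item(2));
        set.add(low);
        set.add(new Item(3));

        low.rank = 10; // the tree still stores 'low' where rank 1 belonged

        // The search now walks toward where rank 10 would be and misses it.
        return set.contains(low);
    }

    public static void main(String[] args) {
        System.out.println(containsAfterMutation()); // prints false
    }
}
```

The element is still in the set and still shows up when iterating, but lookups and ordering can no longer be trusted.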