I think the signature of the map function is now the standard example of Scala b...

DannoHung · on June 1, 2012

Yeah, this is really a consequence of Scala trying to do much shit for you automatically with the map function. See, it's not just a map function, but also an auto-conversion function from any generic sequence to any other generic sequence.

What the hell does that mean? Okay, so basically, take your standard map function. Run it on the structure you're mapping over. Great, now you've got your updated structure! Now, you don't just want to return the same type of structure you just had, you also want it to be a different structure, but with those modified elements you had for its contents. Now build that new structure from the thing you just modified (Scala may actually optimize the traversal and construction of the new collection into one pass, although I am not positive). And there's your return value.

It actually gets more complicated due to the Liskov principle and the way that Java handles type erasure, but thinking about that shit just makes me fuckin' tired.

You want to know what the map type signature should look like if the Scala guys weren't going overboard on trying to make things seem outwardly simple?

    class Functor f where 
      fmap :: (a -> b) -> f a -> f b

quatrevingts · on June 1, 2012

It does indeed require some explanation to understand why this signature was chosen over a "simpler" signature that might have provoked less mockery from the interwebs. But it's not really that difficult, and it results in a collections library that's better than C#, Haskell, or Python. And I don't know about you, but I use collections a lot.

duaneb · on June 1, 2012

So, coming from haskell and java, that snippet looks like a nightmare. How is that better than `map :: (a -> b) -> a -> b`?

quatrevingts · on June 1, 2012

Well, if you're using Java, you don't have map in the JDK. You can use Guava (or write your own) to get Iterables.transform(). But then, anything you map turns into an Iterable. So often you have to iterate over the whole thing once again to get the data structure you actually need. You could write Collections.map(), Lists.map(), Sets.map(), etc, until you get tired. (Guava does not include these.)

But at this point you're still way behind Scala. What about flatMap, filter, slice, drop, scan, etc? Every collection in Scala has a huge set of methods that work on it. But they're not implemented again and again, because the type system is powerful enough to prove that the generic code that works for Set can be used for List. In contrast, every Java program is full of innumerable primitive operations expressed again and again, so you have to keep reading the steps of every for loop to determine, "oh, this one is doing forall. This one is map" rather than simply reading "forall", "map". Which is why, thanks to the above complexity, Scala can be much more readable than Java.

What about Haskell? The existing standard library is fairly weak on collections. There are not that many implementations, and there's no typeclass that quite captures the idea of a collection: you've got Foldable, Functor, Monad, and possibly some others, which all get slightly different aspects of it. Small typeclasses are nice in that they allow more instances, but often you need some functions that aren't there. There's no typeclass which captures the size() function, for instance. That's mostly a library design issue, which is more tractable than a language issue, but still pretty hard to fix.

There's one additional cool feature of Scala collections which I haven't touched on yet: transformations can result in collections that are different from the original collection type, and the most specific collection that is compatible is used. What does that mean? Say I map a TreeSet from an ordered element type to an unordered type. The result type can no longer be stored in a TreeSet, so you get a more generic Set collection instead. I'm curious what it would take to implement this in Haskell.

duaneb · on June 1, 2012

So ignoring Java and concentrating entirely on expressiveness & ignoring readability, maintainability, etc... It seems like Collections are the entire kitchen sink of algorithms that map across iterable content. How is that better than more precise types (functors, monoids, monads, foldables)?

For example: what is the benefit of reducing these data structures (and associated algorithms which operate on them) to more generic versions?

> There's one additional cool feature of Scala collections which I haven't touched on yet: transformations can result in collections that are different from the original collection type, and the most specific collection that is compatible is used. What does that mean? Say I map a TreeSet from an ordered element type to an unordered type. The result type can no longer be stored in a TreeSet, so you get a more generic Set collection instead. I'm curious what it would take to implement this in Haskell.

While I agree that this is "cool" that you can do that, I really don't see the use of it except when sketching out ideas when you're not sure what kind of data flow you want yet... but that seems like a problem with designing the program and should be avoided.

wandermatt · on June 1, 2012

In Scala 2.7 the signature for map, defined in Iterable[+A], was def map[B](f : (A) => B) : Iterable[B]

That seems fairly close to the Haskell signature, although somewhat less readable. However, in Scala 2.8, things got more complicated with the addition of the implicit. Consider two functions that I might apply to a collection of integers:

Set(1,2,3,4).map{ _2 } // Set[Int] = Set(2, 4, 6, 8)

Set(1,2,3,4).map{ _.toString } // Set[String] = Set(1, 2, 3, 4)

Nothing surprising here, we provide a method A->B and use it to map from Set[A] -> Set[B] Now lets use the optimized BitSet collection, which efficiently stores integers:

BitSet(1,2,3,4).map{ _2 } // BitSet = BitSet(2, 4, 6, 8)

BitSet(1,2,3,4).map{ _.toString } // Set[String] = Set(2, 4, 3, 1)

Notice that the collection type is BitSet for the function with return type of Int, but plain Set for the function with return type of String. This functionality is brought to you through the implicit parameter.

Natsu · on June 1, 2012

An asterisk got eaten in your post. You should start lines of code with a couple of spaces to make HN stop doing that.

  Set(1,2,3,4).map{ _ * 2 } // Set[Int] = Set(2, 4, 6, 8)

  BitSet(1,2,3,4).map{ _ * 2 } // BitSet = BitSet(2, 4, 6, 8)

zem · on June 1, 2012

i've never used scala, but it seems to me that that signature is simply ensuring that the type of map is as flexible as possible, and that actual specializations will be inferred by the compiler without the user having to actually specify any part of that.

michaelochurch · on June 1, 2012

That's correct.

Drbble · on June 1, 2012

Now look at the Java Generics FAQ.