
Yes, reducers provide that benefit to sequential reduce, independent of the parallelism of fold.
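
To make that concrete, here's a minimal sketch (mine, not the parent's, assuming clojure.core.reducers is aliased as r and v is a vector): the same reducer pipeline can feed either a sequential reduce or a parallel fold.

  (require '[clojure.core.reducers :as r])

  (def v (vec (range 1000000)))

  ;; Sequential: one pass over v, no intermediate seqs built.
  (reduce + (r/filter even? (r/map inc v)))

  ;; Parallel: same pipeline, but fold splits the vector into
  ;; chunks, reduces each, and combines the partial sums with +.
  (r/fold + (r/filter even? (r/map inc v)))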



So then, what downsides, if any (I'm sure there are some), would there be to making this reducers model the "default"?


When you call a reducer like r/map or r/filter, the result is reducible but not seqable. So, if you have an incremental algorithm, or are calling first etc., the seq-based fns are a better fit. Also, the lazy fns automatically cache, whereas reducers recalculate on every reduce until you save the realized result. They are genuinely complementary in many ways.
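
A quick sketch of that first point (my illustration, assuming clojure.core.reducers is aliased as r): the result of r/map can be reduced but not treated as a seq.

  (require '[clojure.core.reducers :as r])

  (def xs (r/map inc [1 2 3]))

  (reduce + xs)        ;=> 9, reducibles work fine with reduce
  ;; (first xs)        ;; throws -- the reducer result is not seqable
  (first (into [] xs)) ;=> 2, after realizing it into a collection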


1) What is the benefit of recalculating on every reduce? Is this so you can use side effects?

2) If seq or first is called on a reducible, wouldn't it be easy to just implicitly realize the reducible into a sequence first?


1) The benefit is that you don't have to cache the results in a data structure, which really slows it down. Suppose you map the function (fn [x] (+ x 1)) over a reducible, and then you sum it by reducing it with +. With reducibles, there is no intermediate allocation, and it will run really fast especially if the functions can be inlined. Compare this with mapping and then folding a lazy seq: map builds an intermediate data structure, and reduce immediately consumes it.
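
A rough sketch of the two shapes being compared (mine, assuming clojure.core.reducers is aliased as r):

  (require '[clojure.core.reducers :as r])

  (def coll (vec (range 1000000)))

  ;; Lazy-seq version: map builds an intermediate (chunked) seq,
  ;; which reduce then consumes and discards.
  (reduce + (map (fn [x] (+ x 1)) coll))

  ;; Reducer version: r/map just composes the reducing function,
  ;; so the reduce effectively runs (+ acc (+ x 1)) directly,
  ;; with no intermediate collection allocated.
  (reduce + (r/map (fn [x] (+ x 1)) coll))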

2) That's possible, but it makes it too easy to write code with abysmal performance because of (1). The common case is that you call both first and rest on the reducible. If both turn the reducible into a seq first, then both will take O(n) time in the best case (might be much worse depending on how the reducible was built up). Combine that with the fact that most times, you're going to recurse on the rest, and you've got an O(n^2) algorithm where you expected O(n), if everything is based on reducibles. Additionally, it's impossible to take the first or rest of an infinite reducible (well, perhaps you could do it with exceptions -- in general you can turn a reducible into a seq with continuations).
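
To sketch why (my illustration, with hypothetical helper names): the only way to get elements out of a reducible is to run a reduce over it, so a "first" needs an early-terminating reduce, and a "rest" still re-traverses from the start on every use.

  (require '[clojure.core.reducers :as r])

  ;; Hypothetical helper: first element of a reducible, using
  ;; `reduced` to stop the reduce after one element.
  (defn reducible-first [xs]
    (reduce (fn [_ x] (reduced x)) nil xs))

  ;; Hypothetical helper: the "rest" is just another reducible that
  ;; skips one element; any reduce over it still starts from the
  ;; beginning, so recursing on first/rest does O(n) work per step.
  (defn reducible-rest [xs]
    (r/drop 1 xs))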


In the reducible model as described, it's not possible to implement some operations efficiently, like map on multiple collections (this is called zipWith in Haskell). Also, if you filter/map/etc a function over a reducible and then use the resulting reducible twice, it will cause the filter/map/etc to be performed twice. This can lead to performance surprises, for example it can turn a linear time algorithm into an exponential time algorithm in some cases. Thus this is not a drop-in replacement for seqs, although it can probably cover most uses more efficiently.
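
A small sketch of the re-computation pitfall (mine, with a made-up expensive-step function standing in for real work, and r aliased to clojure.core.reducers):

  (require '[clojure.core.reducers :as r])

  ;; Stand-in for some costly pure function (hypothetical).
  (defn expensive-step [x]
    (reduce + (range x)))

  (def ys (r/map expensive-step (vec (range 10000))))

  ;; Each reduce below re-runs expensive-step over all 10000
  ;; elements, because ys is a recipe, not a realized collection.
  (reduce + ys)
  (reduce max 0 ys)

  ;; Realizing once avoids the recomputation, at the cost of an
  ;; intermediate collection.
  (def ys* (into [] ys))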





