How does its parallelization feature work with lazy generators? Or does its seq ...

entilzha · on Dec 19, 2017

Seq doesn't return a generator. At its core is the concept of lineage, borrowed from Apache Spark. Essentially, when you do something like seq(data).map(func), it appends to a list of operations to execute and holds a copy of the base data. Only when asked for a value does it compute the result.
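
To make the lineage idea concrete, here is a rough sketch of how such a structure could work. This is not the library's actual code: `LazySeq`, its methods, and `noisy_double` are hypothetical names made up for illustration.

```python
class LazySeq:
    """Toy lineage-based sequence (hypothetical, not the library's class)."""

    def __init__(self, data, lineage=()):
        self._data = list(data)          # copy of the base data
        self._lineage = tuple(lineage)   # recorded operations, not yet applied

    def map(self, func):
        # Record the operation; nothing is computed here.
        return LazySeq(self._data, self._lineage + (("map", func),))

    def filter(self, pred):
        return LazySeq(self._data, self._lineage + (("filter", pred),))

    def to_list(self):
        # Only now is the lineage replayed over the base data.
        items = iter(self._data)
        for kind, func in self._lineage:
            items = map(func, items) if kind == "map" else filter(func, items)
        return list(items)


calls = []

def noisy_double(x):
    calls.append(x)  # track when the function really runs
    return x * 2

pipeline = LazySeq([1, 2, 3]).map(noisy_double).filter(lambda x: x > 2)
print(calls)               # still empty: nothing has been computed
print(pipeline.to_list())  # the lineage runs here
```

The point is just that building the pipeline and evaluating it are separate steps, which is different from handing back a one-shot generator.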

So for parallelization, operations that are embarrassingly parallel (map, filter, etc.) are parallelized with multiprocessing. If the data is heavy and serialization is expensive, it might be slow, but for operations where the bulk of the work is done in parallel, it can help a lot.
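
As a minimal sketch of that trade-off using only the standard library (not the library's own implementation), a parallel map with `multiprocessing.Pool` might look like the following. `slow_square` and `parallel_map` are hypothetical names, and the worker count of 2 is an arbitrary assumption.

```python
from multiprocessing import Pool


def slow_square(x):
    # Stand-in for CPU-heavy work. It is defined at module level
    # so that it can be pickled and sent to worker processes.
    return x * x


def parallel_map(func, data, processes=2):
    # Every input is pickled and shipped to a worker, and every result
    # is pickled and shipped back. That serialization is the overhead
    # that can dominate when items are large and func is cheap.
    with Pool(processes=processes) as pool:
        return pool.map(func, data)


if __name__ == "__main__":
    print(parallel_map(slow_square, range(10)))
```

With a function this cheap the parallel version will likely be slower than a plain `map`; it only pays off when the per-item work outweighs the cost of moving the data between processes.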