Hacker News
Guido on the fate of Lambda, map() and filter(), and reduce() in Python 3000 (artima.com)
28 points by smanek on June 24, 2008 | 33 comments



"filter(P, S) is almost always written clearer as [x for x in S if P(x)]"

Is this widely believed by Python hackers?


If P is a built-in function taking a single argument, or one that's already been defined, no. That's often not the case, though, and using a simple lambda is annoying and usually less efficient the rest of the time.

For example,

    filter(isfile, os.listdir(some_dir))
is cleaner than:

    (f for f in os.listdir(some_dir) if isfile(f))
But

    filter(lambda f: f.endswith('.py'), os.listdir(some_dir))
is worse than:

    (f for f in os.listdir(some_dir) if f.endswith('.py'))
Ruby's blocks are something to be jealous of, in this case. List comprehensions and generator expressions have other merits, though.


I think this is a case where the choice of syntax (and a bit of semantics) affected the choice of API. I'm not a Python programmer, but I can see from your second example why they would prefer comprehensions to filter.

If the language had currying or some convenient partial-application syntax, using 'filter' could be the more natural choice. In Gauche Scheme, pa$ is a built-in partial application, and I'd write:

    (filter (pa$ string-suffix? ".py") (sys-readdir some-dir))
Or in Haskell:

    getDirectoryContents somedir >>= return . filter (isSuffixOf ".py")
In general I'd rather write this in Gauche, for a regexp object is applicable:

    (filter #/\.py$/ (sys-readdir some-dir))
So the more ways you have beyond a bare lambda to create a higher-order function (currying, partial application, promoting objects to something applicable, etc.), the more useful functions such as filter and fold become. If you have to fall back to lambda most of the time, it's natural that using those functions becomes annoying.
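Python's closest stock counterpart is functools.partial (added in 2.5). A minimal sketch, with a hypothetical has_suffix helper whose argument order is chosen so the suffix can be fixed first:

```python
from functools import partial

def has_suffix(suffix, name):
    # hypothetical helper: suffix comes first so partial can fix it
    return name.endswith(suffix)

names = ['a.py', 'b.txt', 'setup.py']
print(list(filter(partial(has_suffix, '.py'), names)))  # ['a.py', 'setup.py']
```

The same trick only works directly when the function happens to take its "configuration" argument first, which is exactly the ergonomic gap the parent comment is pointing at.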


Arc's [] notation would help in the second case.

  filter([_.endswith('.py')], os.listdir(some_dir))
I've found that one of the most important things to eliminate from a program is unnecessary names. Introducing a variable is more than 1 token's worth of conceptual load.
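Python can get partway toward Arc's bracket shorthand with operator.methodcaller (available from Python 2.6), which builds the predicate without ever naming the element; a small sketch:

```python
from operator import methodcaller

# roughly Arc's [_.endswith('.py')]: a callable that invokes the
# named method on whatever it's handed
ends_with_py = methodcaller('endswith', '.py')
print(list(filter(ends_with_py, ['a.py', 'b.txt'])))  # ['a.py']
```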


I wish there were an easy syntax (in Python) not only for map and filter, but for something like reduce/foldr/foldl as well. You still have to write loops for those (or use reduce).
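In the meantime the fold is a library call; a minimal sketch using reduce (a builtin in Python 2, importable from functools since 2.6), with a hypothetical foldr built on top:

```python
from functools import reduce  # builtin in Python 2; from functools in 2.6+/3.x

def foldr(f, acc, seq):
    # no builtin right fold; sketch it as a left fold over the reversed sequence
    return reduce(lambda a, x: f(x, a), reversed(seq), acc)

print(reduce(lambda a, x: a + x, [1, 2, 3, 4], 0))  # 10  (left fold)
print(foldr(lambda x, a: [x] + a, [], [1, 2, 3]))   # [1, 2, 3]
```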


Me too. I think the introduction of the functional module will actually be good, because there's less resistance to adding clever features for higher-order functions in the standard library than there is in the core language (in Python, at least, since it aims to be small and beginner-friendly). Look at the itertools module to see how that turned out -- lots of useful tricks borrowed from the ML family, and you can "buy it by the yard".
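For a taste of the "buy it by the yard" style, a small sketch with a few itertools tools chained over lazy sequences:

```python
from itertools import count, islice, takewhile

# lazily generate squares; slice or cut them off without building lists
print(list(islice((n * n for n in count(1)), 5)))                    # [1, 4, 9, 16, 25]
print(list(takewhile(lambda n: n < 20, (n * n for n in count(1)))))  # [1, 4, 9, 16]
```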


It's a question of habit. Once you're experienced with Python list comprehensions, they become clearer than anything else.

"The main problem I see with Ruby is that the Principle of Least Surprise can lead you astray, as it did with implicit lexical scoping. The question is, whose surprise are you pessimizing? Experts are surprised by different things than beginners. People who are trying to grow small programs into large programs are surprised by different things than people who design their programs large to begin with.

For instance, I think it's a violation of the Beginner's Principle of Least Surprise to make everything an object. To a beginner, a number is just a number. A string is a string. They may well be objects as far as the computer is concerned, and it's even fine for experts to treat them as objects. But premature OO is a speed bump in the novice's onramp." (Larry Wall, http://interviews.slashdot.org/article.pl?sid=02/09/06/13432...)


You're quoting Larry "I've done more than anyone to fuck up programming languages" Wall as an authority on programmer clarity?!? Um... no?

here are some examples from real python (twisted)... let's see how much clearer they are than anything else:

  ''.join([''.join(['\x1b[' + str(n) + ch for n in ('', 2, 20, 200)]) for ch in 'BACD'])
  ''.join([''.join(['\x1b' + g + n for n in 'AB012']) for g in '()'])
  msgs = [os.path.join(b, mail.maildir._generateMaildirName()) for b in ('cur', 'new') for x in range(5)]
It is code like this that makes me run for the hills... Even ignoring the list comprehensions, I use join as an example of how python gets everything backwards...

  ary.join(', ')  # ruby   - oop: tell the array to join its elements with a str
  join(', ', ary) # perl   - fun: join ary with a str
  ', '.join(ary)  # python - wtf: tell the str to join the elements of some other ary?!?
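For what it's worth, the two Python spellings name the same method, and putting join on str is what lets it accept any iterable of strings, not just lists; a quick sketch:

```python
ary = ['a', 'b', 'c']
print(', '.join(ary))              # a, b, c
print(str.join(', ', ary))         # a, b, c  -- same method, called unbound
print('-'.join(c for c in 'xyz'))  # x-y-z  -- works on any iterable of strings
```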


I'm quoting Larry Wall because he said something interesting.

The comprehensions from twisted were not written for clarity so they aren't good examples. The programmer wanted a one-liner to define these constants.


Asking Larry Wall about good OO/language/library design, or PoLS is like asking George Bush (either really) about effective foreign policy or human rights...

Even if they say something interesting, is it valuable information coming from that source? Should it be trusted?


"The comprehensions from twisted were not written for clarity so they aren't good examples"

no... they're real examples.


  string.join(ary, ', ') # python - ok: join ary with a str

The parameters to Python's join function are in a better order than Perl's. This is expected since Perl perverts the concept of function arity.

Also, twisted is not an example of the best Python code.


Yes, they did add that form after a while... (so much for one right way to do things, eh?). It seems that people still prefer the `sep.join(ary)` form, based on the code reading/debugging I've done.

"Better order" is subjective, and I have to disagree in this case. Perl's form isn't limited like Python's, as it takes any number of values after the separator. Much like Ruby's "* arg" (splat arg -- HN's formatting is besting me) or Lisp's &rest. Lisp's (+ ...), (< ...), etc. are lovely because of this property.

Twisted is an example of some of the most popular python code out there... It is representative of real world python and is what drives me away from the language (and has me running screaming away from twisted).


With filter you have to remember that the predicate is the first argument and the list is the second. With the list comprehension the syntactic cues make it obvious.


In some cases it might be, such as when you're combining map and filter. Then the list comprehension reads like pseudocode: include this in the result if these conditions are met. In that example, though, it isn't much clearer. And the 'x for x' seems sort of repetitive.


I don't think the "x for x" is repetitive. It's there for the cases you mention. e.g.

[str(x) for x in range(100) if x % 2 == 0]

VS

map(lambda x: str(x), filter(lambda x: x % 2 == 0, range(100)))


Instead of lambda x: str(x), I'd just write str.

I think this, and the P/S example, are both symptomatic of how Python (anthropomorphized) wants you to think about functions: they're things you stick parens after and call, not things you pass around on their own.


I still think the list comprehension is clearer than:

map(str, filter(lambda x: x % 2 == 0, range(100)))


I find comprehensions to be especially useful in two situations:

1. Transforming a list using some non-standard function, often with some basic input checking:

    [int(x)*2+1 for x in seq if x]
(Here, each x is assumed to be a string that may be empty; the if x clause skips the empty ones before int() sees them)

2. Transforming elements from multiple lists together:

    ['Iter %d took %0.2f secs' % (i, t) for i, t in zip(iters, times) if t > 1.5]
(Where we want output strings only if some iteration took a long time)

Although I'm not too fluent in lisp, I think the equivalent formulations might be longer (since you'd have to write lambda, map, filter and many parens), and perhaps harder to decipher.


Beat this short Arc snippet using map with a list comprehension:

  (def vector-dot-product (vector-1 vector-2)
    (map [* _1 _2] vector-1 vector-2))
That's from Light Makes Right, the raytracer, I think. Equivalent Python:

  def vectorDotProduct(vector1, vector2):
    return [x * y for x, y in zip(vector1, vector2)]
I'm pretty sure that the Lisp snippet above is neither harder to decipher nor longer. Though maybe that's just because Arc is nice.


Agreed. This example highlights the great savings of the [... _ ...] syntax in Arc (which I'd imagine is a fairly common use-case, especially for map, filter, etc.).

Although ironically in this case, I'd probably use the following:

    def vectorDotProduct(vector1, vector2):
        return map(op.mul, vector1, vector2)
(where op has been imported previously using:

    import operator as op
)

I'm not really sure how idiomatic this is, but I find that 90% of the time when I use map in python, it's with the operator module.
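As an aside, the snippets above compute the elementwise products but not the sum, so they're not quite a dot product yet; the full thing is one more operator call away. A sketch in the same style, with a hypothetical dot helper:

```python
import operator as op
from functools import reduce

def dot(v1, v2):
    # sum of elementwise products, built entirely from the operator module
    return reduce(op.add, map(op.mul, v1, v2), 0)

print(dot([1, 2, 3], [4, 5, 6]))  # 32
```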


in Haskell (and maybe ML), you might have

vectorDotProduct = map (*)


You probably meant:

vectorDotProduct = zipWith ( * )

(and this doesn't even include the sum)

map ( * ) has type Num a => [a] -> [a -> a] which is not really what the dot product should be.


I think the idea is that [x for x in S if P(x)] is more general; it can do more than just filter. Knowing about filter and how it works will take up a small part of your mental capacity (that could be used for something else, perhaps), and all it can do is filter. Why learn two ways to do the same thing, when one way is more general?
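Concretely, the single comprehension subsumes both functions; a small sketch:

```python
S = range(10)

# map + filter composed:
a = list(map(lambda x: x * x, filter(lambda x: x % 2 == 0, S)))

# one comprehension, same result:
b = [x * x for x in S if x % 2 == 0]

print(a == b, b)  # True [0, 4, 16, 36, 64]
```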

I think that's what Python hackers think, anyway. I use Perl :P


I've often wished that the "x for" could be optional, but yes.

It's like bar charts and tables. Tables communicate more information, but we're visual creatures and thus prefer bar charts.


yes.


I think the problem with list comprehensions is that they're not as flexible as combining map() and filter(). For example,

  [F(x) for x in S if P(x)]
is equivalent to

  map(F, filter(P, S))
but what about the equivalence of

  filter(P, map(F, S))? 
Without using map() or filter(), this is the solution that came to my mind:

  [x for x in [F(y) for y in S] if P(x)]
Pretty ugly, huh?


[F(x) for x in S if P(F(x))]


That means you have to compute F(x) TWICE, which is sometimes inefficient (when F(x) costs too much).
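One way to evaluate F only once per element without naming an intermediate list is to nest a generator expression inside the comprehension, which also stays lazy; a sketch with hypothetical F and P:

```python
def F(x):
    return x * x   # stand-in for an expensive transform

def P(y):
    return y > 10  # stand-in for the post-transform predicate

S = [1, 2, 3, 4, 5]

# the inner generator yields F(x) one at a time; F runs once per element
result = [y for y in (F(x) for x in S) if P(y)]
print(result)  # [16, 25]
```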


wait a minute, you can't just expect python primitives to vectorize? And people live without any(), all(), etc?

It's been a long time since I primarily wrote Python, and I guess R has made me lazy, but seriously: WTF?

Maybe hold off on Py3K until this is done?


any() and all() are available in Python 2.5.
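And they short-circuit over generator expressions, so nothing is built up front; a quick sketch:

```python
nums = [3, 7, 8, 12]
print(any(n % 2 == 0 for n in nums))  # True  (stops at the first even number)
print(all(n > 0 for n in nums))       # True
print(all(n % 2 == 0 for n in nums))  # False
```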


cool, I can't believe I haven't had occasion to use them yet.

thank you


numpy supports vectorization of most elementary functions



