First, if you're only interested in the first value, you should probably use a generator expression instead of a list comprehension; otherwise the loop will run over all of mixed_widgets even though most of the results won't be used:
(x for x in mixed_widgets if 'widgets' in x)[0]['widgets']
but this doesn't work, since you can't index ([0]) into a generator expression. No matter: next() returns the first value of any iterator:
next(x for x in mixed_widgets if 'widgets' in x)['widgets']
Also, I think the repetition of x in the 'x for x in' part is a bit ugly; we can fix that by moving the ['widgets'] retrieval inside the generator expression:
next(x['widgets'] for x in mixed_widgets if 'widgets' in x)
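One caveat: next() raises StopIteration if nothing matches the if-clause. It accepts a default as a second argument for that case; a small sketch (the sample data and the None fallback here are just for illustration):

mixed_widgets = [{'abc': 'def'}, {'widgets': 's'}, {'abc': 'def'}]

# With a match, we get the value as before:
print(next((x['widgets'] for x in mixed_widgets if 'widgets' in x), None))  # 's'

# With no match, the default is returned instead of raising StopIteration:
print(next((x['widgets'] for x in [] if 'widgets' in x), None))  # None

Note the extra parentheses around the generator expression - they're required once next() gets a second argument.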
Definitely more readable - it describes in plain English what it's doing, and it's basically the same as mathematical set notation. The main question to me is the performance implication, though. Do generator expressions do the iteration at C level like map, and does that give performance parity? And what about branching - am I correct in assuming that filter is always faster than comprehensions or generator expressions with if-clauses?
I believe that the list comprehension will be faster than filter, but as always, any time you replace readable code with unreadable code for performance reasons, you damn well better time it.
In [11]: timeit.timeit('''list(filter(lambda x : ('widgets' in x), mixed_widgets))[0]['widgets']''', '''nw={'abc': 'def'};w={'widgets':'s'};mixed_widgets=[nw]*100+[w]+[nw]*100''', number=100000)
Out[11]: 2.6956532129988773
In [12]: timeit.timeit('''[x for x in mixed_widgets if 'widgets' in x][0]['widgets']''', '''nw={'abc': 'def'};w={'widgets':'s'};mixed_widgets=[nw]*100+[w]+[nw]*100''', number=100000)
Out[12]: 0.5911771030077944
But not generating the list at all is still going to be faster, since next() stops at the first match instead of scanning the whole list (with bigger gains for bigger data):
In [13]: timeit.timeit('''next(x for x in mixed_widgets if 'widgets' in x)['widgets']''', '''nw={'abc': 'def'};w={'widgets':'s'};mixed_widgets=[nw]*100+[w]+[nw]*100''', number=100000)
Out[13]: 0.3324074839911191
I guess my next question then is: why does anyone use filter/map instead of comprehensions or even generator expressions? Familiarity when coming from FP?
I can see it together with partial, yes, that's when it can become a bit cleaner. Another reason why I use map is when I want to use multiprocessing or multithreading (with IO-heavy functions). But at a fine-grained level of code I find it really hurts readability compared to comprehensions.
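For what it's worth, a rough sketch of both cases (scale and the sample numbers are made up for illustration):

from functools import partial
from multiprocessing import Pool

def scale(factor, x):
    return factor * x

if __name__ == '__main__':
    data = range(10)

    # partial pins the first argument, leaving a one-argument
    # callable that slots naturally into map():
    doubled = list(map(partial(scale, 2), data))

    # The same shape carries over to Pool.map for parallel work:
    with Pool(4) as pool:
        quadrupled = pool.map(partial(scale, 4), data)

    print(doubled)
    print(quadrupled)

Pool.map in particular has no comprehension equivalent, which is the multiprocessing case mentioned above.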
Generally, when the operation I'm applying to each element happens to already be a named function, I find "map(f, seq)" preferable to "(f(x) for x in seq)".
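For example, with an existing named function like str.strip (just a stand-in here):

lines = ['  a\n', '  b\n']

# The named function slots straight into map:
stripped = list(map(str.strip, lines))

# The generator expression adds a throwaway variable for the same result:
stripped = list(line.strip() for line in lines)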
Coming from Ruby it's easier to use map, because it's what Ruby's standard library offers.
However, comprehensions are not that hard once one finally decides to understand how they work. Still not as readable as map(), IMHO. Example:
Ruby
[1, 2, 3].map {|x| x*x} # object.method(args)
vs Python
[x*x for x in [1, 2, 3]]
where we have the function first, then the definition of the variable, then the data. This is the opposite of the object.method OO notation, and using a variable before defining it is not what we usually do. But it's close to the usual mathematical set-builder notation { f(i) : i ∈ S }, with the function at the beginning.
In my experience, filter/map are used by people who just don't know about comprehensions, or aren't used to having them available. It takes some time to start using them where appropriate.
That's actually the talk I was thinking about, but I guess I forgot how exactly Raymond described generator expressions there. Thanks for linking it again!
Related: I wish the list type in Python included an analogue to dict's ".get(key, default)" operation.
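In the meantime, it's easy enough to roll your own (a hypothetical helper, named get here just for illustration, or use the next()-with-default trick from above):

def get(seq, index, default=None):
    # A hypothetical list analogue of dict.get(key, default).
    try:
        return seq[index]
    except IndexError:
        return default

print(get([10, 20, 30], 1))               # 20
print(get([10, 20, 30], 99))              # None
print(get([10, 20, 30], 99, 'fallback'))  # 'fallback'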