First, if you're only interested in the first value, you should probably use a generator expression instead of a list comprehension; otherwise the loop will run over all of mixed_widgets even though most of the results won't be used:
(x for x in mixed_widgets if 'widgets' in x)[0]['widgets']
but this doesn't work, since you can't index ([0]) into a generator expression. No matter: next() returns the first value of any iterator:
next(x for x in mixed_widgets if 'widgets' in x)['widgets']
Also, I think the repetition of x in the 'x for x in' part is a bit ugly; we can fix that by moving the ['widgets'] retrieval inside the generator expression:
next(x['widgets'] for x in mixed_widgets if 'widgets' in x)
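One caveat: next() raises StopIteration if nothing matches the if-clause. It accepts a default as a second argument for that case; a small sketch (the sample data and the None fallback here are just for illustration):

mixed_widgets = [{'abc': 'def'}, {'widgets': 's'}, {'abc': 'def'}]

# With a match, we get the value as before:
print(next((x['widgets'] for x in mixed_widgets if 'widgets' in x), None))  # 's'

# With no match, the default is returned instead of raising StopIteration:
print(next((x['widgets'] for x in [] if 'widgets' in x), None))  # None

Note the extra parentheses around the generator expression - they're required once next() gets a second argument.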
Definitely more readable - it describes in plain English what it's doing, and it's basically the same as mathematical set notation. The main question to me is the performance implication, though. Do generator expressions do the iteration at C level like map, and does that give performance parity? And what about branching - am I correct in assuming that filter is always faster than comprehensions or generator expressions with if-clauses?
I believe that the list comprehension will be faster than filter, but as always, any time you replace readable code with unreadable code for performance reasons, you damn well better time it.
In [11]: timeit.timeit('''list(filter(lambda x : ('widgets' in x), mixed_widgets))[0]['widgets']''', '''nw={'abc': 'def'};w={'widgets':'s'};mixed_widgets=[nw]*100+[w]+[nw]*100''', number=100000)
Out[11]: 2.6956532129988773
In [12]: timeit.timeit('''[x for x in mixed_widgets if 'widgets' in x][0]['widgets']''', '''nw={'abc': 'def'};w={'widgets':'s'};mixed_widgets=[nw]*100+[w]+[nw]*100''', number=100000)
Out[12]: 0.5911771030077944
But not generating the list at all is still going to be faster, since next() stops at the first match instead of scanning the whole list (with bigger gains for bigger data):
In [13]: timeit.timeit('''next(x for x in mixed_widgets if 'widgets' in x)['widgets']''', '''nw={'abc': 'def'};w={'widgets':'s'};mixed_widgets=[nw]*100+[w]+[nw]*100''', number=100000)
Out[13]: 0.3324074839911191
I guess my next question then is: why does anyone use filter/map instead of comprehensions or even generator expressions? Familiarity when coming from FP?
I can see it together with partial, yes, that's when it can become a bit cleaner. Another reason why I use map is when I want to use multiprocessing or multithreading (with IO-heavy functions). But at a fine-grained level of code I find it really hurts readability compared to comprehensions.
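For what it's worth, a rough sketch of both cases (scale and the sample numbers are made up for illustration):

from functools import partial
from multiprocessing import Pool

def scale(factor, x):
    return factor * x

if __name__ == '__main__':
    data = range(10)

    # partial pins the first argument, leaving a one-argument
    # callable that slots naturally into map():
    doubled = list(map(partial(scale, 2), data))

    # The same shape carries over to Pool.map for parallel work:
    with Pool(4) as pool:
        quadrupled = pool.map(partial(scale, 4), data)

    print(doubled)
    print(quadrupled)

Pool.map in particular has no comprehension equivalent, which is the multiprocessing case mentioned above.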
Generally, when the operation I'm applying to each element happens to already be a named function, I find "map(f, seq)" preferable to "(f(x) for x in seq)".
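For example, with an existing named function like str.strip (just a stand-in here):

lines = ['  a\n', '  b\n']

# The named function slots straight into map:
stripped = list(map(str.strip, lines))

# The generator expression adds a throwaway variable for the same result:
stripped = list(line.strip() for line in lines)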
Coming from Ruby it's easier to use map, because it's what Ruby's standard library offers.
However, comprehensions are not that hard once one finally decides to understand how they work. Still not as readable as map(), IMHO. Example:
Ruby
[1, 2, 3].map {|x| x*x} # object.method(args)
vs Python
[x*x for x in [1, 2, 3]]
where we have the function first, then the definition of the variable, then the data. This is the opposite of the object.method OO notation, and using a variable before defining it is not what we usually do. But it's close to the usual mathematical set-builder notation { f(i) : i ∈ S }, with the function at the beginning.
In my experience, filter/map are used by people who just don't know about comprehensions, or aren't used to having them available. It takes some time to start using them where appropriate.
That's actually the talk I was thinking about, but I guess I forgot how exactly Raymond described generator expressions there. Thanks for linking it again!
Related: I wish the list type in Python included an analogue to dict's ".get(key, default)" operation.
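In the meantime, it's easy enough to roll your own (a hypothetical helper, named get here just for illustration, or use the next()-with-default trick from above):

def get(seq, index, default=None):
    # A hypothetical list analogue of dict.get(key, default).
    try:
        return seq[index]
    except IndexError:
        return default

print(get([10, 20, 30], 1))               # 20
print(get([10, 20, 30], 99))              # None
print(get([10, 20, 30], 99, 'fallback'))  # 'fallback'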