Functional Programming with Python (kachayev.github.io)
107 points by llambda on Aug 3, 2013 | hide | past | favorite | 91 comments



Look, I love functional programming. I use it all the time and I procrastinate in the shower about the best way to add currying to all my favorite languages. That being said, even after all these years, I still find this:

    expr, res = "28+32+++32++39", 0
    for t in expr.split("+"):
        if t != "":
            res += int(t)
    
    print res 
SO MUCH more readable than this:

    from operator import add
    expr = "28+32+++32++39"
    print reduce(add, map(int, filter(bool, expr.split("+"))))
A couple years ago I created a library called Moka in python to make functional and nesting easier. That would be written this way:

    (Moka(expr)
       .do(string.split, '+')
       .keep(lambda x: x != "")
       .map(int)
       .reduce(moka.add))
Still, the imperative version, although it has state and feels hacky for a functional hacker, is still pretty damn clear and easy to read.


Funny. I've been mostly exposed to imperative programming (learned on BASIC and Pascal), but nowadays I find functional "style" much more intuitive.

The problem with loops and control structures is that they don't tell, at a glance, what data structure you're operating on and what transformations you're doing. It gets exponentially worse with else clauses and nesting. It just feels bureaucratic to program like that.

In a functional style, you have more expressive primitives to work with (map, reduce, filter, fold, ...). That turns the code more concise by focusing on the data structures rather than variable mangling.

That said, I find the syntax of most languages (even functionally oriented ones) a chore. This is an area that could improve a lot.


There is something to be said for functional code which clearly shows the operations being performed in (reverse) order. In Haskell:

    (putStrLn . show . sum . map read . filter (not . null) . split '+') "28+32+++32++39"
or, if you like list comprehensions,

    (putStrLn . show . sum)   [read x | x <- (split '+' "28+32+++32++39"), not (null x)]
But don't underestimate the power of syntax! It's much easier to see the structure of your (d0m's) first example than of the second at a glance. I won't comment on the readability of my Haskell code as I'm already far too conditioned.

(Note that I had to define 'split' by hand; it is a gaping hole in the otherwise very comprehensive Data.List library.)


>(Note that I had to define 'split' by hand; it is a gaping hole in the otherwise very comprehensive Data.List library.)

Have you looked at Data.List.Split (`cabal install split`)? splitOn from that package should do what you want. Admittedly, this should be part of Data.List in the standard library.


I personally find Haskell's syntax a bad example in language syntax. It's super powerful, but with too many inconsistencies, idiosyncrasies and noise. That, on top of having to wrap your head around new or difficult concepts, makes it a chore. Most code I (try to) read is intractable.


> I personally find Haskell's syntax a bad example in language syntax. It's super powerful, but with too many inconsistencies, idiosyncrasies and noise.

Interesting. Could you give an example? I find Haskell to have pretty much the most consistent, non-idiosyncratic and clean syntax of any language I know.


The order of operations is incredibly haphazard. Here is the GP's example again, decorated with the order in which the expression is evaluated:

    {6}({9}putStrLn . {8}show . {7}sum)   {5}[{2}read x | x <- ({1}splitOn "+" "3++4"),  {4}not ({3}null x)]
Here is the same code expressed in Factor:

    "3++4++9" "+" split harvest [ string>number ] map sum present print
Annotated with evaluation order:

    "3++4++9"{1} "+"{2} split{3} harvest{4} [ string>number{5} ]{6} map{7} sum{8} present{9} print{10}


And here's the GP's other example:

    (putStrLn{9} . show{8} . sum{7} . map{5} read{6} . filter{4} (not{3} . null{2}) . split '+'{1}) "28+32+++32++39"
and here's ak217's Python version:

    expr{1} = '28+32+++32++39'
    print{8} sum{7}(int(i){6} for i in expr.split('+'){2} if{5} i is not ''{4})
Actually, I prefer your Factor example even more than the Haskell (even though I've never heard of the language) because the order of operations is more natural to me, but really I don't think the proposition "Haskell's syntax [is] a bad example in language syntax" has been demonstrated. Comprehensions in Python and in Haskell are an impediment to composability as I mentioned in another post on this story.


Indeed, concatenative languages are written in a very natural way. I wonder if they wouldn't make good teaching languages: you have to learn about a stack, but the evaluation order is trivial (do the first thing, then the second, ...), and there are no variables to contend with.

Haskell programs, on the other hand, are clearly written backwards :-)


Glad I'm not the only one bothered by that.


I find Haskell's syntax to be very clean and with little noise. Much more clean than a multi-paradigm language like Scala.


By noise, I mean:

=, =>, =>>, >>=, ->, |, <|>, ., :, ::, $, `...`, and such.


That's not noise, that's incredibly clear signal! (At least for some; I guess others have different tastes.)


It seems that half of those are operators (functions), so in that case your beef seems to be with the names of many of the Prelude and other commonly used functions, not the syntax of the language. That might seem like a pedantic distinction, but I don't consider naming conventions to be part of the syntax of the language (even though they may influence the look and feel of much code).

As for the rest, well, I guess I feel fine about their place in the syntax (for example the pipe (|) in the list comprehension: what else should be used? It's trying to look like set notation). What would be a less noisy alternative in your opinion? More English words/keywords instead of ASCII symbols?


    expr = '28+32+++32++39'
    print sum(int(i) for i in expr.split('+') if i is not '')


Two important Python notes:

1. `is` really shouldn't be used for string comparison. It checks object identities rather than string equality, which is broken here for two reasons: first, because it depends on the interpreter to keep one true copy of `''` (which CPython seems to do, but doesn't have to by the spec), and, second, it doesn't even work consistently in CPython anyway if `expr` is a Unicode string, since `'' is not u''`. (Try replacing `expr` with `u'28+32+++32++39'` and you'll get an error.)

2. If an object is known to be a string, test non-emptiness by testing the string itself: `bool('') == False`, whereas `bool('any other string') == True`. It's more concise and is equally clear to someone comfortable with Python.
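For point 1, a minimal demo of the identity-vs-equality distinction (the exact `is` result is CPython behavior, not guaranteed by the spec):

```python
# `is` checks object identity; `==` checks value equality.
a = "+".join(["28", "32"])  # built at runtime, so not interned
b = "28+32"
print(a == b)   # True: the values are equal
print(a is b)   # False in CPython here: two distinct objects
```

This is exactly why `is not ''` is a landmine: it happens to work only when the interpreter hands you the same object twice.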

I agree that the given example of functional programming is pretty ugly (though it might've been chosen just to illustrate the author's specific points), but I think the version below is really quite nice and won't break on the string comparison.

    expr = '28+32+++32++39'
    print sum(int(i) for i in expr.split('+') if i)


It's really weird to me that the author prefers map(add, ...) to sum(generator expression).


Because with map you can use any function and sum maps __add__. It also illustrates passing functions as arguments.

More subtly, a generator (or list comprehension) implies the order in which the results are generated and an implicit but observable changing state. If I'm trying to write (or, at least, express, since we are talking about Python) parallelism, map would be a better choice.
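To illustrate the parallelism point: because `map` only promises "apply f to each element" and says nothing about the order in which the computations run, a pool's map can drop in for the builtin without changing the result. A rough sketch (using `concurrent.futures`, which is an assumption about your environment, not part of the original discussion):

```python
from concurrent.futures import ThreadPoolExecutor

def square(n):
    return n * n

# pool.map runs the calls concurrently but still yields results
# in input order, so it is a drop-in replacement for map here.
with ThreadPoolExecutor(max_workers=4) as pool:
    results = list(pool.map(square, range(5)))
print(results)  # [0, 1, 4, 9, 16]
```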


More subtly, a generator (or list comprehension) implies the order in which the results are generated and an implicit but observable changing state. If I'm trying to write (or, at least, express, since we are talking about Python) parallelism, map would be a better choice.

This is definitely not true -- generator expressions can be expressed using lazy list semantics. By your logic every Haskell list comprehension would be stateful.

I would encourage you to write the denotational or operational semantics of your expressions in these matters rather than simply going by intuition.


> generator expressions can be expressed using lazy list semantics

That does not change the fact the results are given in a certain order. Lazy list semantics means the data will be calculated as it's needed, therefore, the calculation cannot be parallelized.



I was talking about Python. A list comprehension is defined as having the behavior of a loop.


Once again, that doesn't semantically make a difference, unless the loop is side-effecting. However, even without side effects, Python cannot necessarily make any kind of guarantee about such computations since arrays are mutable and can be modified out from under the map.

This is why parmap equivalents take iterables in Python, not arrays. It doesn't matter if you have lazy or eager semantics, only if your semantics are side-effecting.


It's `reduce(add, map(int, ...))`.

It's more general than `sum(generator)` - you could write `reduce(mul, ...)` or anything else, while with sum you'll get sum and nothing more. Also, I have no problem with `reduce(some_fun, (x for x in iterable))`, which works too.

Generally, while `map` in python is much better written as list comprehension or generator expression, reduce has no equivalent short of explicit loop with accumulator variable.
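A quick sketch of that generality, using the thread's expression (note `reduce` is a builtin in Python 2 and lives in functools in Python 3):

```python
from functools import reduce
from operator import add, mul

tokens = [t for t in "28+32+++32++39".split("+") if t]
print(reduce(add, map(int, tokens)))  # 131: same as sum()
print(reduce(mul, map(int, tokens)))  # 1118208: no builtin shortcut for this
```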


Generator expressions are a nice addition to Python, but can't replace everything. It doesn't hold some properties, like partial application. In the end it's sugar that works well as one-line loops, but I don't find the result elegant when you mix and match with other high-order functions, lambdas, or when nested. It turns into a prefix/postfix/infix mess.


> It turns into a prefix/postfix/infix mess.

Other functional languages like Haskell handle this "mess" quite well.


I would remove ` is not ''`


This is why Moka was not as interesting as it could have been in another language.


A few things:

- Functional is accumulative in thinking: I can evaluate each step gradually with nothing more than function composition. In imperative programming the steps are coupled.

- Functional is side-effect free: you can't mis-assign res.

- Functional is abstraction-friendly: I can pass whatever predicate I want, and the same goes for the accumulation/combination function.

I somewhat agree that the imperative code is easier to read (three little chunks of lines instead of three nested compositions), but to write, test, or adapt, the functional way has some advantages.


I agree that functional programming is wonderful, but imperative programs can often look especially clear. Quite a conundrum.

In Fexl (http://fexl.com), you can reduce the nesting by using the ';' ("pivot") syntax. So instead of this:

  print (fold add 0 (map int (filter valid_int (split "+" exp)))
You can eliminate the nesting as follows:

  print; fold add 0; map int; filter valid_int; split "+" exp
You could simplify that form by redefining "int" to return 0 on any invalid input. So we redefine int as follows:

  \int=(\str valid_int str (int str) 0)
Presumably the original int dies on invalid input. Actually I would prefer that the inner int function returns either no or (yes val), i.e. a "maybe", and then we could wrap the default 0 around that. But in any case we now have a robust, fault-tolerant int function, and we can now say:

  print; fold add 0; map int; split "+" exp


I wrote a library to help with this as well.

  f = X.split('+') | where(X) | foreach(int) | sum
  print f(expr)
http://0101.github.io/pipetools/


I love your Moka example. To me it's so much clearer than either of the others and it's basically what I would write in Haskell.

The understandability-complexity of the Moka style scales linearly with the number of components, because it is written "one-dimensionally". The problem with the imperative style is that its understandability-complexity is superlinear in the number of components because any variable in a nested loop can interact with any other variable.

I also like your use of "keep" as opposed to "filter". I can never remember if "filter" means "filter out" or "filter in"!


Thanks. The library was highly inspired by Arc, where most of the keywords are short and sweet (keep, fold, the [] shortcut, etc.).


It's often faster too. List/generator comps are the way to go in python. As much as I like the combinatory approach, python does not make it very readable or maintainable.


So I felt really torn with this. I applaud the author for introducing Python devs to FP; I love functional programming. Python was one of the things that opened my eyes to FP with its cute little m/r syntax and for that I'm forever grateful.

Over the years, I have found that the Python code people write to solve problems falls into roughly four categories: a) completely ignorant and inefficient code; b) concise, idiomatic Python code with a focus on readability; c) fancy functional code where the author is showing off their FP savvy (peppered with recursion and m/r); and d) raw, boring C-influenced code that looks primitive and dull to read, but is incredibly efficient and algorithmically sound. As much as I used to love going for b), I then prided myself on writing c). But in the last few years I have realized how underperforming c) is for most serious work, so I tend to avoid it and try to strike a balance between d) and b), preferably the latter. To write code in c) and tell yourself the data size will never be so great that the stack will blow, or memory will suffer, is in many ways just like avoiding null checks altogether and telling yourself your data will always have perfect integrity... any sane person will tell you that you cannot rely on this mindset in the long run.

So, for me at least it comes down to this: to pretend that you can write FP code effectively in Python is an exercise in futility. FP is more than writing a map and a reduce. If that's your basis of reasoning, then sure, take it and run with it - I find m/r to be neat syntax to use here and there in a program, but it's really challenging to write a whole Python program composed of chained map/reduce functions, largely because things are not oriented around generators by default. I'm not saying it's impossible, just that it goes against the grain. I also argue that recursion, guard expressions, and lazy evaluation are at the heart of the FP paradigm, and if you want to do it right, it's ill-advised to attempt this in Python when there is no tail-call optimization. How are you going to appreciate the concept when it's so limited? If you want to amuse yourself and marvel at pretty code, write it like this and have fun, but please pick another language if you want to ship production code.

And ... so sorry to piss on the parade.


Yeah, i gave up on doing fp (or any mathematics) in python when I got surprised by this:

>>> monomials = [lambda x: x**i for i in [1, 2, 3]]

Those all end up being the same monomial because the integer i is captured by address and not given a new binding for every iteration. Very surprising, and an accident waiting to happen for anyone used to dealing with integers by value (hard to say that with a straight face).

The hackish workaround is "lambda x, i=i: x**i" but I'd forget it all the time.
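A runnable demonstration of both the gotcha and the default-argument workaround:

```python
# Late binding: every closure sees the *final* value of i, not the value
# it had when the lambda was created.
monomials = [lambda x: x ** i for i in [1, 2, 3]]
print([f(2) for f in monomials])  # [8, 8, 8] -- all three use i == 3

# The default-argument trick binds i at definition time instead.
fixed = [lambda x, i=i: x ** i for i in [1, 2, 3]]
print([f(2) for f in fixed])      # [2, 4, 8]
```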


Woah, I was aware of the name capture problem in closures, but I hadn't realised this was one of the consequences. That's very unfortunate.


This is just like how Python default values are global variables that you can modify over time (e.g. with def foo(x=[]): x.append(0), after the second call x would be [0, 0]), because the default is only initialized once.
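Spelled out as runnable code:

```python
def foo(x=[]):   # the default list is created once, when `def` executes
    x.append(0)
    return x

print(foo())  # [0]
print(foo())  # [0, 0] -- the same list object is reused across calls
```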


Not global. Scoped with the "def".


I think zanny meant "static" rather than "global".


Your comment is very close to how I feel. I spent significant effort trying to shoehorn FP concepts into Python, because I wanted to use FP but was forced to use Python. In the end I decided to quit my job so I could use Haskell instead. Programming FP in an actual FP language is much more satisfactory!


I'm very much working on leaving the first category. That's the problem I guess with teaching oneself a language with no code review.


The missing slide: profiling.

I love functional programming. Nonetheless, I strongly believe in using the natural idioms of your language: they are more widely used, more widely optimized, and more often thought about.

I think you would find that the functional style is not idiomatic in Python most of the time. This can hurt readability, performance, and interoperability.

Overall, I would encourage people to do what is natural -- it is often also what's correct. For example, Python generator expressions are lazy sequence constructions that are both functional in semantics and better than the equivalent map/reduce constructs.


Good points. Python lists are surprisingly slightly faster than tuples (which are immutable). That's because lists are extremely common, and more effort went into optimizing them.


The itertools library offers imap, ifilter, etc. for big cases, but I tend to agree that generators read better and are shorter surprisingly often.


I used to dream about FP and Python; then I read about Guido's plans for removing `lambda` and `reduce` (http://www.artima.com/weblogs/viewpost.jsp?thread=98196) and that is when I stopped writing Python.


That disappointed me also, but the FP voice from the community is strong enough. Reduce lives on in functools and the functional package also adds additional functions, like compose. It's one of the packages I install first in a new Python installation. But true, Python will never grow into Haskell.


Also, lambda doesn't appear to be going anywhere. It's one of my favorite features.


Yes, not going away but not growing beyond the crippled, single-statement version it is now...


So, basically, with lots of work, you can make something that's technically functional but with none of the beauty of a language that's meant for it.

I understand the appeal of functional programming, really I do, but why can't we accept that sometimes just because you can doesn't mean you should?


There's little reason to use python instead of haskell these days.

- compiled code, fast

- cleaner, nicer, conciser syntax

- type safety built in

- pattern matching

- numpy equivalents are now available

- purity (see John Carmack talking about how it helps when codebases get large)

To switch from python to haskell easily within minutes, realize that a python function is simply do notation in haskell. So replace assignment (=) with arrow (<-) and write with "func v1 v2 = do".

Then pure functions are those without do and no assignment written lisp/scheme style. "$" means you can get rid of the parentheses.

After that, declaring types is equivalent to expressing how syntax tree can be written. You can use these to plan out how you attack your problem and then let compiler help you when writing out your functions. You could even just write out types and outsource the function writing to someone else (similar to your oo architects can parcel out code writing).

So, all Python, Go, and Ruby programmers - you really should check out Haskell and realize its superiority. They've taken your niche.

(If anyone wants to pay me to code in haskell, send me an email at haskellpostgresprogrammer@gmail.com, looking for telecommute work)


It's a completely different language with different goals. It's like you said "there's little reason to eat eggs instead of drinking wine these days." Python was expressly NOT intended to be a compiled language with heavy emphasis on type safety.


That sounds plausible on paper, but as a matter of practical fact when I want to get a job done in a hurry and I try to do it in Haskell I get treated to the challenge of solving a series of puzzles in type theory, whereas when I try to do it in Python I just write code and it works, so while I agree Haskell is an interesting and elegant language, when what I care about is accomplishing a result rather than the mechanics of the language, I reach for Python.


It is very nice and I would agree with you. I had a lot of problems interacting with the IO monads in the beginning though, and the type system in a production environment is a lot more intricate than what most of the literature purports ;).. Other than that it was pretty easy to start writing code without knowing much.


If you're forced to write Python and you like FP I think it makes sense to try to twist the Python you write towards FP as much as possible. The advice "Stop writing classes" is particularly helpful in my experience.

However the "What did we miss?" slide tells it like it is. FP in Python is never going to be like FP in, e.g., Haskell. Moreover, the libraries you use are going to have objects and mutable state everywhere which ends up being very frustrating.

If you're free to choose, I recommend just going with Haskell (or OCaml or Scala or whatever according to your taste).


I recommend just going with...

If you're writing some kind of server, I'd nominate Erlang for serious consideration. The OTP framework is fantastic and rock solid.



I'm a python-native programmer who just recently picked up SICP. I've been trying to use some more functional tools; I told a friend at work one day that I was going to try to do everything with just map and filter.

But I also just had an experience I haven't had much since I was extremely new: the experience of going back to something I wrote last week and not being able to decipher it. I'm going to keep it up, partly because I like it and also because I have some hope (unjustified?) that map will give me better performance than looping. I wish there was a way to enable tail recursion though--seriously, I promise I'll ask for clear stack traces when I need them.


http://wiki.python.org/moin/PythonSpeed/PerformanceTips

List comprehensions beat map in CPython. As a fellow pythonista, I can guarantee you that using the builtin higher-order functions excessively will cause you nothing but pain. The feeling of being lost when you revisit code will be common.
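A rough micro-benchmark of that claim; absolute numbers vary by machine and interpreter, and note that `map` with a builtin (rather than a lambda) can close the gap:

```python
import timeit

setup = "xs = list(range(1000))"
t_comp = timeit.timeit("[x * 2 for x in xs]", setup=setup, number=1000)
t_map = timeit.timeit("list(map(lambda x: x * 2, xs))", setup=setup, number=1000)
# The lambda call overhead on every element is what typically makes
# the map version lose to the comprehension in CPython.
print("listcomp: %.4fs  map+lambda: %.4fs" % (t_comp, t_map))
```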

I had a coworker who undertook a plan similar to yours, except he gave himself less rigid constraints. He is a very smart guy but he fell into the tar pit of code masturbation and couldn't finish the project he took on. He simply couldn't manage the complexity of his application in his quest to be As Functional As Possible. The project was never completed and after wasting many man hours, the company was forced to purchase $25,000+ worth of software to compensate. It didn't reflect well on him.

Python has functional programming support, but it's not intended to be the dominant paradigm. Python 3 even removed reduce as a builtin. The high order functions are still useful, but definitely not all the time. The principles of functional programming, such as limiting side effects and statefulness, are always useful. But Python has awful lambdas and no lexical scoping (fixed in 3). Python's functional support will always be hobbled as Guido doesn't support it.

Since you seem to have a healthy interest in functional programming, I'd suggest looking at Clojure. If you want to challenge yourself, it's a good choice as you'd have to learn lisp (macros, code is data, s-exprs, etc) and the JVM ecosystem, while being almost totally constrained to writing functional code. Haskell is even more strict and is a pure functional language, forbidding state entirely and relying on monads for things like I/O. I almost like it as much as Clojure. Erlang and Ocaml are also worth checking out.


@dorolow, your comment is dead because it used the auto-banned word "mast?rb?tion" (uncensored). FYI.


What? We've got auto-banned words?


Try it and see.


Bizarre. I guess I'll have to be more careful. Thanks for the heads up


Not pretty, but it does the trick. Kinda. Or at least it won't destroy the stack if you recurse too deep.

  from collections import namedtuple

  TailCall = namedtuple('TailCall', ['function', 'args'])  

  def tail_recurse(fn, args):
      next_call = fn(*args)
      while isinstance(next_call, TailCall):
          next_call = next_call.function(*next_call.args)
      return next_call  

  def count_to_4(start):
      """Dummy tail-recursive function"""
      if start == 4:
          return 4
      else:
          return TailCall(count_to_4, [start + 1])  

  print tail_recurse(count_to_4, [-6])


Every time I see

    reduce(operator.add, ...)
and variations like that I think GvR was right to relegate reduce to the functools module... it's just that

    sum(...)
covers the most common (90%? only?) usage...


God no. If your loop isn't a map, it's almost certainly a fold. If you free up your syntax to make them easier to produce then they show up everywhere. Sum is a menial example.


> If your loop isn't a map, it's almost certainly a fold.

...all loops can be represented as folds


Not strictly true, but all maps can be represented as a fold. That's not usually the best tool though.
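For the record, here is a sketch of map (and filter) expressed as folds in Python; quadratic list-building aside, it shows the direction of generality:

```python
from functools import reduce

def map_via_reduce(f, xs):
    # a map is a fold whose accumulator collects transformed elements
    return reduce(lambda acc, x: acc + [f(x)], xs, [])

def filter_via_reduce(p, xs):
    # a filter is a fold that conditionally extends the accumulator
    return reduce(lambda acc, x: acc + [x] if p(x) else acc, xs, [])

print(map_via_reduce(lambda x: x * 2, [1, 2, 3]))     # [2, 4, 6]
print(filter_via_reduce(lambda x: x > 1, [1, 2, 3]))  # [2, 3]
```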


To clarify this now that I'm not on my phone, folds may not express termination by themselves, they depend upon the type that they're folding up. More formally, folds are catamorphisms which operate on data, but you often need to use codata to generate the structures to fold over.


Why not use a loop instead though?


Because you need to track and update the temporary state. Maps and folds let you define your stepping algorithm separately.


In Python? I wasn't aware that fold did anything other than iterate in order...


Not just that. More generally, comprehensions cover the most common use cases and are tremendously more readable.


Comprehensions are more readable in the sense that they read more like "English" or "natural language". Unfortunately they're inherently non-composable and make refactoring harder.

d0m's Moka example (https://news.ycombinator.com/item?id=6150944) is closer to composable. I would suggest the following modification. Instead of chaining methods, have a function "chain" which composes functions. We could then write

    chain(do(string.split, '+'),
          keep(lambda x: x != ""),
          map(int),
          reduce(moka.add))
If I decided that removing blanks and converting to int was a concept that was worth capturing in its own function, all that I need to do is splice out part of the chain as is.

    non_blanks_to_int = chain(keep(lambda x: x != ""),
                              map(int))
    
    chain(do(string.split, '+'),
          non_blanks_to_int,
          reduce(moka.add))
No modifications were required. It's more complicated with comprehensions. Of course it's still possible, but it's one more unnecessary grain of sand in the mechanism. Doing the same with the imperative version would be nigh-on impossible.
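The chaining idea can be sketched with nothing but the stdlib; `chain` below is a hypothetical helper standing in for Moka's method chain, with `filter`/`map`/`reduce` partials replacing `keep`/`map`/`reduce`:

```python
from functools import partial, reduce
from operator import add

def chain(*fns):
    # left-to-right composition: chain(f, g)(x) == g(f(x))
    return lambda x: reduce(lambda acc, f: f(acc), fns, x)

pipeline = chain(
    lambda s: s.split("+"),
    partial(filter, None),   # keep only truthy (non-empty) tokens
    partial(map, int),
    partial(reduce, add),
)
print(pipeline("28+32+++32++39"))  # 131
```

Splicing out the middle two stages into a named sub-chain works exactly as described: build `chain(partial(filter, None), partial(map, int))` and drop it in.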

Come to my FP days talk in Cambridge, UK, Thursday 24th October, 2013 if you want to hear more about this!

http://lanyrd.com/2013/fpdays/sckzkz/


If you take the awful step of naming things, comprehensions are easy enough to chain. That is (in untested python...)

    s = '28+32+++32++39'
    parts = s.split('+')
    keep = (t for t in parts if t)
    ints = (int(t) for t in keep)
    print sum(ints)
Hiding the middle steps there in a function is easy enough.

That split returns a list, but the other steps are lazy.


But naming your intermediate variables means it's not possible to splice parts out.


You did call it a grain of sand, but I suppose I'm responding to where you call it complicated. Moving a couple of the names into a separate function means passing in one of them and editing the last line (at least in python with explicit returns). That meets the meaning of more complicated, but (in my opinion) in a sort of hair splitting way.


Horses for courses then I guess. If the "easier to refactor" argument doesn't convince you maybe the "easier to read" argument will. Personally I strongly prefer zero-length variable names, but everyone has their own taste.


Not just the common ones, pretty much every one. Python 3 even removed the reduce builtin because it was so unreadable. Generator expressions and list comprehensions do so much more in a much more elegant way.


How can they do "so much more" when both map and filter can be implemented in terms of reduce, but the opposite is not true?

I can agree with you in terms of readability, even as a mathematician. For example, the list comprehension syntax mirrors the mathematician's "set-builder notation," so it feels very natural.

However, consider something like a "max", "min", or "argmax" function. Like map and reduce, you can implement this in terms of reduce, but I don't see how you can implement it in terms of a list comprehension. Or consider something like an RPN calculator, which could be implemented as a single reduce call whose initial value is an empty stack and whose input list is the list of RPN tokens.

It seems that list comprehensions can do less in a very objective sense and couldn't be used to replace reduce without other supporting characters. Python has those supporting characters, of course, so there's a good 80/20 argument for favoring list comprehensions over a more opaque call to reduce.
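Both of those examples can be written down concretely; a sketch of max as a fold and an RPN evaluator as a single reduce whose accumulator is the stack (`rpn` and `OPS` are illustrative names, not standard APIs):

```python
from functools import reduce
import operator

# max as a fold: keep the larger of the accumulator and the next element.
xs = [3, 1, 4, 1, 5, 9, 2, 6]
print(reduce(lambda a, b: a if a >= b else b, xs))  # 9

# An RPN calculator: fold each token into an evolving stack.
OPS = {"+": operator.add, "-": operator.sub, "*": operator.mul}

def rpn(tokens):
    def step(stack, tok):
        if tok in OPS:
            a, b = stack[-2], stack[-1]
            return stack[:-2] + [OPS[tok](a, b)]
        return stack + [int(tok)]
    return reduce(step, tokens, [])[0]

print(rpn("3 4 + 2 *".split()))  # 14
```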


"Like map and filter," I mean, not "Like map and reduce."


Comprehensions can only be used to replace map & filter, not reduce.


If you define __add__ appropriately then it covers 100% of the use cases :)


Love the video mentioned in the slides:

https://www.youtube.com/watch?v=o9pEzgHorH0


Great presentation! Just a question/remark: partial function application seems handy, but doesn't functools.partial to create a new function seem obsolete when you can use non-keyworded variable-length argument lists (*args) in your function definitions?
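They solve different problems: `*args` changes how one function is declared, while `partial` specializes an existing function from the outside, producing a new callable. A small sketch (`power` is a made-up example function):

```python
from functools import partial

def power(base, exp):
    return base ** exp

# Pre-bind one argument; callers of square/cube never mention exp.
square = partial(power, exp=2)
cube = partial(power, exp=3)
print(square(5))  # 25
print(cube(2))    # 8
```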


The official Python docs also have a section on FP.

http://docs.python.org/dev/howto/functional.html


wow, i love your presentation design, type and function :)


is this a reveal.js rendering of an IPython notebook ?

    ipython nbconvert --to slides <notebook.ipynb>
http://ipython.org/ipython-doc/dev/interactive/nbconvert.htm...


It gave me motion sickness, was not expecting transparent, 3D rotation.


The part I enjoy most is the Lato font. It looks really sweet.


this code is so smart, maybe because it is Python after all

    from operator import add
    expr = "28+32+++32++39"
    print reduce(add, map(int, filter(bool, expr.split("+"))))
but in JavaScript it looks much more readable:

  "28+32+++32++39".split(/\++/).filter(Boolean).reduce((r, v) => r + Number(v), 0);



