Pain Points of Haskell

pdimitar · on June 8, 2020

As a guy who casually (definitely not in-depth) reviewed working with Haskell about 2 years ago I'd say the most relatable part of the article to me is: complexity in tooling and unnecessary tension when working with libraries. Also the String situation (several types of them).

A lot of the other points could be addressed but the community's seeming unwillingness to tackle everyday productivity sends to me the message: "Keep out!". :(

As a comparison, Elixir's Mix and Rust's Cargo are amazingly ergonomic build systems and provide all the tooling you need to work with the languages and your projects.

And hey, don't kill me. As I said, I reviewed it casually for several days while comparing it with Rust and OCaml. They aren't the easiest to start with either but Rust's tooling is excellent, OCaml's is also excellent but not immediately obvious (their docs have improved a lot), and Haskell's is... kind of there, as the article points out.

2mol · on June 8, 2020

The tooling situation has rapidly improved in my opinion, nowadays ghcide [0] (a language server implementation) is really great and easy to set up.

I really do hope that not too many people feel like the message is "keep out". In recent years the community has generally strived to be welcoming to beginners and share the excitment of learning Haskell, without the arrogance.

The aforementioned tooling improvements were only possible because a bunch of excellent people did exactly what you mention: tackle everyday productivity and pleasantness by improving IDE integration deep into the stack, all the way to the information the compiler makes available to the tooling.

If you want to get a fresh impression, ZuriHac [1] is happening (remotely) this weekend, and there is an online class for absolute beginners held by the excellent Julie Moronuki (of haskellbook.com fame).

Disclaimer: I'm very biased, because I'm a co-organizer, but I think it will be great fun!

[0]: https://github.com/digital-asset/ghcide

[1]: https://zurihac.com

misja111 · on June 8, 2020

>> ghcide .. is really great and easy to set up

Honestly, I tried to set up ghcide because I would love to have a good ide for Haskell but stopped because the setup procedure is not easy at all. Or maybe I'm getting something completely wrong?

On the link you are giving, there are the following steps:

- clone a git repo

- run cabal install

- test that ghcide is able to load your code (!) Quote: "If it doesn't work, see the hie-bios manual to get it working."

- finally integrate it in the ide that you are currently using. I am using IntelliJ IDE with the Haskell plugin and apparently I am out of luck because it is not listed.

tome · on June 8, 2020

> finally integrate it in the ide that you are currently using. I am using IntelliJ IDE with the Haskell plugin and apparently I am out of luck because it is not listed.

Well, is it an IDE that supports LSP? If so, it should just work. If not, then ghcide won't work with it. Providing a language server is the whole point of what it does!

misja111 · on June 8, 2020

There is a LSP plugin for IntelliJ but it is not very stable. But that was not my point: maybe I'm spoiled because I'm coming from the JVM universe, but I don't agree that ghcide installation is easy. For me, "easy" means running an installer or invoking homebrew, and continue being productive.

tome · on June 8, 2020

Yes, fair enough. There's a lot of scope for making it easier :)

bjz_ · on June 8, 2020

> OCaml's is also excellent but not immediately obvious (their docs have improved a lot)

Interesting! The last time I tried OPAM, it actually seemed more frustrating than Cabal! Maybe it's improved? Last time I tried OPAM it would install packages globally by default, and avoiding that was a confusing process, when that should be the default behavior.

Cargo (while not perfect) has really nice defaults out of the box - ie. it generates a lockfile by default, scopes packages locally to individual projects, and lets you use more than one package of the same version in the same project.

Really hoping OCaml and Haskell can improve their package management story - they are getting there, but it still holds me back from really using them on a daily basis.

nextos · on June 8, 2020

A bit off topic, but is Haskell viable for production? Or is it just really intended as an experiment? Is it worthwhile to jump into Haskell now, or perhaps one should look into Idris or Agda? Any companies using Haskell where it has proven a distinct advantage?

Ocaml and SML (the former being sadly quite unpopular these days) are an order of magnitude simpler than Haskell. They are easy to master. I have been quite productive with Scala which, while being a member of the ML family, is much more intricate due to OO + FP on top of great Java interop and many Haskell-like features (laziness, monads, lenses...).

Haskell doesn't feel the same. I have learned some Haskell over the years. Since I come from a pure FP education (started with SML) and I have lots of type theory background, everything feels familiar, yet I've found it really hard to be productive or to justify the mental overhead of laziness. I have developed some small projects in Haskell, but I haven't found it a productive alternative to the languages above.

bojo · on June 8, 2020

My company writes and supports a lot of custom Haskell software for our parent, an ISP up in Alaska.

What we gain is a high level of confidence in our business logic rules due to strict typing, fewer overall tests, and by avoiding a few small gotchas uncrashable runtimes.

Not to say it doesn’t come without problems. Poor IDE integration (I personally don’t care), questionable documentation at times, and a steep rewrite your brain learning curve.

Some posts I have written.

https://www.alasconnect.com/2018/10/02/introducing-haskell-c...

https://www.alasconnect.com/2018/10/04/productive-haskell-en...

http://blog.bojo.wtf/2020/04/15/is-haskell-a-bad-choice/

MaxGabriel · on June 8, 2020

I'm the CTO at Mercury (https://mercury.com/) and we have 100% of our backend written in Haskell. It's gone really well so far.

Answering another commenter's question, I would say the "secret weapons" are:

1. Hiring: Haskell is a very in-demand language by very good engineers. For a startup, it's absolutely amazing for recruiting and I can't overstate how important recruiting good people is for a startup.

2. Correctness: Haskell helps you ensure your code doesn't have bugs. For a bank this is fairly important. Some aspects of Haskell that are great for this: algebraic data types ("discriminated unions" in some languages) model business logic quite well, purity makes a function much more of a known quantity, libraries like Persistent ensure type-safe SQL queries, and in general the community cares about correctness.

I haven't used the other languages you mentioned, so I can't really speak to a comparison with them. Re: laziness, it's almost never something I think about in practice.

klipt · on June 8, 2020

> 1. Hiring: Haskell is a very in-demand language by very good engineers. For a startup, it's absolutely amazing for recruiting and I can't overstate how important recruiting good people is for a startup.

Although for engineers, supply-and-demand favoring employers typically means it disfavors employees: you may effectively be taking a pay cut to use Haskell compared to more popular languages.

lostcolony · on June 8, 2020

Sure, but those that are willing to because they're sufficiently excited about the language means you can have a higher confidence in your hires, since they, A. Know what Haskell is, B. Want to learn it, C. Are willing to take a pay cut to be able to get paid to learn/use it. Very high signal to noise I'd wager.

If you're a Java shop, you better be paying higher than market rate if you want to be able to attract good talent, but then you still have to figure out who they are amidst all the mediocre/poor devs, who are interested because of the higher comp. Very low signal to noise.

It seems like a less popular language, that has real business value, helps achieve the thing every company that isn't "hire fast, fire fast" tries to do with their hiring policy. It does mean you can't easily hire people who already know the language, so you have to consider the ramp up time, but I don't think that's nearly as painful as many hiring managers seem to think it is.

klipt · on June 8, 2020

Well yeah, it seems you're reiterating that it's great for employers, not for employees. I don't disagree.

Of course if Haskell somehow takes off and all the FANG companies want to hire as many Haskell engineers as they can, then the situation would probably reverse.

lostcolony · on June 8, 2020

I'm more calling out that it's mutually beneficial; the employees choosing that tradeoff are still choosing that tradeoff. The other tradeoff, work for a higher compensated position in a language that they hate, isn't objectively "better", just different.

marcosdumay · on June 8, 2020

The labor market is not zero-sum.

If Haskell is just a signal that selects for more competent people, companies will be happy to hire people whose competence they would have to invest a lot to match otherwise, and developers will be happy to get a larger salary than they would be able to get otherwise. (But yes, this would disfavor people with other kinds of signal.)

If it selects for high productivity people, or if it helps people be more productive, companies will be happy from getting more output than they would otherwise, and developers will still get more money.

The stereotype is that Haskell jobs pay less, and developers are happy to work with a better language. That still doesn't disfavor any party, but I don't think anybody has any evidence that this is the case. Notice that the GP is talking about people competence, not salary.

smabie · on June 8, 2020

You really think so? Take for example, kdb+/q. Not exactly widely used, but engineers who specialize it are making top dollar, it's not unusual for total comp to be around a million dollars, or sometimes a lot more, if you're the principle architect of a new system.

klipt · on June 8, 2020

You're just saying that kdb has the opposite imbalance as Haskell (relative to mainstream languages).

If supply and demand for kdb favours employees, then presumably it disfavors employers, who would rather have a bigger pool of kdb programmers so they could pay less.

That doesn't contradict what I said about Haskell vs mainstream languages. I didn't say all niche languages are like Haskell.

tome · on June 8, 2020

To answer your questions literally without much nuance:

Yes, Haskell is viable for production. There are caveats, as with any other language. Haskell has more caveats that most other languages you would be familiar, but it's nonetheless perfectly suitable in the right environment.

No, don't use Agda or Idris in production, they're totally unfit. Agda is not intended to be. Idris may be but it will take 5 or 10 years.

The only members of the ML family that I'm aware of are Ocaml and the flavours of Standard ML. Only Ocaml is fit for production.

mr_custard · on June 8, 2020

> The only members of the ML family that I'm aware of are Ocaml and the flavours of Standard ML. Only Ocaml is fit for production.

F# is also a member of the ML family and is totally fit for production.

FridgeSeal · on June 9, 2020

F# is the language I want to like, and would like to use in prod, but it constantly feels like the ignored step-child.

The REPL is great in theory, but trying to import packages is a nightmare every time I try.

Dependencies exist, but they all seem to target wide and inconsistent range of the ecosystem. Want to target dotnetcore 3.1? Good luck have fun: everything you find useful is targeting some combination of standard/framework/core/whatever confusing variant Microsoft decides to come up with this week.

If I somehow convinced my co-workers to adopt a language that wasn't C#, I'd point them Rust or Haskell instead.

akra · on June 10, 2020

Most of those package issues also apply to C# particularly as it stabilizes towards .NET 5. However it isn't too bad once you get it; and for most standard use cases it seems to work with the dotnet cli out of the box. Many people seem to be productive with C# so I'm not sure how big of an issue it is other than the small initial learning curve but many package managers have that (IMO Maven is harder yet many people use that in Java space and not too hard once you learn it).

REPL's often have problems with package management in a number of languages as they often have a different build chain from the compiled style apps which have the chance to resolve packages. When doing .NET Core in the past I know Paket can generate these scripts for the REPL to import packages (seen csx/fsx scripts for each package) - but another comment alludes to a more supported way in the future which is good.

Doesn't seem like you have a problem with the language per se; rather the package management story for .NET Core.

Paradigma11 · on June 9, 2020

I totally agree with your points but the situation seems to be improving. With https://devblogs.microsoft.com/dotnet/announcing-f-5-preview... you can import nuget packages in fsx scripts. The dependency situation also seems to grow saner every month.

Nevertheless i would still wait another few month for things to stabilize with .net 5.

tombert · on June 8, 2020

I think that F# is ok for production. I used it at a previous job without too many pain-points, especially in comparison to Haskell. I'm not a huge fan of MSBuild, but it works relatively well, and of course you have access to the massive .NET library and the entire ecosystem of C# libraries out there, and there's even some tooling to make the C# stuff play nicely with the F# constructs.

tome · on June 8, 2020

I totally forgot F# was part of the ML family, but can't edit my comment now! Thanks to everyone who replied to correct me.

_bxg1 · on June 8, 2020

There's ReasonML, which compiles to JS and seems intended for production

tomku · on June 8, 2020

Reason is a syntax layer on top of OCaml, so it shares OCaml's production-readiness. The tooling that Reason uses to compile to JS (Bucklescript) was originally designed for OCaml, and Reason supports native targeting via the plain OCaml compiler backend as well. It's a pretty pleasant ecosystem at the moment.

kqr · on June 8, 2020

In the same space we also find Fable, which hooks into Babel to compile F# to JS. We're using it in production and it's been surprisingly good. It's very nice to be able to share the definitions of data structures between the C# backend and JS (via F#) front-end.

smabie · on June 8, 2020

ReasonML serves no purpose whatsoever. The syntax is worse than OCaml's (though, more regular) and the editor tooling is really awful. OCaml compiles to JS just fine.

ska80 · on June 8, 2020

Idris 2 is a big step towards a production grade FP language

the_af · on June 8, 2020

> A bit off topic, but is Haskell viable for production? Or is it just really intended as an experiment?

It's viable for production and there are businesses which use it, notably fintechs. Also Facebook for some things (Facebook employs Simon Marlow [1]). And I know this is an irrelevant anecdote, but I have a friend who has been working in several startups which use Haskell for the last few years. Like Paul Graham has argued of Lisp, Haskell really is a "secret weapon" and people who use it for production tend to love it.

EDIT: a prior version of this post claimed Facebook employed Philip Wadler, which is incorrect. Sorry! I got my wires crossed and confused Philip Wadler with Simon Marlow!

[1] https://www.linkedin.com/in/simonmarlow/

nextos · on June 8, 2020

> Haskell really is a "secret weapon" and people who use it for production tend to love it

That's exactly my question. I'd be really interested in hearing about domains where Haskell is a secret weapon and how it compares to ML family languages and dependently-typed ones.

implicit · on June 8, 2020

I did a bunch of professional Haskell work in a prior life[1]. Most of its 'secret weapon' status springs from the way Haskell lets you control side effects.

We had a fantastic unit testing harness[2]. With effect tracking, you can arrange for your harness to put precise fences around nondeterministic code. You don't need to rely on experience and discipline to ensure that tests are reliable or fast. The type system does it for you.

Effect tracking also makes concurrent code easier to write. You have a great deal of control over how and when threads can be created, and what kinds of logic is permitted on those threads.

GHC's IO manager offers green threads and async IO, so rather than a select loop, you just fork threads and perform "blocking" IO calls. You get about the same efficiency as a select loop this way.

Lastly, something we learned to appreciate over the years is that Haskell is easier to refactor. When you combine the rigid type system, effect tracking system, and ease of writing unit tests, it becomes very easy to refactor old Haskell. It doesn't really matter how bad the old code was or whether the original author is around to help you at all.

If you intend for your product to evolve continuously forever, this is truly a secret weapon.

    [1] https://andyfriesen.com/2014/03/25/what-its-like-to-use-haskell.html
    [2] https://andyfriesen.com/2015/06/17/testable-io-in-haskell.html

6gvONxR4sf7o · on June 8, 2020

Of all the things I like in haskell (having only used it a as a hobbyist), refactoring tops the list. It's a lot less scary to refactor when the compiler tells you so much about which parts you missed.

ghostwriter · on June 8, 2020

https://news.ycombinator.com/item?id=23367803

the_af · on June 8, 2020

I'll let people who actually use it -- instead of toy with it, like myself -- answer you. I know many haskellers read HN and have answered similar questions in the past.

Like I said, I do know Haskell is used at fintechs (source: a friend whose job is precisely that).

auggierose · on June 8, 2020

Facebook employs Wadler? Since when?

the_af · on June 8, 2020

Gah! I meant to say Simon Marlow [1], co-developer of the Glasgow Haskell Compiler (GHC) [2]. Sorry for the confusion, I'll edit my other post now.

  [1] https://www.linkedin.com/in/simonmarlow
  [2] https://en.wikipedia.org/wiki/Simon_Marlow

alephu5 · on June 8, 2020

I've got a REST API in production linking to postgres and rabbitmq and I'm very happy with the performance and maintainability. The ecosystem is a bit more bare than with other languages like node, python or java but in exchange the paradigm and type-system lets you build safe and well isolated code. In my opinion it's also quite easy to read despite being terse.

Despite the complaints the build tooling is good, and there are decent plugins for IDE capabilities although they're a little bit resource hungry.

My biggest problem is the lack of component libraries, for example I wanted to use SMTP a little while ago but couldn't find a library compatible with the latest GHC so ended up writing a python script connected to rabbitmq.

For business logic involving timezones, currencies, API calls, calculations etc. you're pretty well covered though.

embwbam · on June 8, 2020

I’ve used it on multiple projects for clients over the last 5 years. These projects were large, made it to production, and were successful. I love it. Once you clear the initial learning curve it feels very productive, and gives you the feeling of always having something more to learn and master.

nextos · on June 8, 2020

What domains?

hazz99 · on June 8, 2020

Hasura (one of my favourite technologies) is built with Haskell!

ghostwriter · on June 8, 2020

> Really hoping OCaml and Haskell can improve their package management story - they are getting there, but it still holds me back from really using them on a daily basis.

Cabal 3.x with Nix-style buildscovers all packaging needs that I ever had with Haskell ecosystem[1]. And there's a new wave of tooling based on incremental Nix builds, that you can begin using today [2]

[1] https://cabal.readthedocs.io/en/latest/nix-local-build-overv...

[2] https://github.com/nmattia/snack

bjz_ · on June 8, 2020

Yeah, it's really great to see the progress there. However, afaik, it still doesn't freeze packages by default, or let you have multiple packages of the same version in a dependency tree[1]. The former can be worked around, but it's annoying that it's not the default. The latter is more frustrating however!

[1] https://news.ycombinator.com/item?id=23454711

lallysingh · on June 8, 2020

Stack does this, if I understand what you're saying. Stack+nix has worked quite well for me for a few projects.

marcosdumay · on June 8, 2020

Yep. Cabal 3.x is great, better than anything else I've seen for any other language.

As the article says, it just needs to correct the default behavior (deprecating the old behavior is on the roadmap, interestingly, the documentation doesn't even say what is the standard right now) and some improvements on usability (just better docs goes a long way).

talex5 · on June 8, 2020

By default, opam installs everything into the current "switch". Typically you have switch one per compiler, but you can create one per project. You can also create a "local" switch directly inside your project directory: https://opam.ocaml.org/blog/opam-local-switches/

Another option is to use the duniverse tool (https://github.com/ocamllabs/duniverse). That downloads all dependencies to the project directory, and then builds them all together as one giant project. That makes it very easy to make changes across multiple libraries at once, while keeping fast incremental builds.

And a final option is to build projects with Docker, which allows you to have separate versions of non-OCaml packages too.

bjz_ · on June 8, 2020

Ahh cool - had some similar questions here: https://news.ycombinator.com/item?id=23460980 - mainly, how much manual switching do you have to do? Or is it seamless, depending on what project directory your in? I think I tried the local switches in the past and got really confused when switching projects and everything broke, thinking 'wasn't all this meant to prevent this?'.

Erlangen · on June 8, 2020

Packages are not installed globally in opam by default nowadays. `opam switch` "enables the user to have several installations on disk, each with their own prefix, set of installed packages, compiler version, etc". https://opam.ocaml.org/doc/Usage.html#opam-switch

bjz_ · on June 8, 2020

Oh that's nice to hear! Some questions:

Do you have to run this command manually, and does it mutate the shell state? That's one thing that frustrated me with opam in the past as well. I couldn't just jump into a directory and build a thing, then switch to another project directory - there was a lot of manual switching and unswitching of packages if I recall correct?

Is it possible to install multiple tools globally using opam that use disjoint library versions? Like, I might want to install Abella and Coq side-by-side, but they might have conflicting version requirements. I think I was super excited about opam 2, then tried installing one thing, only to have it break again when I installed something else.

Is it possible to have multiple versions of the same libray in the same project, or does the constraint solver need to find a single solution for each library? [1]

[1] https://news.ycombinator.com/item?id=23454917

Erlangen · on June 19, 2020

Sorry I miss your reply. I create opam switches to have different versions of compiler, or allow libraries that conflict to live at different branches.

1. Dune is a relative new build system for Ocaml. You just write a configuration file, and dune will handle switches automatically. "It also supports multi-context builds, such as building against several opam roots/switches simultaneously. This helps maintaining packages across several versions of OCaml and gives cross-compilation for free." See https://opam.ocaml.org/packages/dune/

2. Yes. You can can multiple versions library at different switch.

3. I am not sure.

pdimitar · on June 8, 2020

I got told some months ago to look for these docs: https://news.ycombinator.com/item?id=22395595

It seems all we want can be made to work but it does require some tinkering and being aware of the tools' options.

andrepd · on June 8, 2020

Maybe you tried it before v2? Imho opam now is very usable, lets you be productive for 95% tasks after reading a 5min introduction, and generally gets out of your way.

ghostwriter · on June 8, 2020

It is in your interest to have multiple string-like datatypes in a lazy language, especially when some of them are not real strings, but rather streams of binary data.

https://mmhaskell.com/blog/2017/5/15/untangling-haskells-str...

chriswarbo · on June 8, 2020

I agree, although the arguments in that article aren't the strongest. In particular:

The main reason given for 'String' being slow is that it's immutable and hence leads to many duplicate Strings being created. In fact, the main reason Strings are slow is that their linked-list structure has a lot of overhead, and may be scattered around in memory. For example, the String "abc" will be represented something like this:

    +---+---+   +---+---+   +---+---+
    | * | *---->| * | *---->| * | *---->[]
    +-|-+---+   +-|-+---+   +-|-+---+   
      |           |           |
      V           V           V
     'a'         'b'         'c'

Those arrows are pointers, which can point to locations arbitrarily far away. From this we can see that just storing a (fully evaluated) String is expensive, since each character requires a pair of pointers (in addition to the unique characters themselves). Processing the contents of this String is also slow, since we need to dereference two pointers for each character (one to get the character, one to get the tail of the String). Since the data may not be contiguous in memory, this can make poor use of the CPU caches, which might otherwise speed up these dereferences.

Compare this to a C-style string, which would look like this in memory:

    +---+---+---+---+
    | a | b | c | 0 |
    +---+---+---+---+

This "packed" representation requires no long-distance dereferencing, the memory is contiguous so it will make good use of the cache, and we can process the characters directly without having to chase pointers.

That article describes the ByteString type as a "list of Word8 objects", but that's a bit misleading. Firstly we often say "list" to mean a chain-of-pointers like the first diagram above, e.g. that's exactly what Haskell's built-in list type is, and Haskell's String is such a list. ByteStrings store their Word8 objects more like the second diagram, so it would be better to call them an "array of Word8 objects". Secondly, ByteStrings use a clever three-layered structure to speed up common operations:

- The raw Word8 characters are stored in arrays, like the C-style string above (but without the NUL byte at the end)

- These arrays are pointed to by values called "chunks". These are "fat pointers", i.e. they also store a length and offset value.

- A ByteString itself is a list of chunks (like the first String diagram, but instead of values like 'a', 'b' and 'c' the values are chunks).

This makes ByteString really fast, for three reasons:

- Some operations can be performed by just fiddling at the list-of-chunks level. For example, to append X and Y, the result is just a list of X's chunks followed by Y's chunks; no need to touch the underlying chunks or arrays.

- Some operations can be performed by just fiddling the length and/or offset of a chunk. For example, incrementing a chunk's offset will chop characters off the start (they're still stored in memory, but they will be ignored); decrementing a chunk's length will chop characters off the end.

- The same underlying Word8 arrays can be pointed to by many different chunks in many different ByteStrings; i.e. we rarely need to copy the actual characters around.

As an example, let's say we read the ByteString "[\"foo\", \"bar\"]" (without the backslashes), we parse it as JSON to get a list of ByteStrings ["foo", "bar"], then we concatenate that list to get the single ByteString "foobar", here's how it might look in memory:

                         BS--+---+   BS--+---+
    "foobar":            | * | *---->| * | *---->[]
                         +-|-+---+   +-|-+---+
                           |           |
                      +----+           +----------------+
                      |                                 |
                      |  List+---+      List+---+       |
    ["foo", "bar"]:   |  | * | *------->| * | *---->[]  |
                      |  +-|-+---+      +-|-+---+       |
                      |    |              |             |
                      |    |              |             |
                      |    |              |             |
                      |    V              V             |
                      | BS--+---+       BS--+---+       |
    "foo" and "bar":  | | * | *---->[]  | * | *---->[]  |
                      | +-|-+---+       +-|-+---+       |
                      |   |               |             |
                      |   |               +---+         |
                      |   |                   |         |
                      V   V                   V         V
                     Chunk---+---+           Chunk---+---+
    (chunks)         | 3 | 2 | * |           | 3 | 9 | * |
                     +---+---+-|-+           +---+---+-|-+
                               |                       |
           BS--+---+           |                       |
    Input: | * | *---->[]      |                       |
           +-|-+---+           |                       |
             |                 |                       |
             V                 |                       |
             Chunk+---+---+    |                       |
    (chunk)  | 14 | 0 | * |    |                       |
             +----+---+-|-+    |                       |
                        |      |                       |
                        |      |                       |
                        |      |                       |
                        V      V                       V
            Array---+---+---+---+---+---+---+---+---+---+---+---+---+
    (array) | [ | " | f | o | o | " | , |   | " | b | a | r | " | ] |
            +---+---+---+---+---+---+---+---+---+---+---+---+---+---+

(Note that the exact position where arrows "land" doesn't matter; they're pointing to the entire datastructure they "hit")

Here we can see that there's only one copy of the underlying text; that the substrings 'foo' and 'bar' are simply chunks with length 3, offset by appropriate amounts; and that the resulting 'foobar' ByteString is just a list of these two chunks.

This approach of "store things once, then do all processing with indices" can be very fast (even faster than C in some cases https://chrisdone.com/posts/fast-haskell-c-parsing-xml ). Whilst we can obviously write this sort of algorithm in any language, re-using arrays and chunks like this relies on them being immutable, which is more idiomatic in Haskell. In particular:

- Languages which tend to use mutation aren't well suited to this, since mutating one value can have unforseen consequences to those which are re-using the same components. Copying is safer and more predictable in the face of mutation.

- Languages which favour some built-in interface rather than high-level functions may need to copy values in order to comply with these interfaces. In particular, C's array syntax will work for arrays and chunks, but not for the overall ByteString structure (a list of chunks).

- If we want NUL-terminated arrays, we'll need to make copies when truncating strings, to avoid the NUL byte overwriting the truncated part.

- Re-using values can make it hard to keep track of which parts can be freed. Garbage collection (and other approaches, like linear types, borrow checking, etc.) can make this easier.

The difference between lazy and strict ByteStrings is just whether the overall list-of-chunks is lazy (chunks generated on-demand, useful for streaming) or strict (chunks are generated up-front, closer to using one big array, hence potentially faster and more predictable).

The Text type is just a ByteString paired with a particular encoding method, e.g. 'UTF-8'.

The article you link also talks about fusion, claiming that's why Text (and hence ByteString) is faster, or avoids intermediate allocations. Fusion is great, but care must be taken if we're going to rely on it; e.g. very minor changes to an expression (say, to make debugging easier) can stop things from fusing, greatly changing the compiled code's speed and memory usage.

On the subject of fusion, it's worth noting that Haskell's built-in list type is also subject to fusion, often resulting in zero allocation (e.g. generating a list of characters, then counting them, might compile down to a single loop with a single counter variable, and no Chars/Lists/Strings in sight!). Again, it can be tricky to ensure that this happens. One approach is to test for it with something like https://hackage.haskell.org/package/inspection-testing

klipt · on June 8, 2020

Sounds a bit like the Rope data structure from 1995: https://en.wikipedia.org/wiki/Rope_(data_structure)

Although instead of a list of chunks, a rope is an immutable binary tree of chunks, so more editing operations are efficient, eg insertion or deletion in the middle of the rope.

Abseil library contains an open source C++ implementation of a similar structure called a Cord: https://github.com/abseil/abseil-cpp/blob/master/absl/string...

hajile · on June 9, 2020

I believe both Firefox and Chrome now use Ropes as their string implementation due to those rather desirable editing operations.

boygobbo · on June 8, 2020

Does this approach avoid the issue of substrings preventing the strings they refer to from being garbage collected (e.g. as in Java's substring() before JDK 7)?

chriswarbo · on June 8, 2020

AFAIK ByteString suffers this problem: substrings will prevent their whole array from being GCed.

By default, data read from files, stdin, etc. is split into 64k chunks (e.g. see functions like 'hGetN' in http://hackage.haskell.org/package/bytestring-0.9.2.1/docs/s... ), so only the 64k chunks containing the parts we want will be kept.

There is a function 'copy' which will explicitly return a copy of only the character range that's used, so the original can be GCed ( https://hackage.haskell.org/package/bytestring-0.10.10.0/doc... )

boygobbo · on June 8, 2020

Thanks for the clarification. BTW I agree with tome - your comment above would make a great blog post.

chriswarbo · on June 8, 2020

It's now at http://chriswarbo.net/blog/2020-06-08-haskell_strings.html

hajile · on June 9, 2020

> In fact, the main reason Strings are slow is that their linked-list structure has a lot of overhead, and may be scattered around in memory.

This is almost entirely a myth in the context of Garbage Collected languages. Most popular GC'd languages use a copying and compacting garbage collector. They should follow the pointers when copying and put the list together in one place. Furthermore, if the compiler were to mark the list as contiguous memory region of the same type (data/data pointer and one pointer to the next link), it could jump directly to the Nth element with the same performance as a real array (assuming you do bounds checking). The only significant difference is when you get long lists and due to the doubled size, you can't fit them in a cache line, but that's not a huge issue for most lists or arrays. If cache lines are actually that much of an issue for most programs, then b-trees are a much, much larger issue as they tend to hold more data for longer and necessarily can't be lined up in access order.

I believe another factor is too much dynamic experience. With a dynamic language like lisp, list items generally won't be carrying their own types. Instead, the list will be pointers to a boxed value where the box contains type info. Implementing in a language like Haskell with the requirement of static types should (in theory anyway) decrease the need for unboxing and radically improve performance (I believe common lisp type hints also reduce the need to box everything).

It is also possible to have a linked list pattern, but not a linked list implementation. Javascript is a good example here. You can add/remove at both ends and even splice to add/remove in the middle too. For years, everyone implemented all the lists as linked lists. Today, they mostly use arrays with fill pointers. Add/removing at the beginning and middle takes a performance hit, but the much more common pushing to the end does not (until you exceed the array size). You can add a start pointer offset to an array and decrease penalties for pushing the beginning too.

chriswarbo · on June 12, 2020

You're right that there are higher-order effects at play, but I was careful to hedge my bets (e.g. the sentence you quoted says "may be").

Besides a GC re-arranging/compacting things, the elephant in the room for Haskell is laziness, which introduces another level of indirection. I tactfully avoided this by limiting what I said to "fully evaluated" strings.

Lazy values start out as "thunks" (effectively zero-argument closures); if their value is needed (e.g. if we're branching on it), the thunk is evaluated to "weak head normal form" (i.e. until we reach a constructor we can branch on); the thunk in memory is replaced by this evaluated version, so it won't need to be re-evaluated if it's accessed again.

There are a few ways this can complicate things: thunks can prevent values from being garbage collected; layers upon layers of thunks can cause "space leaks" (where the order we evaluate things is correct, but inefficient; not to be confused with "memory leaks", which is where incorrect memory usage prevents it ever being freed); evaluating in order of demand can jump around inside different values, causing their allocations to be interleaved, and hence fragmenting their memory usage (until a compacting GC is able to defragment them).

As an extra complication, Haskell compilers like GHC will perform "strictness analysis" to find values whose evaluation is inevitable, and hence can be evaluated immediately rather than making a thunk. Strictness annotations can also be added to the code.

As far as types and boxing goes, GHC erases types at compile time, so values aren't technically boxed. However, most values will have several levels of nested structure, due to the way their types are defined, which is amounts to the same thing. For example, in GHC the default 'Int' type is a wrapper around a machine integer, whose type is 'Int#', e.g.

    data Int = I# Int#

In other words an 'Int' value like '42' is not a boxed pair like '(IntType, 42)', like in Lisp or Python; instead it's just a bare, raw 'Int' value, but that value will be still involve a wrapper pointing at a machine word, like 'I# 42#'. For this reason, the #-sufficed types are known as "unboxed".

We can write our code in terms of unboxed types but, since they're always strict, it's less flexible than using the normal "boxed" types (e.g. many common library functions won't work for boxed types). Hence we usually rely on the compiler to unbox values for us.

In practice, the biggest effect on space and time usage will be whether or not fusion is triggered. After that, I'd guess it's the difference between ByteString's fiddling-with-indices/chunks versus String's single-character-at-a-time rearrangement of pointers (even if they're reasonably contiguous).

Regarding the pattern/implementation distinction, I also agree. I was originally going to say something about "view patterns", which we can use to pattern-match on ByteStrings as if they were linked-lists. AFAIK Clojure does this too, using arrays of a reasonable length to implement its cons-based lists.

One tricky aspect of that is how we treat polymorphism. One of the only redeeming features of Haskell's default String type is that we can use it with any generic list functions we like; if we want to change the underlying representation of String, we would need to change the way all of our linked lists work, which might be less ideal for large element types. There are ways to avoid this, but they introduce complications; e.g. writing against "any Traversable, Foldable Functor", rather than just lists. That's more generic, but can also be more confusing (especially for newcomers)!

jimbokun · on June 8, 2020

This is a very nice write up, maybe worthy of becoming a blog post!

tome · on June 8, 2020

Great write up! It deserves to be put in a blog post of its own!

chriswarbo · on June 8, 2020

Heh, I tend to use HN comments as inspiration for writing blog posts. This one's now at http://chriswarbo.net/blog/2020-06-08-haskell_strings.html

pdimitar · on June 8, 2020

Sure, same as Erlang has "binaries" (which can be C strings or literally any binary data as you said) and Elixir built on top of that to give us UTF-8 strings. But, you know, only these two. No more.

I quite dislike when languages start choking me with analysis paralysis. There have to exist sensible defaults!

ghostwriter · on June 8, 2020

Erlang is not lazy by default. If it were you'd like to have at least two more types being generally available in its standard library.

_fq4v · on June 8, 2020

> (several types of them).

I find this criticism very bizarre. C++ and C have several different string types, as does python. This comes down to the fact that what programmers think are strings are often not really and vice versa.

To be clear, a 'string' is a sequence of characters or glyphs meant for human consumption or that are produced by a human as input. A string is not a sequence of bytes. C++ and C get this really wrong, and we pretend char* or std::string is actually a string, but they're not -- they're sequences of bytes. The best string in C is wchar* and the best string in C++ is std::wstring or even better a string from the ICU library. std::string completely glosses over the various encodings of real strings.

Python gets the string situation much better. str is for strings, and bytes is for bytes. str can contain actual human characters, and bytes can't (unless you encode a str to bytes properly).

From this perspective, Haskell really has only one string type: Text. ByteString is actually just a sequence of bytes, and it's an accident of history that Roman characters fit into bytes.

String is an iterator over human readable characters, akin to str in python or std::wstring::iterator in C++.

pdimitar · on June 8, 2020

You might find this a bizarre criticism but In Java, Ruby, Elixir, Rust strings are all Unicode with escape hatches for manually handling byte arrays (that could be C strings).

In that regard I still find it extremely puzzling that some languages / runtimes still struggle with only giving you two choices: byte arrays and UTF-8 strings. I really don't know what's so hard about it. As mentioned in another comment in this [now pretty big] sub-thread, this is one of the things I found off-putting in OCaml which is otherwise an excellent language.

As for Haskell, it has been about 2 years since I last looked at it. I might be spoiled by the languages I enumerated but I just found the docs [at the time] lacked proper nuance to give you a way to immediately make a choice. Dunno. It's not only the strings though; Haskell's community focuses so much on certain aspects that basic ergonomics get neglected.

I am willing to retry getting into it one day. Just shared my general impressions which are very likely outdated.

_fq4v · on June 8, 2020

> You might find this a bizarre criticism but In Java, Ruby, Elixir, Rust strings are all Unicode with escape hatches for manually handling byte arrays (that could be C strings).

In Java, there is a 'String' class that is used for strings most of the time, but there is also char[], which would be confusing to many C programmers. It's only familiarity and ubiquity that make this choice 'obvious' for most devs

> In that regard I still find it extremely puzzling that some languages / runtimes still struggle with only giving you two choices: byte arrays and UTF-8 strings.

To be clear, Haskell's strings and Text are not UTF-8. Thy are containers that can hold any unicode character, but the particular way in which they're stored is not really guaranteed. I believe Text is UTF-16 internally, and String, being based on Char, uses full UTF-32.

EdwardDiego · on June 8, 2020

> In Java, there is a 'String' class that is used for strings most of the time

Nearly all the time. Only time you're likely to see a char[] is in high performance code needs to mutate strings in a space efficient manner.

I would also mention CharSequence, an interface implemented by several string like classes.

lmm · on June 9, 2020

> You might find this a bizarre criticism but In Java, Ruby, Elixir, Rust strings are all Unicode with escape hatches for manually handling byte arrays (that could be C strings).

Java's strings have noticeable misbehaviour for astral characters, and since the language is built around a single hardcoded string type, almost all Java programs inherit this. Ruby had to have a major migration to fix its string support. Rust is much, much younger than Haskell (so it hasn't had to go through a migration yet) and is famous for having at least 6 different string types.

tome · on June 9, 2020

Astral characters?

lmm · on June 9, 2020

Unicode characters that lie outside the Basic Multilingual Plane.

tome · on June 8, 2020

Another great comment on this thread which could be a blog post of its own :)

morty_s · on June 8, 2020

> complexity in tooling and unnecessary tension when working with libraries

Yep. I’m currently feeling this. I’m thankful I’m not a complete beginner and thankful for the Haskellers that blog their opinionated experiences. But this is such a roundabout way of learning.

It’s also almost unfair to compare things to Rust’s Cargo (and Elixir’s Mix). They set the bar pretty high IMO.

Haskell would be more widely used if the ecosystem was more cohesive (perhaps a bit more centralized).

I’ve been enjoying Haskell enough that I’d like to help improve things.

tome · on June 8, 2020

> Haskell would be more widely used if the ecosystem was more cohesive (perhaps a bit more centralized).

> I’ve been enjoying Haskell enough that I’d like to help improve things.

I am hatching an idea along these lines. How do I get in touch with you? Alternatively can email me at the address here if you like: http://web.jaguarpaw.co.uk/~tom/contact/

hopia · on June 9, 2020

Gonna write anything publicly about whatever you're working on? If I'm correct, you've already made Opaleye which is really great, I'd be interested to hear if you got a new Haskell project coming up.

tome · on June 9, 2020

Thanks, glad you like Opaleye! I would like to start a project to improve programmer experience in Haskell. It's not ready to go public yet but if you want to know more then feel free to email me: http://web.jaguarpaw.co.uk/~tom/contact

rednum · on June 8, 2020

As someone who stopped following Haskell world few years ago, this was quite interesting read. I wonder what is the state of ecosystem (libraries) now: a few years ago one of my friends complained that for most basic tasks there is some library but usually half-working and abandoned.

However: I think the point about monads is fundamentally misguided. Monad is an abstract concept and trying to explain it in non-technical way (as linked from the post) will always fail. It makes as much sense as explaining time signatures in non-musical terms, or a Banach space in non-mathematical terms. I've seen this sentiment recurring every now and then; it feels like some people really want to avoid formal definitions for the sake of it, even when formal definitions make things easier. When it comes to monads, I think you can't get any better than the semi-formal definition (typeclass with return and join) plus some motivating examples. For me looking at Reader/Writer/State helped get this; I agree with OP that List is kinda weird.

mcguire · on June 8, 2020

Trying to explain monads is misguided. The formal definitions do not help practical understanding (at least without a strong background in category theory, maybe?).

It's better to simply use them until you get the idea: the IO monad is the same sequencing operation as in any traditional programming language, state monads are slightly weaker versions of the same, maybe and error monads are short-circuiting evaluation, and list is much less weird once you get that (and see the "Turn your failures into a list of successes" paper), and by that time you're probably deep into monad transformer territory.

Once you get comfortable with those, you're good to go. Unless the Haskellers have created a bunch of new nifty abstractions to learn.

hopia · on June 8, 2020

My experience is that the most basic tasks have a very mature library base in my context (web domain). Such as typed web endpoints, web servers and clients, JSON parsing and working with databases. I expect some domains are even far more mature with Haskell.

However, when you explore more specific things like shipping emails via a 3rd party service, various authentication protocols, monitoring via external services, spinning up persistent job queues etc. you start running into odd problems that you need to debug and even possibly fix at library level. That may considerably slow down development.

SeanDav · on June 8, 2020

You seem to be disagreeing with the OP about how to explain Monads, yet the link provided by the OP is referred to as a "non-explanation".

It is true that Monads are a pain point in Haskell. They are extremely powerful but not easy to grok, even by some very smart programmers.

Quote from main article about Monads:

> "Monads.....Despite many attempts though, I don’t think we have yet found the best way of explaining what they actually are. "

and

> " The Wikipedia article has quite a good example of non-explanation."

rednum · on June 8, 2020

To rephrase what I mean: I disagree with the idea that we can explain all technical concepts in non-technical way, which OP seems to support. I don't think we can get much better with teaching monads than providing actual formal definition plus motivating examples. I simply believe monad is too abstract and we can't have a nice analogy similar to the one OP provides ("Functors can be broadly explained in terms of containers..."). In my opinion the "platonic ideal of a monad tutorial" is more or less the (semi) formal definition of monad + some motivating examples. This is also something that cannot be understood if you don't have proper background (which is understanding simpler typeclasses like monoid and functor).

I also would not call monads a pain point to be fair.

mumblemumble · on June 8, 2020

I would turn that on its head: Provide several motivating examples, together with concrete solutions. Once you've provided a couple of those, then you can call attention to the commonalities that unite all those solutions. At which point, the abstract explanation is almost trivial.

For example: My understanding of monads ultimately came together rather quickly and painlessly. After I had initially just given up on the whole thing. And then I watched Scott Wlaschin's Railway Oriented Programming talk (https://fsharpforfunandprofit.com/rop/). And then I learned what's really going on with LINQ's query DSL and SelectMany. And had someone show me how to easily compose Optional types without having to forever be explicitly interrogating whether they have a value or not. With all that under my belt, the pattern suddenly became rather obvious.

What did not help me in any way was any article that explicitly sets out to try and explain monads. They are all, as far as I'm aware, guilty of tring to sow the seeds before tilling the earth.

lkuty · on June 8, 2020

It was quite similar for me. Scott Wlaschin's Railway Oriented Programming and its other articles on its site (and in particular https://fsharpforfunandprofit.com/series/map-and-bind-and-ap...) helped me greatly too. I liked its idea of an "elevated world". Another helpful source was/is https://wiki.haskell.org/All_About_Monads. But I thought a lot about them for the past few years. I tried also to make a detour through category theory but it did not help except for getting a lot of books in my library.

Another helpful article was Philip Wadler's "Monads for Functional Programming" (http://homepages.inf.ed.ac.uk/wadler/papers/marktoberdorf/ba...).

jackcviers3 · on June 8, 2020

A monad is an abstract interface for sequencing side effecting operations in a way that guarantees referential transparency. They do this by hiding the extraction of the value contained in the monad and passing that value to a user supplied lambda that takes a value of the type of the value contained in the monad and returning a new monad containing a different type. Thus, you cannot do step one after step two, because step one must extract the value from its containing monad.

In a modern OOP language, like Java, this is analogous to using for each in to build a new list from a given input list. Due to the abstraction, a monad can do this without exposing the internal loop state to the outside world, a single bind expression safe to refactor, whereas moving the for loop requires moving two statements in order to be safe to move.

gavinray · on June 8, 2020

What is "referential transparency"?

> "by hiding the extraction of the value contained in the monad and passing that value to a user supplied lambda that takes a value of the type of the value contained in the monad and returning a new monad containing a different type"

I'm not sure I follow. Is this like saying I have a function:

    let myFunc = (arg1: SomeType): NewType => {}

And it takes SomeType, and returns NewType? What does it mean to "hide the extraction of the value"?

gmfawcett · on June 8, 2020

I wrote a Reddit post a while back that might help. (The parent comment has since been deleted -- the poster was asking whether using monads in Haskell was kind of like "dependency injection", i.e., passing in all the information that a function might need, as arguments to the function.)

https://www.reddit.com/r/programming/comments/593ud7/a_taste...

jackcviers3 · on June 8, 2020

The reply by louthy is great, but it doesn't answer your first question:

> What is referential transparency?

A function reference that can be replaced at all call sites in a program with its body with all of the references to its arguments replaced with references to the call site arguments without the observable behavior of the program changing is referentially transparent.

In practice, it means that you can safely refactor lines of code into a function reference without worrying about breaking your program.

It is trivial to ensure that a function that does no side effects, like `addTwoNumbers(a, b)` is referentially transperent, and we do it all the time, and we have very little trouble reasoning about what functions like that will behave like at runtime, so we do it as much as possible anyway. We call it things like "single responsibility principle" and KISS, and it leads to more maintainable programs.

However, it isn't as easy to do with a function like `printStringLine(aMessage)` or `divideTwoNumbers(a, b)`. Obviously, if you move your print statements and the state of a program changes, you'll get different output depending on where you moved them to. And the behavior of divideTwoNumbers when the dividend is 0 is problematic. As is making a network call --> depending on when it is called, you might get a different answer. All of those things are things the programmer has little to no control over.

The monad interface provides a way to make those things referentially transparent, often storing the transformations made in a sequence of bind calls in a data structure, then interpreting that data structure via a method called `unsafeRun`. Since the return of bind is now a data structure containing the transformations to be applied, it is possible to test that the network calls, console output, and error raising and throwing logic outputs in the correct order by substituting a "logging" implementation of the monad in tests rather than an "executing" monad in tests. Then you can test the "executing" monad by sequencing two operations that produce output and checking that that output matches your expectations, rather than checking that all of the network calls in your application actually work at test time.

Additionally, since we can usually unit test our non-side-effecting functions easily, using a monad method is a type-safe, code-level indicator in code that "trouble out of our control at runtime may happen HERE." So if you have trouble, and you are making a network call in your bind/flatMap, you know where to look: in your NetworkCallMonad implementation.

Referential transparency allows us to reason about a program's behavior by just looking at the body of the function, rather than anything else around it. It's the ultimate form of encapsulation, and has the same benefits. So, now that we have a way to define referentially transparent side effects that encapsulate problems that can only happen when the stars align incorrectly, we can reason locally and test locally around all the things in the program, which means the program is simpler to reason about.

There are some consequences, though. ANY 2 Monads won't compose. List<Option<Int>>.bind((i) => i + 1) won't compile, or work. You need a MonadListTransformer<Option, Int> that knows which order the monads are composed (Does Option contain lists, or does the list contain options?) in order to do the extraction and flattening in the correct order. Of course, this adds runtime execution overhead, and in the case of two nested monads is not too difficult to acheive.

But often, when you encode a program's effects into monads, you end up with monstrosities: <List<NetworkCall<Option<Result<JsonObject>>>>>. That can get to be a mouthful, and our simple little bind definition is now several layers deep. All that means is that you need to add some methods to your Monad interface, and give them new names, so that you have a type that is a Monad and a IO and a Monoid and a ApplicativeError. Those may sound like gobbledygook names, but they are the names chosen by mathemeticians and the FP community, so to Google them you have to use the names.

Monads, and `TypeClasses` are not the only way to achieve RT programs, but they are pretty widespread, well-defined, well-tested, and well-documented interfaces that will work. In languages without higher-kinded-generics (Generics that can hold other generics), you can simulate them in any language with generics using `Box<Repr>` and `Unbox<Repr>` -- typescript does this in its fp lib, for example -- so that you can still get the code reuse out of defining simple instances that can be derived from things higher in the dependency tree.

Anyway, they have value for simplifying programs, but all of their value is derived by maintaining referential transparency throughout the entire codebase as much as possible, and by staying within a context unless there is a safe way to exit the monadic context (This is called a `Comonad` -- at allows you to get a value out of a Monad without risking behavioral changes in your application, and not all `Monads` have a corresponding `Comonad`) until the very last line of your main file. Organizing many custom monadic effects is a common concern, and a really good solution for that is to use a single monadic type that can handle all of your effect needs -- like the IO monad in haskell -- and using the good `ol interpreter pattern to implement methods that delegate to the one base monad to do their effectful work. This is called tagless final style.

One consequence is that you do a lot of wrapping and unwrapping in your code with monads. It gets tedious to call the same constructor over and over again. Haskell, and other languages with extension methods, make this easier by automatically lifting call sites into the currently used Monad context type implicitly. Otherwise, you have to call new Monad(new Monad(new MyNetworkCall("myApiAddress")).map((networkCaller) => networkCaller.callGet(1))).flatMap((result) => new MyNetworkCall("myOtherApiAddress").map((netWorkCaller) => networkCaller.post(result.user)))

a lot, which is safe, but tedious to write and read. If you have to do this because of language limitations, refactor and extract as much as possible so that your code is readable AND safe.

Obviously, if your language or a library in your language doesn't define the standard typeclasses, of which Monad is only one interface, defining them and using them can be a real pain. Using them is kind of like using any framework --> trivial programs that are small should not use monads unless screwing up at runtime is VERY expensive. A lot of business applications are neither small nor trivial, and screwing up can have dire consequences, so a lot of programs can benefit from this level of encapsulation. It's just another tool, a proven (in the mathematical sense) tool, in your toolbox that can help code quality. Referential transparency isn't a magic silver bullet or a panacea. You still have to think.

> I'm not sure I follow. Is this like saying I have a function: > let myFunc = (arg1: SomeType): NewType => {}

Not quite.

    let myFunc = <A,B>(arg1: SomeType<A>): SomeType<B> => {}

The `SomeType` wrapping the `A` is returned as a new instance of `SomeType` that now contains a `B` and not an `A`. The argument `arg1` is not mutated. The `A` it holds is extracted, transformed through user code into a `B`, and then placed back within a new instance of `SomeType`.

This obviously happens a lot in any language with generics. Like, all the time. And, every generic will implement the way it does the extraction differently, but the transformation always gets applied to the extracted value(s) in the same way, by calling the user function on the extracted value(s), then ensuring that if multiple instances are created (because multiple values were extracted) that only one instance of the generic that contains all the transformed values is returned.

Here's a simple example of it in use:

    let getCharsFromStrings(strings: List<String>): List<Char> = bind(strings)((s: String) => s.map((c:Char) => c)))
    getCharsFromStrings(List("one","two")) // outputs List<Char>('o','n','e','t','w','o')

Since each string is broken into a list of its characters, you might have expected the return to be a list of two lists of characters. A Monad does the flattening for you. In fact, in some languages it is called `flatMap`, because it applies you function to each element that is extracted and flattens the nested structure by one level, turning your list of lists of chars into a list of chars via concatenation.

gavinray · on June 8, 2020

Monad is really just "flatMap"?

Is this basically it then:

    interface Monad  {
      flatMap: <A,B>(arg: Some<A>) => Maybe<B>
    }


    let userAges: Monad.flatMap = flatMap<User, number>(users, user => user.age)

lmm · on June 9, 2020

That signature looks a little confused - you're saying "Monad" but you're talking about the specific case of Some and Maybe. The interface looks something like:

    interface Monad<F<?>> {
      fun flatMap<A, B>(argument: F<A>, function: A -> F<B>): F<B>
      // also need pure/point here but ignore that for now
    }

and then a specific implementation for e.g. Option looks like

    singleton OptionMonad implements Monad<Option>
      override fun flatMap<A, B>(argument: Option<A>, function: A -> Option<B>): Option<B> =
        when(argument) {
          is Some -> function(argument.value)
          is None -> None
        }

whereas you have implementations for other types in terms of those specific types too:

    singleton FutureMonad implements Monad<Future> {
      override fun flatMap<A, B>(argument: Future<A>, function: A -> Future<B>): Future<B> = ...
    }

The tricky part is that the type parameter F is itself a parameterised type, so it's a slightly higher level of abstraction than most interfaces (and one that most languages don't support).

jackcviers3 · on June 8, 2020

Yep. You got it. Monads are compared with value equality, not reference equality below. The flatMap behavior has to obey some laws:

The Associative Law states that flatMapping two functions in sequential order is the same as flatMapping over a composition of those two functions:

Monad(fa).flatMap(f).flatMap(g) valueEquals Monad(fa).flatMap(g(f))

It also has a method called return/point/pure that comes from the inherited Applicative interface:

   interface Applicative<F<?>> {
     pure: <A>(arg: A): F<A>
   }

That return/pure/point followed by flatMap has to obey the Left Identity law in a monad for it to be a valid monad:

  Monad(fa).pure(x).flatMap(f) valueEquals f(x)

The Right Identity law states that if we have a Monadic value and flatMap over the point/pure/return method we get a new monad that has value equal to the original monad:

m = Monad(fa) Monad(fa).flatMap( (v) => Monad(fa).pure(v)) valueEquals m

Those are your unit tests that you have to perform on your monad instances. If they pass, your implementation is correct, and you can rely upon it.

These laws are so that the monad interface can be composed/extended correctly with other referentially transparent interfaces, like Functor -provides map, the map must be associative, like flatMap); Apply extends Functor - provides ap, or applyParallel that allows parallel mapping over collections that can be safely parallelized. ap has to be consistent with flatMap when an implementation has a monad instance -- that is if you map over a list in parallel, the list you get out should be the same as if you flatMapped over that same list using the same function, but wrapping List() around the result, which is sequential; Applicative extends Apply - provides pure/point/return, which just constructs instances that are Applicatives from a given value; Semigroup - provides combine/concat/++, allowing two items of the same type to be combined into a new value of that type; There may be many instances for a type of semigroup (Boolean has && and || for example); Monoid extends Semigroup - provides empty, which is a value for a type when combined with another value for that type results in the other value (0 in integer addition, for example, an empty list in list concatenation, for another); traverse - provides sequence, which flips the types of wrapped values (Maybe<List<T>> becomes List<Maybe<T>> when sequence is called), traverse<G<?>,F<?>, A, B>(ga: G<A>)(f: A => F<B>): F<G<B>> -- used to define sequence, foldRight, foldMap, foldLeft, etc; And ApplicativeError - provides raiseError<F<?>, A,ERRORTYPE>(t: ERRORTYPE): F<A> and handleError<F<?>, A>(fa: F<A>)(f: ERRORTYPE => A) and catchNonFatal<F<?>, ERRORTYPE, A>(f: () => A):F<A> -- obviously these only work if your type has two members, one for success, and one for failure.

louthy · on June 8, 2020

> I'm not sure I follow

Imagine you have a context (the monad), which contains a value of type `a`...

> "by hiding the extraction of the value contained in the monad...

Means, you don't need to care about how we get the value out of the context, it will just be done for you. That's part of the monad's job. Each different type of monad may do it differently (a List will do it many times, an Option will do it zero or one times, Async promises to give you a value, etc. ).

> "... and passing that value to a user supplied lambda that takes a value of the type of the value contained in the monad ..."

So, we have a function called `bind`, which is part of the interface for monads ... it takes the original context (monad) as well as a lambda as arguments.

Below is the signature: `m a` which is the original monad value, (a → m b) is the lambda, and `m b` is the return value, which is a monad with its contextual value type changed from `a` to `b`

    bind :: m a → (a → m b) → m b

Where you see `m a` and `m b`, they're generic types which are parameterisable on both the inner and outer type, the inner type `a` is fairly usual, but the outer-type `m` is a higher-kind and can also be parameterised, `m a` could become `List a` (which is like List<A> in C# or Java).

So, if you squint, you might be able to see that the `a` comes from within the `m a` context, gets passed to the lambda, which needs an `a` argument and returns an `m b`, then that `m b` is returned as the final result.

Each type of monad has its own implementation of `bind`. So, for a List the bind operation looks like this:

    bind :: List a → (a → List b) → List b

And so you should see that for Lists each `a` will be extracted from the first argument (so we'll have zero or many `a` values), each `a` is passed to the lambda which will return a `List b` (and so we have zero or many `List b` values), and those lists of `b` will be concatenated to give a flattened result of `List b`. And so, that's the behaviour if the List monad. Each monad will have its own rules, but must follow the signature of `bind`.

> "...and returning a new monad containing a different type..."

This just means the `a` turns to a `b` (but the `m` stays the same). Essentially it means you can do some kind of mapping of the contextual value(s) to return a new type. So, if you were to map Ints to Strings the bind signature would look like so:

    bind :: List Int → (Int → List String) → List String

In practice you can see how the lambdas returning `m b` allow for chaining of computations. The code below visits the items in two lists (listA and listB) and sums them. The lambdas close over each other to create a context for the `return` at the end of the process. Which is very close to how scoping works in imperative languages.

    bind listA (\a → 
        bind listB (\b → 
            return (a + b)))

The key to any monad is its bind function, that's where its behaviour is. But fundamentally, it's a contract, or interface which forces ordering of computation... the values returned by the lambda (which are monads themselves), clearly can't be evaluated until you evaluate the first argument to get the `a`.

And so, `m a` must run before `m b` ... combine that with `do` notation and you'll get something that looks and feels a lot like imperative code, but all of the side-effects are encapsulated and declared, and all of the logic is pure (referentially transparent).

The example above with `do` notation is:

    a ← listA
    b ← listB
    return (a + b)

Which is essentially equivalent to a nested for loop without the ceremony.

johnday · on June 8, 2020

This is a good explanation of how the Monad typeclass maps onto data functors, but I'm not sure that it translates well onto Control functors such as (-> r).

louthy · on June 8, 2020

I was mostly just trying to clarify and unpack the GP's quite terse statement. Definitely not trying to write a monad tutorial ;)

johnday · on June 8, 2020

Oh, sure! In that case, mission accomplished with flying colours :)

danielscrubs · on June 8, 2020

A sidenote: how great would it be if someday an abounded, well written, non buggy piece of code could just stand the test of time like math does?

Instead we throw projects and programming languages faster than I change my wardrobe. :)

abjKT26nO8 · on June 8, 2020

The maths of today has very little to do with the maths of 500 years ago, let alone 2500 years ago. Sure, we still have the Pythagorean theorem, but in today's modern formulation it's trivial. We still have numbers, but they are no longer something that can be totally ordered. With numbers we even turned things upside down, because we used to first try to define single numbers and only then collections of numbers. These days we say that "a number is an element of a number field", "a vector is an element of a vector space" etc. and we define the structures first. But the biggest departure probably is the widespread reliance on proof. In XVI century you could call yourself a mathematician without proving things, as long as the ideas you came up with worked in practice.

I'm sure that in the programming world we'll still have "if"s and variables/aliases 200 years from now. And that's right about how much commonalities modern mathematics has with the mathematics of the age of Euclid.

Disclaimer: I took a course in history of mathematics and I have to say, if I was born as little as 200 years earlier I wouldn't be able to stand mathematics. These days, it's actually interesting to me. Similarly to how mathematics in primary through high school is really boring.

danielscrubs · on June 9, 2020

I mean conditions exist in math already and I'd argue that there is a bigger chance qwerty is still the default keyboard layout than your Python 3 program working in 500 years.

How many of the popular languages even have formal specifications to use as a reference when the next CPU architecture comes along, or the next ABI?

The tools exist but I don't see people rushing to Modula-2 for example, which has.

It does remind me a bit of this paper "Boosting Vector Calculus with the Graphical Notation" [1], but we instead have over ten thousand only partially defined ones.

But I can agree that my view of math is a bit too rosy.

[1]: https://arxiv.org/abs/1911.00892

ansible · on June 8, 2020

That's what initially drew me to Haskell, and how it seemed closer to math than most other programming languages.

Lazy evaluation brings this to the fore, especially with needing the programmer to make decisions about the sequencing of operations.

Consider these two statements:

  x = a + 5;
  y = b + 3;

So these statements describe the relationship between 'a' and 'x', and 'b' and 'y'. Great. With most programming languages, the programmer also must choose the sequencing of these operations. When they should occur, and in what order. With Haskell, this goes away.

The compiler / runtime system decides when (or if) the values of 'x' and 'y' will be evaluated. With a really, really smart compiler / runtime system, this can lead to great optimizations a C programmer would need to implement by hand when porting code to a new platform with vastly different performance characteristics.

Now... in practice... I found Haskell hard to get comfortable with. I'm quite happy with Rust these days though.

ashtonkem · on June 8, 2020

I fully grok monads. I’ve done enough reading and usage of them to understand them.

They are not a useful abstraction for programming. They’re mathematically correct, but that’s not the same as what an industrial engineer needs.

The big warning sign is monad transformers. Alone, monads are totally fine, but the issue is that you rarely want one. So you end up with this unwieldy tower of transformers that would make an enterprise Java engineer worried.

lmm · on June 9, 2020

As an industrial engineer, monads are the best solution I've ever found to the cross-cutting concern problem. I need to be able to put custom cross-cutting concerns on my functions - things like must-happen-in-database-transaction, record-in-audit-log, must-have-this-authorisation-level, record-these-statistics. If your language doesn't have monads, you end up using reflective proxies, metaclasses, decorators, bytecode modification, macros, or something equally incomprehensible.

I'd love to have a better alternative. Maybe one day one of these "effect systems" will actually get a production-quality implementation. But until then monads are the least-bad option out of everything that I've seen tried.

johnday · on June 9, 2020

Absolutely. It's worth noting that every single effect system implementation that currently exists also leverages monads at the user level, even if they're doing exciting type-theoretic things underneath.

johnday · on June 8, 2020

I firmly believe that monads (and monad transformers) are exactly what an industrial engineer needs, for the following important reason.

A monad describes its scope in such a way that writing code outside of its scope is a compile time error. If your code needs certain capabilities, it must invoke the computational context of those capabilities (which is usually a monad in Haskell). If it doesn't, then the context can (and should) be omitted.

It's worth noting that monad transformers are themselves monads. So drawing a distinction between "monads" and "monad transformers" to say that one is good and one bad is not very meaningful. The composability of monads, as exemplified in MTL, Transformers and similar, is a positive sign of their power and not a red flag.

If your code ends up with an "unwieldy tower" of transformers then that's a strong indication that the principle of separation of concerns has not been adequately followed. The fact that Haskell makes that evident seems to me like a benefit.

Chris_Newton · on June 8, 2020

I’ve been thinking recently about how monads in functional programming are analogous in some sense to inheritance in class-based object oriented programming. Each abstracts a useful pattern and lets the programmer write a certain type of code quite elegantly. Each is also profoundly limited by being based on a single type(class), i.e., the monadic structure or the base class/interface.

If you want to use multiple instances of the same pattern in the same place, you have to start making design choices about how to relate them. For example, you might nest monad transformers in a certain order. You might define a class hierarchy.

Sometimes the correct design will be dictated by the context. However, sometimes it won’t, and then you’ll have to make what is essentially an arbitrary decision at the time, yet one that may be expensive to change or adapt to if it turns out to be inconvenient later.

And finally, as long as everything stays nice and “linear” nothing too bad seems to happen, but in realistic programs you might end up needing to combine multiple monadic effects or wanting your class to inherit from multiple base hierarchies. That is often when awkward edge cases start creeping into the design. Suddenly, instead of the elegant patterns from the glossy brochure, you start seeing ugly and brittle warts like explicit casts to resolve ambiguities in which virtual method should be called or manual lifting through multiple layers of monadic effects.

In each case, the response has been to move away from the original patterns towards something more flexible where the compromises are not as deep. In OOP, composition tends to be favoured over inheritance these days. In languages like Haskell, there is a lot of interest in effect systems that could offer a less rigid way to combine monadic effects than monad transformer stacks.

misja111 · on June 8, 2020

Even if you have separated concerns by having method 1 return Monad transformer A and method 2 return transformer B, you will still have to combine the results of both your methods at some point.

lmm · on June 9, 2020

Sure. But if you're doing it right, that combination happens in a place whose sole responsibility is doing that combination. The only case where you have to carefully interleave A and B is if your business logic really does involve carefully interleaving two disparate effects, and in that case the complexity is genuinely there in the domain and it's good to surface it explicitly (or else you're modelling it wrong).

mcguire · on June 8, 2020

It might be nice to be able to compose monads some other way than transformers, but I don't know what that would be. Transformers are unwieldy, yes, but they're also completely trustworthy. If you try to assemble them randomly, without thinking about it, yes, you will have a bad time, in much the same way that you will have a bad time if you try building any abstraction without thinking about it.

Yes, they can be a useful abstraction (just like anything else); I have the same sort of feeling about your response as I have about "checked exceptions are evil".

alephu5 · on June 9, 2020

This isn't true at all, promises in JavaScript are monadic, as are chains of conditional logic. Stuff like this is literally everywhere:

if a: b = f(a) if b: c = g(b) ... else: p()

Which as a monad would be something like

A >>= f >>= g >>= h >>= ...

I've built several classes with a covert "return" and "bind" operation to keep this kind of programming clean, I just don't tell anyone it's a monad.

Vosporos · on June 8, 2020

>They are not a useful abstraction for programming. They’re mathematically correct, but that’s not the same as what an industrial engineer needs.

Pray tell then, how do you sequentially compose functions operating on values wrapped in an ADT like Maybe or Either?

lambdasquirrel · on June 8, 2020

You don't. You just do it the dumb old simple way, with some boilerplate here, and some boilerplate there. It may not be elegant, but it's not a blocker. Haskell needs more than this, if it's to find wider adoption.

lmm · on June 9, 2020

Experience shows that it is a blocker. Programmers will resort to virtually anything to avoid that kind of boilerplate - early return, macros, exceptions as a language feature, reflective proxies.

ashtonkem · on June 8, 2020

One monad at a time is fine, but the problem starts when you need multiple. Monad transformers are completely unergonomic.

lambdasquirrel · on June 8, 2020

This is nice, and yet, not much different from what the Haskell community was dealing with 5 years ago. They are also mostly programming concerns, as opposed to engineering concerns.

How do you deal with simple things like exceptions and string interpolation? There are guides trying to explain monads, but no straightforward answer for dealing with these sorts of things, with building separate libraries / packages, setting up your own CI with your own stackage repo. And we haven’t gotten into production yet. The ecosystem doesn’t have these things built out, or, it’s just not documented/agreed upon so you’d effectively be starting from scratch.

While it might be nice to use Haskell for math-y things, in the end, you’d have to be able to instrument it, monitor/log it, etc. That doesn’t appear to be in place either. When I worked with Haskell last, these were all blockers.

kccqzy · on June 8, 2020

> How do you deal with simple things like exceptions and string interpolation?

While I agree string interpolation isn't convenient, exceptions in Haskell are great. You can throw exceptions in pure code (throw upon evaluation), in IO (throw when sequenced), catch them, inspect them by doing type down casting, etc, all without pulling a dependency. It's even better than Rust. (In Rust I had to use the anyhow crate to get back some basic features I thought should be present by default.)

st1ck · on June 9, 2020

Exceptions work rather similar to other languages, except that you can only catch them in IO code. So generally (IMO) it's better to reserve them for actually exceptional cases, and not throw them around like in Python.

String interpolation can be done with external libs using quasiquotations. It's not very pretty this way, so typically string concatenation or printf is used.

Of all the things, math-y is probably not the be best usecase for Haskell, when you can use Python, R, SageMath maybe, or Julia. Haskell is best for writing parsers, compilers, interpreters. Or if you need a rich static type system. If you throw in LiquidHaskell, you have pretty good tooling for writing mostly correct programs.

johnday · on June 8, 2020

As a PhD student, I don't have enough experience working with Haskell in an industrial context to comment on the engineering concerns. I believe you when you say that these things are all problems. However, I'm aware that at least some solutions are underway. For example, the shake library [1] is a nice way to set up competent, reproducible build systems.

[1] https://hackage.haskell.org/package/shake

spekcular · on June 8, 2020

This is a great post. However, there are so many more things that should be on here!

If I could suggest just one, it would be lazy evaluation by default, which makes it exceedingly hard to reason about time and space complexity.

Also relevant is the author's article on problems using Haskell on Arch Linux: https://dixonary.co.uk/cabal-2020

dunefox · on June 8, 2020

That's not a pain point, that's a core part of the language. If you don't want that use Ocaml.

lmm · on June 8, 2020

I want higher-kinded types (heck, I want kind polymorphism) though.

Hopefully Idris will become production-ready and I'll have the best of all worlds, but that's a way away.

dunefox · on June 8, 2020

Isn't there a compiler extension for hkt?

dan-robertson · on June 8, 2020

Not for ocaml

dunefox · on June 8, 2020

I meant for Haskell, did you mean for Ocaml originally?

dan-robertson · on June 8, 2020

If you look at the comment chain, you see:

- I don’t like lazy

- ocaml is like Haskell but strict

- I want higher kinded types

(note this is a feature you can turn on in Haskell but ocaml does not have it[1]. The commenter is implying that ocaml is therefore not a sufficient strict Haskell)

- you say that higher kinded types are an option, which is true for Haskell but not for ocaml which is being proposed.

[1] there are ways to sort-of do higher kinder types in ocaml but they are not ergonomic and feel a bit hacky to me.

saityi · on June 8, 2020

In case anyone is curious about [1], there is a paper called Lightweight higher-kinded polymorphism -- http://ocamllabs.io/higher/lightweight-higher-kinded-polymor... -- that goes into some of the possible encodings.

pdimitar · on June 8, 2020

Then maybe an argument can be made that this should be front-and-center in the docs? (Note that I am not claiming anything though, haven't checked Haskell's docs in a while.)

It's also a surprise to me that you seem to say that OCaml is Haskell without the lazy eval being the default, mind expanding on that? Quite interesting.

wk_end · on June 8, 2020

They're similar but not really the same. The big differences are:

* Haskell is pure whereas Ocaml is not

* Haskell has a great story for parallelism whereas Ocaml does not

* Haskell has ad-hoc polymorphism (via typeclasses) whereas Ocaml does not

* Ocaml has an exceptionally powerful module system whereas Haskell does not

chewxy · on June 8, 2020

I've heard it best as: both Ocaml and Haskell are pure functional languages. Haskell is Extra Virgin.

That said, the notion of purity is a bit BS. At some point in your program you are definitely going to invoke the world-at-large. Whether you do it through IO(), tap your nose and call it "pure" or you do it other way is all down to semantics (not in the PLT sense of the word - the PLT term to use would be "pragmatics"). The core of both Ocaml and Haskell are pure functions.

Also, Ocaml doesn't really need typeclasses. Modules. Modules are everything

sweeneyrod · on June 8, 2020

OCaml is a lot less pure in spirit as well. Not only can you read and write to the outside world without having to jump through hoops, you can also write all your code with pointers, for loops and (mutable) arrays if you so desire.

spekcular · on June 8, 2020

Note that multicore OCaml is almost ready: https://discuss.ocaml.org/t/multicore-ocaml-may-2020-update/...

vkazanov · on June 8, 2020

Oh-h-h-h... I've heard this so many times before :-)

yaantc · on June 8, 2020

This time is different ;) Look at the end of the GP link on multi-core Ocaml progress report: everything under the "OCaml" title is work on the upstream compiler. It's still limited, but the upstreaming process has started.

ghostwriter · on June 8, 2020

Having an SMP compiler/runtime is half of the story, the other half is safe parallelism. I expect another decade is going to be spent on chasing bugs due to mutable data escape hatches in OCaml world.

pdimitar · on June 8, 2020

This is my main worry about Multicore OCaml. You nailed it. There's a lot of legacy in the ecosystem -- which is not a bad thing at all! Many people appreciate stable environments and learn to circumvent their problems. All completely fine and we all do it.

But I worry that by chasing backwards compatibility and not just saying "guys, you need to modernise your libraries to do X and Y, otherwise you will never be Multicore OCaml-friendly" the maintainers will invite a lot of legacy cruft -- which will now be subtly broken -- that they will either have to fix themselves or just concede that it's impossible.

Then again, I suppose they don't want the Python 2 vs. Python 3 split (or Perl 5 vs. Perl 6 / Raiku; EDIT: It's actually "Raku" as a commenter below clarified). But I fear it will happen anyway due to the reasons above.

I very quickly fell in love with OCaml but eventually settled at Rust for these, and a few other reasons. It's an otherwise excellent language and has one of the most underrated compilers on the planet. It's a shame that it's moving so slowly... :(

EDIT: As for safe parallelism, here's what one of the maintainers said soon ago: https://news.ycombinator.com/item?id=22735176

It looks like they prefer to do the lowest level possible so people can build their own abstractions on top. Completely fair though, not criticising.

lizmat · on June 8, 2020

It's actually called Raku (https://raku.org using the #rakulang tag on social media), not Raiku.

pdimitar · on June 8, 2020

Whoops. I was sure I got it wrong. Thanks.

millimeterman · on June 8, 2020

Haskell certainly has a _better_ parallelism story than OCaml but I wouldn't call it great. It's really nothing special.

dunefox · on June 8, 2020

From what I've seen of Ocaml it's similar but with strict evaluation. Then again, it has no support for unicode strings so I never really looked at it again.

But algebraic data types, exhaustive pattern matching, and modules make it the next best thing, I think.

pdimitar · on June 8, 2020

Yeah, one of the things I dislike in it as well. UTF-8 should be built-in, not relied upon by libraries, even if they are commonly used by everyone.

HelloNurse · on June 8, 2020

Lack of well-made strings puts the language as a whole in a bad light: if you don't bother supporting fundamental practical needs, asking me to use your experimental, half-engineered proof of concept is arrogant.

pdimitar · on June 8, 2020

Absolutely. I know many engineers hate the word but this is simply extremely bad marketing.

I get it, you want to work on interesting scientific problems and/or you work for Jane Street (a huge financial company); but the lack of desire to circle back to certain basics and nail them once and for all sends a hostile message to me as a programmer looking to add OCaml to his tool-belt. It tells me "we don't care".

dunefox · on June 8, 2020

I even spent two evenings getting the tools set up so that I can install a library that apparently supports unicode. I got nowhere. The core language might be interesting, but everything around it - including unicode strings - is a mess.

tome · on June 8, 2020

There are well-made string types for Haskell! People are just understandably confused that there are so many of them.

HelloNurse · on June 8, 2020

I'm sure that very smart people have risen to the challenge of supporting strings in Haskell, and it is therefore possible for a sufficiently motivated user to process text decently after all.

But the problem is that strings must be a built-in feature in any programming language that wants to be taken seriously (with an exemption for specialized ones that don't need strings, like GLSL). What would you think of a hotel that doesn't have mattresses but allows guests to bring their own?

tome · on June 8, 2020

Huh, the problem isn't that people have made their own. The string types are all in the standard "boot" packages. The problem is that there is one that is obsolete from another era (String) and four usable ones, each combination of strict/lazy and Unicode/Bytes.

Haskell gets twice as many as Python (say) because of the desire to have strict and lazy versions.

HelloNurse · on June 10, 2020

Languages are judged by their standard library, and standard Haskell strings are part of the problem and not part of the solution.

Multiple moderately bad string types, with strict typing that exacerbates interoperability problems, are worse than one really bad string type (like char pointers in C) or nothing at all.

JoeAltmaier · on June 10, 2020

Can it be worse than Microsoft's hatful of incompatible strings? You change the default string type, you have to change code. Gah.

TheOtherHobbes · on June 8, 2020

Recursive Big O that isn't defined until runtime is not usually considered a good thing.

spekcular · on June 8, 2020

IIRC, even Simon Peyton Jones has stated lazy evaluation was a design mistake.

runeks · on June 8, 2020

SPJ said laziness-by-default was a design mistake. I.e. he’d still want laziness in a future version of Haskell, but not turned on by default.

dunefox · on June 8, 2020

I'm gonna need a source for that.

spekcular · on June 8, 2020

The exact quote and source (powerpoint slides) are in this HN comment: https://news.ycombinator.com/item?id=1924061.

See also the insightful comment of plinkplonk on that thread.

mcguire · on June 8, 2020

That's the "Wearing the Hair Shirt" talk, for those that are familiar.

"Purity is more important than, and quite independent of, laziness

"The next ML will be pure, with effects only via monads. The next Haskell will be strict, but still pure.

"Still unclear exactly how to add laziness to a strict language. For example, do we want a type distinction between (say) a lazy Int and a strict Int?"

From plinkplonk:

"It is through insisting on laziness as default and solving some of the problems encountered (via monads for e.g) that he arrived at the position that "the next Haskell will be strict". He also notes some open problems with adding laziness into a statically typed strict language (slide 40 "Still unclear exactly how to add laziness to a strict language. For example, do we want a type distinction between (say) a lazy Int and a strict Int?")."

tome · on June 8, 2020

> The next ML will be pure, with effects only via monads

Which has already been proven wrong, by the way, by F#, which appeared in 2005 ("Wearing the hair shirt" was from 2003), so I think the comment is not to be taken too literally.

https://www.microsoft.com/en-us/research/publication/wearing...

mcguire · on June 9, 2020

Did Simon Peyton-Jones work on F#?

tome · on June 9, 2020

I don't think so, other than by being a colleague of the main designer.

dunefox · on June 8, 2020

That has to be the worst set of slides I have ever seen. Thanks.

danielscrubs · on June 8, 2020

Comic Sans is arguably an excellent font for learning. It's funny how the world works.

https://www.dictionary.com/e/comic-sans-font/

danielscrubs · on June 8, 2020

From what I understand of the talk:

Laziness enforced purity in the beginning. Now that we collectively know that reasoning about effects is a good thing we can remove laziness but keep that part.

I'd say it was a happy little mistake.

Gormisdomai · on June 8, 2020

Can't it be both?

dunefox · on June 8, 2020

Sure, but I believe that Haskell is great just because it tries stuff that isn't in other languages, like laziness by default. If it's really a pain point you can always switch to Ocaml or others.

logicchains · on June 8, 2020

Seems like an overreaction to suggest someone move to another language when there's a very simple solution to laziness in Haskell: the -XStrict language pragma.

Vosporos · on June 8, 2020

>If it's really a pain point you can always switch -XStrict on

FTFY

hopia · on June 8, 2020

Is that flag really ever smart to switch on? I've heard StrictData makes a lot more sense but turning Strict on prevents the compiler from making some quite crucial optimizations, resulting in degraded performance.

logicchains · on June 8, 2020

Which optimisations does it prevent? I've heard laziness prevents a lot of optimisations, because it makes bottom an instance of every type, and the compiler has to account for that.

Tarean · on June 8, 2020

There are two main parts to this:

- GHC can do less unpacking (moving heap allocations to registers)because of laziness

- laziness as an implementation detail is slow

To implement laziness, GHC puts a closure with it's environment on the heap. When the value is used for the first time, the closure is called. Then the result is put on the heap, the closure is replaced with an indirection to the result, and the program continues.

But this also has some advantages:

- For values that might be used zeor or multiple times, this often actually saves time on average over both strict or call-by-name evaluation

- This is faster than any equivalent code you can write by hand because compiler and runtime system heavily optimize around it

- This gives you mutation through the backdoor which allows asymptotic improvements over strict pure languages in some cases

But the main thing is that laziness allows runtime dead code elimination - if you have an unpacked vector of pairs and only use the left element of each pair, only those values will be allocated - still in a dense bytearray.

ghostwriter · on June 8, 2020

> If I could suggest just one, it would be lazy evaluation by default, which makes it exceedingly hard to reason about time and space complexity

In what scenarios does it make hard to reason about those complexities?

I find GHCs profiler statistics clear and meaningful[1], I lack the efficiency of its reports in other environments I work with, namely Python.

[1] https://downloads.haskell.org/~ghc/latest/docs/html/users_gu...

runeks · on June 8, 2020

> In what scenarios does it make hard to reason about those complexities?

I’m guessing OP is referring to reasoning about time/space usage by reading code, rather than by running the program.

ghostwriter · on June 8, 2020

This is not trivial in strict languages either, because implementation details of a particular language/library dictate the rules. For instance:

    l = list(range(N))
    for _ in range(M):
        x = list(l)

CPython will have MxN iterations (allocations?), whereas other languages/libraries where data is immutable may decide to optimise the list() constructor and return a reference to the same object when the input is an instance of a list.

In lazy-by-default languages I can at least rely on normal evaluation order.

foldr · on June 8, 2020

Of course you need to know the semantics of the programming language you're using. The point is that lazy evaluation introduces additional complexity on top of that.

ghostwriter · on June 8, 2020

the point is that it's not semantics of a language alone, it's the semantics of the language + the libraries + the data constructors + the flow order. Lazy evaluation doesn't introduce additional complexity on top of that for no good reason, it actually trades a few of these complexities for one additional abstraction that lets you think about your programs in terms of data flows and transformations (instead of allocation semantics at every individual step), where program boundaries define the actual computations to perform and the complexity of the final algorithm. It's an extremely powerful tool and the complexity is justified.

foldr · on June 8, 2020

The additional complexity may or may not be justified; the point is that it exists.

ghostwriter · on June 8, 2020

for some reason you specifically ignore that it's not a complexity on top, but rather a trade-in.

foldr · on June 8, 2020

That's irrelevant. OP simply said that lazy evaluation makes it harder to reason about time and space complexity. Everyone agrees with this, for goodness sake - even SPJ himself.

ghostwriter · on June 8, 2020

> Everyone agrees with this, for goodness sake - even SPJ himself

oh, then I definitely should stop thinking on my own about the things that I practice on a daily basis.

jose_zap · on June 8, 2020

It is definitely not harder to reason about time. I don’t think you will find many haskellers agreeing with this.

foldr · on June 9, 2020

Lazy evaluation makes it difficult to figure out when expressions are evaluated, which is relevant to both space and time complexity. To take an trivial example, whether or not your infinite list is (completely) evaluated could be the difference between your program running in constant time or not terminating at all!

Here is a good Stackoverflow response that expands on the point:

https://stackoverflow.com/a/12061524

I've never before seen the claim that lazy evaluation makes it harder to reason about space but not time complexity. It seems a very odd claim on the face of it, since the two are very closely interrelated. (E.g., if you are evaluating some prefix of an infinite list, then space and time are exactly correlated.)

But if that doesn't satisfy you, here is a paper that explains why time analysis is more difficult for lazy evaluation:

https://link.springer.com/content/pdf/10.1007%2F3-540-52592-...

("A major obstacle in the time-analysis of lazy languages is the problem of context sensitivity: the cost of evaluating an expression depends on the context in which it is used.")

jose_zap · on June 9, 2020

Yeah, I understand what you mean, sorry if I created confusion. My point is that the claim that it makes it difficult to reason about is misguided in most cases because there is an easy way to estimate the upper bound of any function if you know the upper bounds of the same algorithms in a strict language.

Generally speaking, the complexity of any function has an upper bound equal to the most complex algorithm used. Laziness can bring the upper bound for complexity down, though. The fact that upper bound may be lower is almost certainly not something that you care about in the overwhelming majority of programs. And I haven't found the first person upset to find out that their algorithm performed much better than estimated.

If you have an algorithm with O(n) complexity, it will be the same both for a strict and non-strict language if the whole result is consumed.

When the whole result is not entirely consumed, then the upper bound for a lazy language may be lower.

The classic example, as you linked to is this:

    minimum = head . sort

If we consume the entire result of the sort function, then the upper bound is O(Nlog(N)). Because of laziness, using head makes the minimum function O(N)

The implementor chose to compose those two functions because laziness makes this possible. If you are in the business of reviewing core code to evaluate their complexity, you are right. It is difficult to nail down get the more accurate upper bound, but that does not mean you cannot guess what a less accurate estimate would be. I would have guessed O(Nlog(N)) the first time for a minimum, and would have been pleasantly surprised that laziness makes that faster.

When has reasoning about time complexity been difficult for you in a lazy language?

foldr · on June 10, 2020

I misspoke in my last comment when I referred to space and time complexity. The usual problem in the field isn't figuring out the big O characteristics, it's unpredictable performance due to evaluation of an expression building up a huge chain of thunks. Because of the inherently non-local logic of lazy evaluation (or, whatever evaluation strategy compatible with non-strict semantics that GHC actually chooses to use), it is not always easy to find which piece of code is introducing unwanted laziness.

That said, it's the same fundamental non-locality that also makes it more difficult to figure out the big O complexity of an algorithm implemented in a non-strict language.

mlthoughts2018 · on June 8, 2020

That example seems deeply contrived. I wouldn’t call that an “optimization” at all, since if you’re returning a reference, then mutating x will mutate the underlying original list, which is precisely not what the user requested by using specifically the list() constructor. If they wanted a reference they can just do x = l. This is not at all any kind of similar critique like the issues with reasoning about lazy evaluation from reading Haskell source.

ghostwriter · on June 8, 2020

that's why I mentioned "other languages/libraries where data is immutable". Replace a constructor for list() with a constructor for graph() from a random library. You won't be able to estimate the complexity of it just by reading the client code, you need to know implementation details.

In Python world, list() may be called to explicitly guarantee an expected shape of a value in runtime. I see it regularly, when various functions call list(iterable) on their input data internally, just to be able to list.append() later on, even though there's itertools.chain() for the same purpose, which is lazy by the way.

mlthoughts2018 · on June 8, 2020

I think you are missing a lot of details about when it is wise vs. unwise to rely on generators and coroutines in Python for lazy evaluation.

Some of the most common mistakes I see Python beginners make are using list or tuple when they could use a generator instead. Rarely it leads to serious memory consumption issues, more often it just leads to messy list-append-copy style code, which is very forgivable.

Some of the most common mistakes I see intermediate Python programmers make involve overuse of generators and lazy evaluation in situations where eager evaluation simplifies the code or is needed for other reasons.

This class of mistakes is a lot worse than the beginners’ mistakes, because you end up baking a reliance on handling generators deep in the code, often in places you don’t want it. People will cut off their nose to spite their own face, e.g. convert a simple list comprehension into a series of chained helper functions that all use yield to behave as generators, not realizing the memory footprint of this can be far, far worse than just materializing the whole list in memory, depending on the situation.

Often in veteran code, you see a lot of calls to list() specifically to remove propagation of bad generators, and ensure a certain function acts as a bottleneck on that type of behavior by design.

ghostwriter · on June 8, 2020

> not realizing the memory footprint of this can be far, far worse than just materializing the whole list in memory, depending on the situation.

you need to reference a particular Python implementation at this point

mlthoughts2018 · on June 8, 2020

Nah, you really don’t. The only notable difference would be in stackless Python, and since for all practical purposes only CPython and PyPy matter for widely discussing Python usage on applied problems, it can be safely omitted.

ghostwriter · on June 8, 2020

You really do. "Generators" are APIs, their implementations may vary, even across different CPython versions, and Stackless is not the only implementation (yet a significant one) that you should consider when talking about the differences in memory footprint. Numba, Cython, Nuitka, and a few other more exotic compilers all have their own implementations of compiled generators, and some of them allocate the required tracking structure quite compactly on the stack [1].

But let's say I agree with you. To make the case demonstrably true, I'd like to see the context where a total volume of generator instantiations becomes less efficient than all possible list/tuple allocations with their contained data. Do you mind sharing your stats? What kind of codebase is it? I'd like to understand the layout of it and to draw my own conclusion on whether the mentioned veteran code is sane.

[1] http://numba.pydata.org/numba-doc/0.20.0/developer/generator...

mlthoughts2018 · on June 9, 2020

Hey, I actually used to work on the numba compiler team at Continuum. I can say I don’t understand the last rant paragraph at all or why you think it’s related to my point about generators.

ghostwriter · on June 9, 2020

> I can say I don’t understand the last rant paragraph at all or why you think it’s related to my point about generators.

This is not a rant, you made a claim about generators' allocations inefficiencies compared to physical list() and tuple() allocations, I'd like to see the case where it's true.

mlthoughts2018 · on June 9, 2020

It is true in many many cases. For a very simple illustration consider

    def f():
        x = list(range(10000))
        yield from x

vs

    list(range(10000))

The former includes the overhead of memory for both the materialized list and function execution frames.

The same thing happens when people naively split up generators that hold onto a lot of data, for example (this comes from real life experience where someone wanted to essentially “memoize with generators” a membership check on the response from a database call).

    def check_expensive_in():
        s = large_db_call()
        while True:
            x = yield
            yield x in s

    def expensive_filter():
        f = check_expensive_in()
        next(f)

        def helper(item):
            v = f.send(item)
            f.next()
            return v

        while True:
            items = yield
            for item in items:
                yield helper(item)
            yield None

    e = expensive_filter()
    next(e)
    e.send(some_list)
    # iterate e until None.

(Sorry for any typos or minor glitches with send(), I am writing this on my phone as I eat breakfast.)

It’s a very simple issue, which is that memoizing with generators maintains the memory footprint of the memoized data _and_ additional memory footprint of the generator (and also of large sent values into the generator too, but this is less common).

It’s better to just materialize the things you need in memory and reuse them in regular function calls, which don’t have fixed permanent overhead for a long lifetime like generators.

This can absolutely happen with small data examples too, where generators are less efficient than just materializing everything, but normally nobody cares because with small data, the effect of any inefficiency won’t be noticed.

Even in that case though, generators often lead to spaghetti code like my example above, because depending on laziness as you compose multiple functions is just a poor conceptual way to organize code. Very rarely, but sometimes, it’s worth it to avoid memory bottlenecks or to do stream processing. But it’s overstated how often this matters generally - it’s very rare unless you’re in a specialized domain where that’s all do.

Lastly I’d like to say that your tone comes across as needlessly antagonistic and it seems extremely obvious you are engaging in bad faith. You don’t seem open to consider what I am saying, rather in a rush to demand some kind of “proof” with no willingness to think through it, and likely not seeking proof to learn anything but just to try to create shallow, undermining retorts.

I won’t be continuing to check back here or engage any further with you. If you want the last word in the thread, take it.

ghostwriter · on June 11, 2020

> You don’t seem open to consider what I am saying, rather in a rush to demand some kind of “proof” with no willingness to think through it

it doesn't work this way, if you state something to be true, the burden of proof lies on your shoulders and I am open to consider the proof, and not just words alone.

It's funny that this subthread was started with a statement that my example of complexity estimation difficulties was deeply contrived, yet the examples for a proof of generators' inefficiences materialize all intermediate values into lists. Why would I want to materialize a list to yield something from it later, if I can just pass the range() stream to the caller instead?

Your second example is an exemplary code smell, as there's no need for a helper() closure. And there's no need to hold resources for check_expensive_in(), as most of DBs have lazy query cursor implementations in one way or the other. Afterall, it's not about doing everything in one expensive DB call and laziness doesn't negate buffered data, it just says that there's an explicit boundary on internal buffers within a single iteration, and every single iteration may perform a separate but less expensive DB call.

logicchains · on June 8, 2020

The -XStrict language extension is a lifesaver here, for when you want a language like Haskell but without lazy evaluation.

whateveracct · on June 8, 2020

Pretty much every time I turn -XStrict on for some module I'm trying to optimize, performance gets worse.

Laziness gets a lot of flak for being hard to reason about. I disagree - it's just different.

But I never hear anyone acknowledge that laziness can bring compositional performance benefits no other language can sniff.

jose_zap · on June 8, 2020

That’s because full laziness is actually an optimization strategy that Haskell uses to make programs faster.

You are completely right about laziness being a big plus for composing functions.

runeks · on June 8, 2020

Except it only lets you define your own strict functions and data structures — i.e. you still need to consider laziness when using existing libraries.

zozbot234 · on June 8, 2020

It's also not a very principled solution. It would be better to use focusing and/or polarized logic so as to allow both "strict" and "lazy" data and functions on a first-class basis. Then language-level options could be used to manage "defaults" as a matter of language syntax, for the programmer's convenience.

weareconvo · on June 8, 2020

FYI: As of the time of this reply, https://dixonary.co.uk/cabal-2020 returns a 404.

johnday · on June 8, 2020

Yeah, sorry, my website got nuked from orbit. Hopefully it'll be back up tomorrow.

n3k5 · on June 8, 2020

Linked page just says “something went wrong” at the moment; cache:

https://web.archive.org/web/20200608072453/https://dixonary....

sradman · on June 8, 2020

Seems to be back with the following added to the top of the article:

> This is a temporary rehosted version while my website is hugged to death.

js8 · on June 8, 2020

I would say all this is improving, and while these are all pain points, they are much less a pain point than they were just couple years ago.

My personal ergonomics improved when I ditched standard prelude for classy-prelude. Regarding Strings, I just use Text type.