What I found interesting is that if you have a pipe/filter architectural style, your happy path stays completely free of error handling, because when a filter doesn't have a good result, it just doesn't pass any data to the next filter in line. Done!
You can then centralize the error handling by having a "stderr"-like output on your filters.
When you have a call/return architectural style, you need to return something, and thread the result along, so you need to do something more complex like this pattern.
This pattern isn't as complex as it looks; it's just the pattern you've always used, with some mathematical terms added to it. Having a "stderr"-like output on a filter can also be expressed as a monad. Instead of Either or Maybe, or error monads in general, you could use a writer monad like WriterT. (Haskell; not sure about F#.)
You can then keep your call/return architectural style or use pipes/filters. Haskell and F# are pretty good at providing ways of describing these things mathematically and making them easy to compose, it's just that the underlying math is obscure enough that we get blog posts about it. Fortunately, there are enough mature libraries that you don't need to understand the math to leverage this stuff.
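For reference, here is a minimal Haskell sketch of that Writer idea (step names invented): each step writes diagnostics to a "stderr"-like side channel while the happy path carries plain values.

import Control.Monad ((>=>))
import Control.Monad.Writer (Writer, runWriter, tell)

-- Each step can log to the "stderr"-like channel; the value it
-- hands to the next step stays a plain Int.
validate :: Int -> Writer [String] Int
validate x
  | x >= 0    = pure x
  | otherwise = tell ["negative input: " ++ show x] >> pure 0

double :: Int -> Writer [String] Int
double x = pure (x * 2)

-- runWriter returns the happy-path result together with everything
-- written to the side channel.
pipeline :: Int -> (Int, [String])
pipeline = runWriter . (validate >=> double)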
Anyway, the problem is one of call/return in general, whether expressed in an FP language or not. It really wants to return something. With a real dataflow approach, that just goes away. And nothing is simpler than something.
The pattern is to return a value and have some non-local return for errors. For example, having a return value in C++ but throw an exception for errors. By "always used", I'm just asserting that most people are familiar with this pattern and use it frequently.
> Anyway, the problem is one of call/return in general, whether expressed in an FP language or not. It really wants to return something. With a real dataflow approach, that just goes away. And nothing is simpler than something.
This is not true for languages like Haskell. When you are writing monadic functions, it doesn't "want" to return something; instead, you are free to create your own data flow primitives as the monad's interface. This lets you change how control flow works in these monads while providing enough common structure that the novel control flow doesn't surprise people using the monad. Common ways to do this are composing monad transformers, returning special value types, or using continuations.
That said, you get more flexibility to define data flow in Haskell if you combine the ordinary monadic parts with data flow libraries like pipes/conduit/machines. But you're still free to not return anything from a function (or technically a monadic action) in Haskell, as long as you're in a monad where returning nothing is a possibility.
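A small sketch of that freedom using the stock MaybeT transformer (example names invented): an action can produce Nothing, and the steps sequenced after it simply never run.

import Control.Monad.Trans.Class (lift)
import Control.Monad.Trans.Maybe (MaybeT (..))

-- An action in MaybeT IO can "return nothing" by producing Nothing;
-- any steps sequenced after it never run.
readPositive :: MaybeT IO Int
readPositive = do
  n <- lift readLn
  if n > 0 then pure n else MaybeT (pure Nothing)

report :: Int -> MaybeT IO ()
report n = lift (print (n * 2))   -- skipped entirely when upstream fails

main :: IO ()
main = do
  result <- runMaybeT (readPositive >>= report)
  print result   -- Just () on success, Nothing if the pipeline stopped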
Since Haskell is non-strict, it has "dataflow", and deprecates "call/return" to some extent. (Parent poster might not have known that.)
Function "calls" can be passed around but not executed unless needed. Monads like Maybe take advantage of this.
Haskell functions always return something if they are called. Every Haskell program is "main :: IO ()", returning an IO action executed by the runtime, but in the Maybe monad that something can be Nothing.
With monadic structure the non-strictness is not actually necessary. Haskell is a research language with a lot of features dumped into it; the fact that it's non-strict has forced people to be more aware of whether their code is pure, and purity has motivated the research into things like monads. But you can do the same data flow programming even in an eager version of Haskell, it's just that in eager languages programmers are more tempted to write impure functions in the first place.
Hmm...using the Maybe monad as counterexample to "call/return languages need a value to represent Nothing, because they need to return something" seems odd at best.
> The pattern is to return a value and have some non-local return for errors. For example, having a return value in C++ but throw an exception for errors. By "always used", I'm just asserting that most people are familiar with this pattern and use it frequently.
> the problem is one of call/return in general, whether expressed in an FP language or not. It really wants to return something
This is certainly a problem with first-order programming, where we might have function signatures like:
foo : A -> B
bar : B -> C
baz : C -> D
And we plumb them together into:
quux : A -> D
quux x = baz (bar (foo x))
IIUC the problem you're describing is that a change causes one of these functions, say `bar`, to gain an error condition. We can represent that by:
bar' : B -> Maybe C
Our `quux` definition would then become:
quux' : A -> Maybe D
quux' x = case bar' (foo x) of
    Nothing -> Nothing
    Just y  -> Just (baz y)
I don't think it's too bad to have a failing function return a special value like `Nothing` to indicate that: `return Nothing` is no more difficult than `throw SomeException`, `exit 1`, etc. The problem is having everything in-between the error and its handler having to propagate it, like above. This sort of 'check if successful, return Nothing if not' (anti-)pattern seems to crop up a lot with e.g. Java programmers being introduced to `Optional`. It seems like we're going to drown in null checks (or Nothing checks, in this case).
Yet this problem isn't so bad if we use higher-order programming (functions of functions, etc.). In particular, we can write the first `quux` as a higher-order combination of those other functions (where `f ∘ g` is composition of `f` and `g`, i.e. a function which applies `g` to its argument, then applies `f` to that):
(f ∘ g) x = f (g x)
quux : A -> D
quux = baz ∘ bar ∘ foo
This ability to compose is really powerful. The FP community (taking their terminology from maths) call these composable things "categories", and functions are one example of a category; another is circuitry, where we can wire the output pins of one circuit to the input pins of another. In the context of this article, a (non-branching) railroad track is an example of a category (we can place one piece of track after another). There's a nice description of categories at http://www.haskellforall.com/2012/08/the-category-design-pat...
Now what if we then need to propagate error conditions? We can just treat that error-propagating ('check if successful, and return Nothing if not') functionality as an alternative form of composition, which I'll write as `⋅`:
(f ⋅ g) x = case g x of
    Nothing -> Nothing
    Just y  -> Just (f y)
quux' : A -> Maybe D
quux' = baz ⋅ bar' ∘ foo
In fact we can be a little more general by defining a `map` function instead; this lets us do error-handling even if we only have a single function, and we can re-use the normal ∘ composition:
map f x = case x of
    Nothing -> Nothing
    Just y  -> Just (f y)
quux' : A -> Maybe D
quux' = map baz ∘ bar' ∘ foo
This looks like the same amount of boilerplate as the first-order version, but operations like `map` are general enough that they'll probably be included in the language by default, or at least with the library that defines `Maybe`, `Just` and `Nothing`. Hence the only code we need to write to go from `quux` to `quux'` is `map`.
Just like with composition, lots of things can be 'mapped'. FP calls such things "functors", although beware of other uses for that term (e.g. some languages use this phrase to mean function objects; some, like Ocaml, use it to refer to module transformers). Another example of a functor is lists, where `map` applies a function to all of a list's elements. In fact, if we think of lists as "containing any number of values" then we can think of `Maybe` as "containing at most one value". Going the other way, if we think of `Maybe` as being "a result which either failed or succeeded" then we can think of lists as being "a result which either failed or succeeded in any number of ways", i.e. lists are just non-deterministic computations (e.g. see Wadler's "How to Replace Failure by a List of Successes" https://rkrishnan.org/files/wadler-1985.pdf )
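A tiny illustration of that list-as-nondeterminism reading (my own example, not Wadler's): `[]` is failure, and the list monad's bind fans out over every way to succeed.

-- All proper divisors of n; [] means "no way to succeed".
divisors :: Int -> [Int]
divisors n = [d | d <- [2 .. n - 1], n `mod` d == 0]

-- Bind in the list monad tries every alternative; if either step
-- produces [] (failure), that branch contributes nothing.
pairs :: Int -> [(Int, Int)]
pairs n = do
  a <- divisors n   -- any proper divisor of n ...
  b <- divisors a   -- ... paired with any proper divisor of it
  pure (a, b)

-- pairs 12 == [(4,2),(6,2),(6,3)]; the "failures" at a=2 and a=3
-- silently drop out instead of aborting the whole computation.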
Another example of a functor is functions: a function `g : A -> B` can be thought of as 'containing' some `B` return values, which we can 'access' by providing an `A` value as input. If we have a function `f : B -> C` we can apply it to those return values by doing `map f g : A -> C`. We get a new function 'containing' `C` return values, which we can 'access' by providing an `A` value. How does such a function work? It just takes the `A` value that we give it, passes it into `g` to get a `B` return value, then passes that into `f` to get the final `C` return value. In other words, the `map` function for functions is just the `∘` composition operation!
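Spelled out in Haskell, where the generic `map` is called `fmap`:

-- For the function functor, fmap is literally composition:
--   fmap f g = f . g
double :: Int -> Int
double = (* 2)

describe :: Int -> String
describe = fmap show double   -- exactly the same as: show . double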
Another example of a functor is an I/O action, for example reading a temperature sensor. I/O actions are like functions, in that they 'contain' return values (e.g. temperature readings), which can only be 'accessed' by executing the action (e.g. performing a measurement). Just like functions, we can `map` over these actions as a form of composition: if we have `a : IO Temperature` and `f : Temperature -> Either Hot Cold`, then we can do a form of composition using `map`, where `map f a : IO (Either Hot Cold)` is an action which can measure whether it's hot or cold (by executing `a` then feeding the result into `f`).
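In code, with an invented stand-in for the sensor:

-- Stand-in for reading a temperature sensor.
readTemperature :: IO Double
readTemperature = pure 21.5

data Climate = Hot | Cold deriving Show

classify :: Double -> Climate
classify t = if t > 25.0 then Hot else Cold

-- fmap attaches the pure check to the action without running it;
-- the measurement only happens when someone executes `measure`.
measure :: IO Climate
measure = fmap classify readTemperature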
Another thing we might want to do is 'unwrap' nested values. For example if we have a possibly-failing function `g : A -> Maybe B` and a possibly-failing function `f : B -> Maybe C` then we can use `map f ∘ g : A -> Maybe (Maybe C)`, whose result might be `Nothing` (if `g` failed), or it might be `Just Nothing` (if `g` succeeded but `f` failed) or it might be `Just (Just something)` (if both succeeded). If we don't care about which one failed, we can use a function called `join : Maybe (Maybe A) -> Maybe A` to replace `Just Nothing` with `Nothing` and `Just (Just something)` with `Just something`.
FP people call joinable things "monads", and in the case of `Maybe` we can think of it as concatenating together a list-of-at-most-one-(lists-of-at-most-one-element) to get a list-of-at-most-one-element; or as turning a combination of error cases into a single error case. For lists, we can think of `join` as concatenating lists-of-lists; or as turning chains-of-successful-results into successful-results. For functions we can think of `join` as making a function which re-uses its argument for the subsequent step. For I/O we can think of `join` as turning an action-which-measures-which-action-to-take-next into an action-which-measures-then-does-something.
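For `Maybe`, this `join` is small enough to write out by hand (a sketch):

-- Collapse two layers of possible failure into one, forgetting
-- which of the two steps failed.
joinMaybe :: Maybe (Maybe a) -> Maybe a
joinMaybe (Just (Just x)) = Just x
joinMaybe _               = Nothing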
>I don't think it's too bad to have a failing function return a special value like `Nothing` to indicate that: `return Nothing` is no more difficult than `throw SomeException`, `exit 1`, etc.
Yes, it is a less bad alternative than many others, such as explicit error codes w/o wrapping or exceptions. (And all of these are obviously arguable...)
However, just not having to do anything is even better. And if you use a pipe/filter pattern (not just simulate it by threading functions along), then the current filter sends the result of its processing to the next filter. For example, assuming we have a filter object whose next filter is in the target instance variable:
target writeObject: inputString upperCaseString.
If it has nothing to send, it simply doesn't send anything. Done.
No need to return a Maybe that denotes "Maybe I have a value, maybe not". If you don't have a value, just don't send anything, and your happy path never has to deal with error cases.
Similar for multiple values. Just write multiple values to the next filter. No need to return collections and flatmap them.
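For comparison, roughly what that looks like when simulated with lists in Haskell (invented filters), i.e. the threading-functions style the parent is contrasting against:

parseAge :: String -> [Int]
parseAge s = case reads s of
  [(n, "")] -> [n]   -- one value flows downstream
  _         -> []    -- nothing flows; downstream never sees it

ageGroup :: Int -> [String]
ageGroup n
  | n >= 18   = ["adult"]
  | otherwise = ["minor"]

-- Each filter emits zero or more values; failure is just "emit nothing".
pipeline :: [String] -> [String]
pipeline = concatMap ageGroup . concatMap parseAge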
The real "revelation" of this method of error handling is how it uses types to explicitly state how errors are handled. Having a return type of `Either String Int` let's the user know that this function does in fact have an error case that would potentially be short circuited to a far off error handlers, and that a `String` is the type returned on an error.
In most other languages you either don't know if an exception could be thrown or you just assume every function throws one. Then when you add in things like promises, which explicitly define success and failure paths, the system must implicitly catch all exceptions and pass them to your error path...unless the developer chooses to change that functionality.
Much nicer to have all that defined for you in the type system, even though it existed without it.
It doesn't have to be that different with exceptions, which were invented for that very reason. When used properly, you catch those at the highest appropriate level by letting them "bubble up". What is different though, is that exceptions are much more implicit. In most programming environments, it isn't clear which ones might occur. Java has checked exceptions to counter that, but according to many it's a cure worse than the disease. In that sense, "railway-oriented programming" supposedly offers the best of both worlds.
Personally, I'm not entirely convinced, though. In practice, exceptions will still occur unless the language doesn't have them, so now you have two mechanisms to deal with errors. (.NET has a strong culture of exceptions—no thank you to lots of wrappers to make established libraries use result types.) Also, it doesn't seem realistic to anticipate every possible error scenario. Sometimes all you know is that something went wrong. That's fine. Show a generic error and improve the software once the cause has been identified. Still, I would like there to be a better way to find out about the most common exceptions than reading documentation.
When using either in that kind of language, you basically want to catch exceptions at the bind level and put the exception result in the left hand side. The main advantage of this is that you are guaranteed not to jump stack levels higher than the call to bind (or whatever he's calling it in his RR architecture -- I haven't read TFA lately).
When using Either in that way, I tend to think of it as a structured exception. It gives you a defined exit point, just like in other structured programming. I've tried to introduce this on our team and probably there is about 50/50 acceptance. Half of the people think it is awesome, half of the people can't see the point, so I think it's probably an acquired taste.
This feels like a lot of words and FP "stuff" to describe a system where a function takes two inputs (anyval0, maybe<error>) and outputs (anyval1, maybe<error>) and you chain them together.
The essence of FP is that if you apply a bit of cleverness and engineering, you can replace things that look like they would need custom language features, like multiple return or async, with libraries that let you do the same thing with plain old functions and values.
(All that said, if you're thinking in terms of multiple inputs and outputs you're missing the point. The output isn't a pair, it's one or the other, and that's really important. You need to be able to make invalid states unrepresentable, so a disjoint union is not just a pair)
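A quick sketch of that difference in Haskell:

-- As a pair, two of the four states are nonsense, and every consumer
-- has to decide what (Nothing, Nothing) or (Just x, Just e) mean.
type PairResult = (Maybe Int, Maybe String)

-- As a disjoint union, the invalid states cannot be constructed.
type Result = Either String Int

ok, oops :: Result
ok   = Right 42
oops = Left "validation failed"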
"Building a backtracking facility in smalltalk without kernel support"[1]
Languages like Snobol, Prolog, and Icon were designed with backtracking facilities from the outset and these facilities are deeply intertwined with the implementation. Retrofitting a backtracking facility in a language that wasn't designed for it has never been achieved. We report on an experiment to retrofit Smalltalk with a backtracking facility. The facility is provided through a small number of primitives written in the language (no modifications to the kernel were made). The ability to do this is a direct result of the power provided by the objectification of contexts.
There's tons of other examples, for example various distributed objects mechanisms, my own Higher Order Messaging [2] the way Gemstone creates queries from procedural selection code etc.
The trouble with OOP is that objects you can send messages to are already too complex to reason about. You can't send the same message twice and expect to get the same reply (or even, in some implementations, get a reply at all). So the fact that you can implement anything with objects isn't that surprising, because saying something is an object doesn't really constrain what it can do - you might as well say you can implement anything with a "thing", which is true but useless. Functions - or rather, constructive, well-typed functions - occupy a nice sweet spot where they're constrained enough to reason about, but powerful enough to do what you need.
https://tahoe-lafs.org/~davidsarah/noether-friam4.pdf has a more concrete set of tradeoffs: starting from a small, highly symmetric core, you add more capabilities to the language but you break a symmetry each time. One of the levels in the rainbow is an OOP language, but that requires breaking enough symmetries that I wouldn't want to work in it if I could avoid it.
Yep. Java’s strawman implementation of OOP led most of the world to never learn OOP’s potential. In Smalltalk, `if` was a virtual method of Boolean! Sounds horrible for performance, eh? Despite that, the Strongtalk spin-off of Smalltalk was used as the internals of Java’s HotSpot JIT.
From The Evolution of the Smalltalk Virtual Machine (Dan Ingalls, in Smalltalk-80, Bits of History, Words of Advice, Krasner):
"Smalltalk-76 had to provide a mechanism for passing unevaluated code. [Blocks/Closures] Since the Smalltalk-76 design had no experience to draw from, it was weak in several areas.[..]
One Problem which was discovered in the process of of supporting error recovery was that block contexts could not be restarted because they did not include their initial PC as part of their state. This was normally needed for looping , since all such code fragments ended with a branch back to the beginning. Happily, we were able to fix this by defining a new subclass."
So they were able to fix a bug in the language implementation by writing code in the language itself!
Totally agree. Most problems with Java are cultural.
It seems the more experience people get with large team efforts, the more they value Java and other well-supported, type-safe languages.
I'm a perfect example myself, as a person who used to think a number of other languages would be better, but after having to maintain my own code (yep, not only someone else's bad code, but also my own precious code) in a number of languages, I've come to really prefer Java.
Can you point to some resources that teach Java "the right way"? Without the cultural baggage? I need to learn it for a new job and I'm inundated with resources that I can't determine the quality of.
JVM dev here... unfortunately there aren't any great holistic books for modern Java that I can think of. A lot of it is just community learned (e.g. put the XML away, stop making everything a damn GoF design pattern, stop trying to create mega class hierarchies, favor small libraries over mega frameworks).
I will say Joshua Bloch's Effective Java is something every Java dev should have read at some point.
It's extremely frustrating that whenever anyone talks about even the slightest positive mention of FP, this site tends to immediately have 2-3 folks pop out of nowhere demanding that everyone recognize OO exists.
Yes. Yes. We know. You can write programs with OO. It's the dominant paradigm of the software industry. No one here said that OO is somehow ineffective, cannot abstract, or lacks cleverness.
Please, for even one comment section, just put down the sword and let other folks with equally interesting approaches talk about their work.
I feel like the HN bias tends to go the other way, that FP is some kind of awesome magic sauce and OOP is the bane of good software. In reality, most of the techniques we discuss don’t really depend on a specific language orientation.
A good number of the posters here are fans of Paul Graham who is a loud FP advocate.
Also, I've been here for a number of years and I can't recall ever seeing anyone bashing FP. (Which is totally fine with me.)
Bashing Java and OO in general, however, is to be expected in any programming-related thread here (a little tiring, but understandable; I too have seen my share of bad OO code.)
Neither of these counts as proof, but yes, this site comes off as extremely pro-FP to me as well.
Paul Graham is not a fan of FP, he's a Lisp advocate. Every Lisper will be quick to tell you their approach is a superset of FP with its own unique meta-syntactic tools.
This is a great illustration of my point. If you place every differing viewpoint in one bucket and then take the sum, yeah, it's gonna seem overwhelming. In actuality, there are lots of factions with interesting viewpoints.
Literally every OO programmer is on the hook for knowing a huge array of Smalltalk-derived patterns. MANY programmers use languages with significant inheritance from Smalltalk.
Common Lisp also isn't coming back despite having some amazing features. Let's both dry our tears and move on.
Mentioning something great that smalltalk is capable of isn't the same as saying everyone should use smalltalk. It's more saying - let's not forget some of the great stuff that existed in the past that the current crop of popular languages don't have.
But no one's going to forget. Every time I mention FP, I have to be on guard for some OO stan showing up demanding that I acknowledge and "not forget" that OO has in fact "done things".
Even in this example, where someone says something totally unrelated to OO, someone felt compelled to leap in and say something as unrelated as, "OO too! Here's a barely related OO project I found to be clever! Please don't forget about us."
Let me just remind you how this went:
Person A:
> The essence of FP is that if you apply a bit of cleverness and engineering, you can replace things that look like they would need custom language features
Person B:
> Same for OOP, example from 1988:
No one even implied that this was exclusive! But certainly other folks have used the word "defense" for that post, perhaps not even realizing it. As if the existence of other paradigms somehow threatens the importance of OO.
Maybe it's because every time people talk about FP, someone has to dredge up an OO (read: enterprise Java) strawman showing all of its alleged deficiencies?
No one did that here. There was no mention of OO. There was an allusion to how typed FP relies on extremely generic programming, which is a notable difference from many other programming approaches.
> which is a notable difference from many other programming approaches
Except that it isn't "a notable difference", more precisely (quoting the original):
"replace things that look like they would need custom language features, like multiple return or async, with libraries"
Replacing "things that look like they would need custom language features" with libraries is exactly the essence of Smalltalk.
"Talk small and carry a big class library"
It's even in the Design Principles Behind Smalltalk:
"A language must provide a means for classifying similar objects, and for adding new classes of objects on equal footing with the kernel classes of the system."
So either there is a claim that "hey this is a cool way of doing things", in which case "yeah, I agree, this other thing does that as well" is completely benign.
Or there is the claim "here is uniquely FP way of doing things", in which case the example of OO is a refutation.
> All that said, if you're thinking in terms of multiple inputs and outputs you're missing the point. The output isn't a pair, it's one or the other, and that's really important
Only in the example shown on the website. If you want progress reports/feedback on your happy path, and to keep it even if you go down the error path, then having a multiple result/structure containing both results at the same time makes sense.
Sure, and there are structures you can use for that (e.g. Writer). But that's kind of the easy part: every serious programming language allows some kind of pair structure, but a lot of languages still don't have a good "A or B but not both" structure.
Some people don't understand what it means to chain functions returning maybe types together, so they need more words.
Also, we're not talking about maybe types here. This article demonstrates a specific library that composes a specific either type using F#'s computation expressions or custom operators.
Lots of people don't know what computation expressions are, nor are they familiar with the bind operator. They don't know why they'd use an either type instead of try/catch.
You can always reduce the number of words to say something, but you often sacrifice context and end up alienating people when you take too much for granted.
No, it's a system that takes 1 input (anyval0) and outputs either an error OR anyval1. If any of the functions errors, it rides the "error" track back up and returns the first error that it encounters. This is a fail-fast method using flatMap.
The part you might be missing is that you're explicitly told "this function returns a value OR an error. You must handle error cases with this function". With the try/catch paradigm you never know from the function definition if that is the case.
You also know what type the error will be. No need to derive a new class from `Error` and add custom functions to extract the data from your failure case, just pass the data along.
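A minimal Haskell sketch of that ride back up the error track (invented steps; the F# in the article does the same thing with bind):

positive :: Int -> Either String Int
positive n = if n > 0 then Right n else Left "must be positive"

evenHalf :: Int -> Either String Int
evenHalf n = if even n then Right (n `div` 2) else Left "must be even"

-- The first Left short-circuits everything after it.
process :: Int -> Either String Int
process n = positive n >>= evenHalf

So process 0 is Left "must be positive", process 7 is Left "must be even", and process 8 is Right 4.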
What's convoluted about it? Exceptions jumping up multiple stacks and being rescued or reraised or what have you is convoluted. If you mean errors as return values then I agree, monads are more complicated than returning a simple error type.
It's worth noting also that the benefit of doing this is to avoid the difficult-to-reason-about, goto-like behavior that exception throwing and catching necessitates. I have yet to experience a team which handles that very well at all; it usually turns into a mess.
I feel like this statement is often used and just as often wrong.
It's not about avoiding the goto-like behavior. It's about making you aware that it exists and forcing you to handle it.
Rather than throwing 5 calls deep and doing nothing until you empty your call stack, you must explicitly say "I know I'm part of a system that may error" at each step along the way.
In my experience (most) exceptions also cause people to forget about their own failure state, i.e. "always on the “happy path”", so they leave a mess behind when something they call throws an exception. Often they haven't handled failure state even when their own code throws an exception.
As much as people like to hate Java's checked exceptions, this is one thing I feel they really help with. They force you to think about the failure cases at least enough to either declare that you percolate them up, or handle them.
I've been writing a lot of Kotlin recently, and Kotlin doesn't have checked exceptions. This has led me to start migrating away from exceptions entirely because without checked exceptions it gets too hard to track what can fail, and how. I've been switching to an approach like the author's of using an `Either` (actually `Validated` from arrow-kt -- a special purpose `Either`) to have failures in my return values. I just have to be careful when calling library code to immediately translate exceptions into return values, but the entire stack in my own code has no exceptions (aside from unrecoverable errors).
That said, it's often preferred to use Either<Success,Error> because then you can try to explain what went wrong. A Maybe cascade failing can be pretty mysterious when you start to compose them, and you're deliberately creating something your optimizer will inline into a difficult-to-debug straight line of code.
Nice to see somebody else use a railway analogy. I like to use railways to describe why interfaces and encapsulation are important engineering principles. The track gauge is the interface between the people that build the railway and the people that build the trains. As long as they both implement the interface then together they get something much greater than the sum of its parts.
I either haven't gotten the point or have to assume there isn't one. Isn't that normal error handling? The incoming error path is usually even done with exception handling in normal languages, so you don't have to worry about it. The pattern `if not <assertion>: raise Exception` should be everything one needs. And for that, this whole thing is really fancy and long, especially if one considers that most people (sadly) don't even check assertions between function calls if they are not forced to by bug reports.
On the same train of thought, there was a book called "Plug and Play Programming: An Object-Oriented Construction Kit" by William Wong that detailed a similar concept in C++. It's an old, hard-to-get book, but it did a nice job of it.
using System;

public abstract class Maybe<T>
{
    public abstract T Value { get; }
    public virtual string Failure => null;

    public Maybe<V> SelectMany<U, V>(Func<T, Maybe<U>> f, Func<T, U, V> g)
    {
        // Stay on the failure track if this computation already failed.
        if (this.Failure != null)
            return new Maybe<V>.Failed(this.Failure);
        var that = f(this.Value);
        if (that.Failure != null)
            return new Maybe<V>.Failed(that.Failure);
        return new Maybe<V>.Success(g(this.Value, that.Value));
    }

    public sealed class Success : Maybe<T>
    {
        public override T Value { get; }
        public Success(T value) => this.Value = value;
    }

    public sealed class Failed : Maybe<T>
    {
        public override string Failure { get; }
        public override T Value => throw new InvalidOperationException();
        public Failed(string failure) => this.Failure = failure;
    }
}

Maybe<string> UpdateCustomerWithErrorHandling()
{
    return
        from req in receiveRequest()
        from v in validateRequest(req)
        from c in canonicalizeEmail(req)
        from d in db.updateDbFromRequest(req)
        from e in smtpServer.sendEmail(req.Email)
        select "OK";
}
There's a difference. First, you're not incurring a large runtime penalty, so this works better in higher performance scenarios.
Second, you're trapping and propagating successful values in a monad, which means you're obligating the caller to handle both success and error conditions.
> First, you're not incurring a large runtime penalty
How large, and why? Only when exceptions are thrown, or also during the happy path?
From what I understand, with monads the code checks for errors after each expression (like in C/Go), but behind the scenes. Isn't that the case?
If so, wouldn't that incur an overhead that is not present with exceptions, where normal code runs without checks and where "throws" perform a sophisticated jump?
With exceptions you have overhead on the happy path and potentially even in code that can never error: you have to keep all your old stack frames around in case you throw, since you never know when you might throw. Doing nice tail calls, or continuations, gets very hard since you have to figure out how your exception handling interacts with them. (Of course F# has to pay a lot of those costs already, since it runs on a VM that was designed for a language with exceptions). Whereas with result types everything is just functions and values and you don't need any special cases.
I consider myself a skilled programmer who is aware enough of low-level issues to usually write faster alternatives to any "not quite optimal" code on the first try, without resorting to any micro-optimization. That includes rewriting code that makes use of libraries like the STL, of which most people will just say "dude, you are insane, you can't be faster than the STL. They put so many man-years of effort into that".
But it rarely matters how much work you put into something to make it faster. What matters is what you can avoid to do at all.
I have pretty much zero knowledge of CPU-level issues and barely know enough x86 assembler to read my compiler's output. I have never cared for optimizations like inlining code, vectorization or tail call elimination. I have tried inlining a few times, and the speed-ups it brought were negligible. (And the slow-downs that too much of inlining brings, by increasing code size and thereby increasing cache pressure, must be very subtle and basically unmeasurable, while being potentially enormous. And the disadvantages from a maintainability standpoint are considerable as well).
If you have to care about these things, then maybe you're doing too much work. (Well, I guess I would use vectorization/SIMD if I had to write a video codec, but I've never done that).
Avoiding Exceptions in .NET isn't some micro-optimization that makes a tiny difference in runtime speed...
Exceptions are pricey pricey things that take time and create some minorly interesting VM/GC behaviour that can impact your application. If at all possible they should be avoided in executed code.
> But it rarely matters how much work you put into something to make it faster. What matters is what you can avoid to do at all.
Using Result<> types to handle error cases is a very good example of doing just that: avoiding the big hit of exceptions and poor exception handling in cascading situations, with sensible return types that let you avoid a buttload of delicate work.
> Exceptions are pricey pricey things that take time and create some minorly interesting VM/GC behaviour that can impact your application.
This is all very vague and handwavey. Not sure about the .NET world but in other systems the costs of exceptions is not significant and they can be used successfully even in the most demanding low-latency applications.
> Using Result<> types to handle error cases is a very good example of doing just that
Result types don't scale. I suppose it's the sort of thing that programmers have to keep rediscovering for themselves but there's a reason why even Haskell eventually "discovered" exceptions (and I suspect the Rust guys will get there eventually). In the real world where you have lots of deeply nested function calls some of which cross component boundaries good error handling requires three things: contracts (you need to be able to express possible errors at boundaries), context (you need to know exactly what resources have been allocated so they can be cleaned up when an error is caught) and recovery (you need to be able to return the application to well-defined state/point so computation can continue). Frankly, Result types don't give you any of these. What you end up with is a very generic set of errors (necessary to avoid a combinatorial explosion of error types) and a poor-man's implementation of exceptions (see panic/recover in golang). Or I suppose you can write code that doesn't involve deeply nested call-graphs and multiple components and everything is perfectly "flat" and stateless... but this wouldn't be like any real application I've ever seen.
Even so, exceptions are generally not used for error handling in Haskell, yet life seems to go on. You'll see code using `throwError` or `catchError`, but usually these will just be providing a nice bit of sugar on top of an Either or Maybe type. So it seems like the takeaway should be that Result types do scale, with proper library+language support.
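For instance, `Either` itself has a `MonadError` instance, so that sugar is a thin layer (a sketch with invented functions):

import Control.Monad.Except (catchError, throwError)

halve :: Int -> Either String Int
halve n
  | even n    = pure (n `div` 2)
  | otherwise = throwError ("odd: " ++ show n)

-- A defined recovery point, but it's still just an Either underneath;
-- nothing jumps across stack frames.
halveOrZero :: Int -> Either String Int
halveOrZero n = catchError (halve n) (\_ -> pure 0)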
We sorta use exceptions for IO-based code, mostly because we love parallel combinators in async and they totally require async exceptions to work right.
What I was trying to say somewhere between the lines: Try structuring your code so you don't run into errors in the first place. Because if error handling performance matters, you're having too many of them.
This is always good advice. One should strive to write code that doesn't run into errors in the first place.
However, sometimes the errors are actually part of the fabric of the domain itself. They're expected to occur very frequently. The data it encodes is valuable. The way they're handled is important.
Nevertheless, you're right. More often than not, you're better off using traditional exception handling and preventing errors from occurring in the first place.
Oh, I agree. I'd use results rather than exceptions even if they were slower (and in some implementations they are), because they make the code so much easier to understand and maintain and that's more important the overwhelming majority of the time. But for what it's worth, correctly implemented they can also be faster. And I find that's generally the case: a clear, simple implementation that expresses the essence of the problem will be faster than any number of carefully-tuned edge cases.
> Only when exceptions are thrown, or also during the happy path?
This is where I want to make a key distinction between exceptions and errors ala Railway programming. Performance with exceptions doesn't really matter because exceptions are supposed to be exceptional.
With railway-oriented programming, the whole point is that you're encoding errors into the logic of some type of system. These types of errors may not only be unexceptional, they may be a very regular part of the application.
Like I told another commenter, a friend of mine uses them in an application where they easily would throw thousands of exceptions per second if they used exceptions. Not only would it undermine the performance of their system, it's using the wrong tool for the job.
The performance criticism is a common one, but I don't think most people actually know what they're talking about. I certainly don't, since I don't use exceptions, but I've heard many respected engineers say that they needn't be slow (it depends greatly on implementation).
It shouldn't matter much anyway: You usually don't optimize for exceptional cases, especially if they lead to program termination.
And furthermore, you should not make heavy use of exceptions anyway :-). They lead to non-local control flow that is extremely hard to follow. They are nice for short scripts where they print a stack trace and abort the program. But they are a huge maintenance burden in larger programs.
You are correct that the performance of exceptions isn't typically needed because exceptions aren't meant to be thrown that often. They're exceptional.
Railway-oriented programming is better suited when errors are much more common in the domain and when their handling is a fundamental part of the rules you're encoding.
One friend uses a Railway oriented approach precisely because throwing that many exceptions per second (I don't know the exact number) would severely undermine the performance of their system.
So I'm not talking about exceptions, I'm talking about errors. You can easily generate thousands of errors per second if you're simply validating and routing large streams of data.
It's nice to get concrete and leave the abstract shit talk (of which I've been guilty many times).
So let's focus on that problem you mention and not argue errors vs exceptions terminology. In the case you mention I would clearly make two streams of data. In the simplest case, two append-only dynamically allocated arrays, one for parsing errors and one for parsed results. This is very likely possible to do - unless one needs an exact ordering, in which case it could be still possible to add positions to each of the streams.
It's intuitively clear to me that this approach will be significantly faster than a Result<>/Either based approach. Because control flow is much simpler. There are two consumers, one for the error stream, and one for the parsed stream. There is no switching between the cases on the consuming side, and complexity is much reduced because each part can do one thing and do it well.
You also get the freedom to process the streams independently, e.g. at different points in time, or not at all, etc. It's much simpler - if you try this route once, you will ask yourself afterwards how you ever endured this complex algebraic data types stuff.
Yes, you could still add a consumer of those Result<> values that demultiplexes them in two. But nobody does that. Because, why did we introduce Result values in the first place then?
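For what it's worth, that demultiplexing consumer is a stock one-liner in Haskell; a sketch with an invented parser:

import Data.Either (partitionEithers)

parse :: String -> Either String Int
parse s = case reads s of
  [(n, "")] -> Right n
  _         -> Left ("unparseable: " ++ s)

-- One stream of results in, two homogeneous streams out.
split :: [String] -> ([String], [Int])
split = partitionEithers . map parse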
This is just one of the many examples I encounter daily and that demonstrate how advanced type systems and ideas like OOP or FP encourage bad program structure. They are just not good ideas.
This is not demultiplexing into two different data streams, but handling both clumped together. Exactly what I was speaking of: Avoiding this clumping and processing the cases as separate streams (and independently!) brings huge payoffs.
Just try it; you will realize how much simpler your application gets - so many problems simply go away if you make more homogeneous data streams, and not fewer heterogeneous ones. And so many possibilities magically open up.
Very briefly, here is the difference:
def process_both_cases(stream):
    for item in stream:
        if case1:
            do_complex_stuff_1(item)
        elif case2:
            do_complex_stuff_2(item)
        else:
            assert(False)
vs
def process_case1(stream):
    for item in stream:
        do_complex_stuff_1(item)

def process_case2(stream):
    for item in stream:
        do_complex_stuff_2(item)
I think it's immediately clear how much better the second version is. It's less complicated: The conditionals are not needed. Processing the cases is decoupled.
You... do realize that it's quite easy to make the former out of the latter, and that's how every FP parser combinator library works.
Using a combinator to combine two undecorated functions into one, you don't need to resort to two streams; you can significantly simplify your runtime by only needing one. It's been written for you: http://hackage.haskell.org/package/base-4.11.1.0/docs/Contro...
Please consider that the incredible strength of ADTs is that you can just write generic code to combine those things without resorting to the crabtastic pattern matches that you're proposing, AND avoid the ugly implicitly unpredictably ordered world that you're describing.
Heck, we don't even need to resort to combinators for this. It's just you making up a complex case by saying, "What if I hide half of example 2 by handwaving it?" You still are going to have a case1/case2 decomposition in your second version. It's universally worse.
I don't see a pattern match anywhere in my example. And it's not in the slightest "unpredictably ordered" and has much simpler control flow.
> You still are going to have a case1/case2 decomposition in your second version. It's universally worse.
No. There is simply none. The producer makes them as two, and the two are processed independently. There is no problem if you don't make one. If you had ever tried out the approach I propose, you'd know the flexibility that comes with it. (I'm currently working on language tooling, and there it is absolutely the case that it's very easy to do without ASTs and process the cases independently.)
There, "forks" does contain an ADT of pointers into s1 and s2. But it contains nothing more, so it's very minimalistic. The advantage from the first case is still there: Most code profits from the homogeneity of s1 and s2, and from the actual, physical decoupling of the cases, and so much pattern matching just goes away.
(Btw. I'm not interested in fancy FP constructs. You will not convince me that you can process ADTs without pattern matches. You might be able to avoid writing them down in the source code, but I care about what actually happens on the hardware.)
(And actually I think avoiding writing down the pattern match in the source code and only generating it magically with higher-level constructs is a bad, bad idea. It has poor readability. And it encourages the mindset "let's just do the worst shit on the actual machine. We can avoid writing it in the source code, and we don't care what happens on the machine and how slow it is. And we're too lazy to think about actual solutions that actually avoid real work".)
> Where I need to relate the individual items of s1 and s2 in a common ordering, I've had great success with this approach:
Sorry I misunderstood your use of the phrase "two streams." In my world multiple streams effectively severs causality between the streams. Or did I? You wrote:
def process_case1(stream):
    for item in stream:
        do_complex_stuff_1(item)

def process_case2(stream):
    for item in stream:
        do_complex_stuff_2(item)
If this genuinely relates to your example above then the only possible consistent ordering without any async/multithreading is to exhaust stream1, then process your errors. If you don't do that and the implication is that the streams won't close on errors but rather continue to emit, then you really can't guarantee any ordering to the processing.
Given this inconsistency, I'm going to continue the post with your intent of causality being preserved, and assuming that process_data() offers the next value (or null) and we're calling these in a loop. Because otherwise you have a whole heap of problems, and your example doesn't even faintly resemble any subject matter from the post or what most folks were talking about here. And if you believe that it doesn't destroy causality, you're either writing a lot of code to recombine your streams in lockstep (in which case, splitting is quite the extreme step to avoid matching) or you just don't understand the implications of stream semantics in this case.
> There is no pattern match there, sorry to disappoint.
However, you provided a destructuring pattern match in your code!
s1, s2 = process_data()
Your process data is pattern matched on a generic product there. Sorry to disappoint. The actual type of the value is:
(Either Null S1Type, Either Null S2Type)
(And I assume it must be an `Either Null a` because we're discussing a case where either case1 or case2 will exist, so there's no valid value for one when the other exists.)
This is sort of why folks who know about ADTs roll their eyes at the sizzling invective about their complexity. You often end up using the same solutions, you just don't name them. It seems like the naming itself is objectionable. You've written about 4 posts I can see here about how much you hate ADTs, but then you're happy for an implicit `Either Null a` on every value, no opting out, and `a,b = f(x)` is fine because the name of the ADT is a punctuation mark, I guess?
As for the first example, Pattern Matches as Case statements are related in that a pattern match is a hyperactive case statement. So when you have one, you have the other to a varying degree.
> (Btw. I'm not interested in fancy FP constructs. You will not convince me that you can process ADTs without pattern matches. You might be able to avoid writing them down in the source code, but I care about what actually happens on the hardware.)
You seem to be making a huge deal out of pointer chasing. However, you're also suggesting a detour through the function contexts of process_case1 and process_case2 just to offer a null in one case.
Your plan seems a bit unfair, while we're on the subject. You're arguing that a pattern match is "heavy overhead", but you're completely ignoring the actual use case of "railway programming": to make a single clean codepath obvious and localize error handling at the end. But the thing you've offered doesn't really do that.
You're processing S1 and S2 (which I'll rename E1 for error), but what happens when I want to compose it? Extending the metaphor, your new code might look like:
s1, e1 = process_data()
process_ecase(e1)
// If you don't put an if( s1 != null ) here you're going
// to have to do a ton of function calls AND guarantee
// your handlers check their args for null.
if( s1 != null ) {
    s2, e2 = process_case(s1)
    process_ecase(e2)
    if( s2 != null ) {
        // ...
    }
}
You could perhaps change the ordering of these ops (or push the null checks into your code, which has a bunch of problems). And most of all, your compiler is totally powerless to help you work out if you have an error or not. You might also instead try to embed the dispatch to the next step in the def blocks of each individual step (an approach that could never work with the stream example, as you need to define your stream network (or at least a global namespace for it) before streaming events through it).
> And it encourages the mindset "let's just do the worst shit on the actual machine.
Okay, your example extended above. The Haskell-style of this combinator pattern (which DOES require all errors have the same base type, let me be transparent about that) is:
-- Assuming the steps are: a -> Either Error b
-- (>=> is Kleisli composition from Control.Monad)
let pipeline = process_data >=> step2 >=> step3
pipeline () `catchError` errorHandler
The errorHandler is just a normal function taking your shared error type. Each step is just a function that returns an Either Error or its value. The values can vary so long as the types between steps line up (and the compiler checks this in F#, ML and Haskell).
It's easy to read, too. It's easy to extend, and it's predictable, and it's made from very simple decorated functions. We can even safely make it interact with exceptions, if we need to, without overly complicating the callers.
> And it encourages the mindset "let's just do the worst shit on the actual machine. We can avoid writing it in the source code
What's completely confusing to me about your claim here is that you're doing dynamic dispatches and perhaps excess function calls, demanding your process functions do null checks, you have no clear looping construct, and your compiler has 0 opportunity for any sort of inlining but the most naive, cache busting type. But you're essentially dunking on what becomes BNE statements at the machine code level. At runtime, all the types are erased in the best case or preserved in exactly the same way as any normal object with a dispatch table would. The actual cost is, in most cases, indistinguishable from your if-statement rails.
I actually think you like lots of elements of the FP style, you just seem to dislike the names. Perhaps if you could hold your nose with a clothespin and play through the stink of math-ish names, you might find more you like?
I think you're assuming I'm less experienced than I actually am!
> If this genuinely relates to your example above then the only possible consistent ordering without any async/multithreading is to exhaust stream1, then process your errors. [...]
Unless we know there are dependencies between the processing of s1 and s2 (and we don't; we're talking very abstract here) I'm totally unconcerned about the processing order, and I even stated explicitly that in the simplest case they're just append-only arrays. And the processing of the items in s1 and s2 can be done in any order, but why not just do it in a single thread, first process "data" completely, then "s1", then "s2". In many cases of course, it will be possible to run in 3 parallel threads, where the additional two process s1 and s2 as they are emitted from the first thread. But that's just a small optimization and a lot of complexity and not relevant here.
> Your process data is pattern matched on a generic product there. Sorry to disappoint. The actual type of the value is:
You're purposefully twisting words here. And no, that's not "the type", because I was (obviously) just using straightforward pseudocode whose semantics I assumed would be intuitively clear to everyone. That's not the kind of pattern matching we'd been talking about. In my code s1 and s2 are simply two known variables holding the streams. That's something different from a single variable holding either of two distinct cases. I assume you know that and just want to sell the FP Kool-Aid or troll me; please refrain from this style of discussion.
> This is sort of why folks who know about ADTs roll their eyes at the sizzling invective about their complexity. You often end up using the same solutions, you just don't name them. It seems like the naming itself is objectionable. You've written about 4 posts I can see here about how much you hate ADTs, but then you're happy for an implicit `Either Null a` on every value, no opting out, and `a,b = f(x)` is fine because the name of the ADT is a punctuation mark, I guess?
There is so much hair-pulling in here and what you say is simply wrong. I was just returning two friggin' variables. And why are you talking about "Null"? That's of no interest. There is no use in my pseudocode for any kind of thing resembling what's called "Null" in any programming language. As a Haskell enthusiast you should know that there is always a canonical stream/array, and that's the empty one. Why are you haunting me with Nulls, that's not fair, I don't want them, and I can do without them. (And just to be clear, I don't do OOP either; I think it's at least as bad as FP)
> What's completely confusing to me about your claim here is that you're doing dynamic dispatches and perhaps excess function calls,
No I'm frigging not. I don't get it why you're talking of dynamic dispatches now. What I have given is some pseudo code of the simplest kind, that easily translates to any procedural programming language, including assembly code. Why do you make it seem like it was something more complicated? You must be deliberately hair-pulling just to win a stupid argument. I'm not doing any "virtual object-oriented blah blah" or any kind of types craziness.
s1 is a stream (or let's just say (again), an "array", to hopefully prevent any misunderstandings) of homogeneous things. Let's just say all items in s1 have type S1 and all items of s2 have type S2. It's the simplest kind of problem.
> -- Assuming the steps are: [..]
Congrats. I learned Haskell once, too. Until at some point it became clear that it didn't bring anything practical to the table and only complicated program designs. This "uniform error handling" idea (actually it seems to be more of a necessity/accident) just makes for bad code. Don't do that. Just do the first processing step, handle errors accordingly. Then do the second step, and handle errors accordingly. The second step in general requires different error handling than the first (often the only thing they have in common is that they might "die"), and there is no need and only disadvantages in restricting them to have uniform error handling.
What's wrong with my procedural pseudo code? Fill in the required error handling between the steps as soon as you have a concrete instance for my abstract example.
> dynamic dispatches
Nope
> excess function calls
What's that?
> demanding your process functions do null checks
No, I don't have any "Nulls" of any kind, since my data is simple and normalized.
> no clear looping construct
I used a pseudo-code for loop. Every 1-month beginner to programming has an immediate grasp of that. I never heard anyone complain that was somehow not "clear". It's plain, procedural code. You're trolling me so hard it hurts.
> I actually think you like lots of elements of the FP style, you just seem to dislike the names. Perhaps if you could hold your nose with a clothespin and play through the stink of math-ish names, you might find more you like?
I've been there and I know to use FP-ish constructs when they make sense (very rarely. Much easier to restrict to procedural altogether).
> Unless we know there are data dependencies between s1 and s2 (and we don't; we're talking very abstract here)
The entire discussion is about railway programming for early exit on error handling. So yeah, there are dependencies there.
> You're purposefully twisting words here.
No. I'm pointing out you pattern matched against a generic 2-tuple in your pseudocode. There's no words to twist, returns in functions on modern hardware are atomic over 1 value in all but the most obscure languages, and it's not a new phenomenon.
If it's valid to say, "I returned 2 values", it's equally valid to say, "I returned one value and destructured it." They represent the same thing.
> And no, that's not "the type", because I was (obviously) just using straight forward pseudocode whose semantics I assumed would be intuitively clear to everyone.
Both statements can be true. I'm simply writing a type that captures your "obviously true" statements. Therefore, both are obviously true.
> There is so much hair-pulling in here and what you say is simply wrong. I was just returning two friggin' variables.
You're the one consistently lambasting folks about paying attention to what happens at the "bare metal" (ha!) level. I encourage you to take your own medicine here.
> Why do you make it seem like it was something more complicated? You must be deliberately hair-pulling just to win a stupid argument. I'm not doing any "virtual object-oriented blah blah" or any kind of types craziness.
You wrote something so vague it can be anything you want (e.g., your magical streams that are also not streams). If we remove the nonsensical (or I guess, by your admission, non-sequitur) streams bit then you're left with this outcome. I told you exactly what I'd do in my post. You have to recognize the type of the result of each step as either a success or an error.
You've chosen to pretend none of this is about error handling, which is pretty boring IMO. Error handling is what makes code actually hard to write.
> What's wrong with my procedural code? Fill in the error handling between the steps as soon as you have a concrete instance for my abstract example.
I admit I don't know, since it seems to mean exactly what you want it to, and no one else is privy to either the viewpoint or the problem you're actually discussing. You're just here hurling vague insults about how everything other than your "obvious" pseudocode is nonsense.
> Dude, I used a pseudo-code for loop.
Not in any example I reviewed that didn't completely ignore the actual problem to burn down a straw man about streaming constructs. Also, don't call me dude. I'm not.
> I've been there and I know to use FP-ish constructs when they make sense (very rarely. Much easier to restrict to procedural altogether).
Considering you didn't recognize the product pattern match, objected vigorously to the idea that it might be one, and tuples are something nearly every statically typed FP dialect has? I gotta confess that I don't think you know the field or the techniques very well. Fair enough; many modern FP techniques in use today are less than 25 years old. No shame there. But you should probably investigate a bit more before speaking so confidently on the subject.
You really are just advocating for doing what the given F# example was doing, without all that helpful syntax or compile-time verification of correctness & totality. I dunno why you'd want that. Given that you are accidentally reusing FP pattern matches, they can't be that complicated now, can they?
> The entire discussion is about railway programming for early exit on error handling. So yeah, there are dependencies there.
"Early" does not have anything to do with "dependency". You can do a million unrelated things in an order of your choice and abort at the first error. There need not be any dependencies.
> returns in functions on modern hardware are atomic over 1 value in all but the most obscure languages, and it's not a new phenomenon.
This is true and totally unrelated to my point.
> No. I'm pointing out you pattern matched against a generic 2-tuple in your pseudocode.
Again, it's pseudo-code, so no, it need not be a "generic 2-tuple"; it doesn't even need to be a "tuple" (whatever that means in a given PL). In a real implementation the types are known precisely (just like the types of the alternatives in an ADT are known precisely). There is no kind of "matching" (there are no conditionals) whatsoever. s1 and s2 are simply two variables (for example pointers, in C) of precisely known type, and the process_data() function returns two values that get assigned to those variables. It's the simplest kind of code.
The whole point of the discussion was whether to design the output as a single stream of ADT values, or as multiple streams of precisely known values. In this context it's totally clear why my (single) assignment to s1 and s2 is not the "matching" that you need to do on the (many) values of the single stream in the ADT version. Really, I don't care how much you want this assignment to s1 and s2 to be a matching/destructuring of a "tuple"; it doesn't change anything and is totally unrelated to my point.
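To make the two designs concrete, here is a hedged F# sketch of both; S1, S2, and the field names are invented for illustration:

type S1 = { A: int }
type S2 = { B: string }

// Design 1: one stream of ADT values; every element is matched as it is consumed.
type Item =
    | FromS1 of S1
    | FromS2 of S2

let processMixed (items: Item list) =
    for item in items do
        match item with
        | FromS1 x -> printfn "S1: %d" x.A
        | FromS2 y -> printfn "S2: %s" y.B

// Design 2: two homogeneous streams; the only "match" is the one-time assignment.
let processSplit (s1: S1 list) (s2: S2 list) =
    for x in s1 do printfn "S1: %d" x.A
    for y in s2 do printfn "S2: %s" y.B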
Sorry to call you "dude". I had already decided to remove that word before reading your answer.
> Again, it's pseudo-code, so no, it need not be a "generic 2-tuple"; it doesn't even need to be a "tuple" (whatever that means in a given PL). In a real implementation the types are known precisely (just like the types of the alternatives in an ADT are known precisely). There is no kind of "matching" whatsoever. s1 and s2 are simply two variables (for example pointers, in C) of precisely known type. It's the simplest kind of code.
That's not different syntactically or even mechanically from pattern matching over a generic 2-tuple. So if you want to do this as a social convention, that's fine. I'm not sure why other people would adhere to such a convention.
> The whole point of the discussion was whether to design the output as a single stream of ADT values, or as multiple streams of precisely known values.
No. The whole point of the talk we're discussing is that a single stream of chained operations that have failure states is difficult to cleanly assemble without ADTs. You've backed away from the error contingency in your argument. To the uncharitable onlooker, this appears to be because ADTs solve the problem elegantly and procedural programming turns into an error-prone, one-off pile of if-then-else spaghetti. See also: every complaint about Golang error handling.
Ignoring the Success-or-Failure nature of the talk is to fundamentally ignore the problem it's trying to solve. If you DID have two streams of values that had no contingency, we wouldn't write code this way in any runtime with async primitives.
> Sorry to call you "dude". I had already decided to remove that word before reading your answer.
> The whole point of the talk we're discussing is that a single stream of chained operations that have failure states is difficult to cleanly assemble without ADTs
And I suggested that this is overly complicated and procedural approaches are better. In procedural code we don't "assemble chained operations". We do them one-by-one and handle errors appropriately.
And we can typically arrange the data and control so that error handling is much easier. That was my point.
> one-off pile of if-then-else spaghetti.
ADTs and FP techniques provide only syntactic sugar to avoid this spaghetti in the source code. But due to that they just encourage having the worst mess of control actually happen on the machine.
On the other hand, I demonstrated how to avoid this mess by decomposing data into individually homogeneous bins/streams/arrays/whatever. My example does not have a single if statement (yes, you might argue there are if statements hidden in the for loops).
I clearly stated that it's often possible to choose this approach.
> And I suggested that this is overly complicated and procedural approaches are better. In procedural code we don't "assemble chained operations". We do them one-by-one and handle errors appropriately.
Go does this. It's widely and I think rightly criticized for this. It's "easy" to do but not "simple" because it's prone to human error.
> And we can typically arrange the data and control so that error handling is much easier. That was my point.
You can do more work to validate in a separate pass if we're discussing data transforms. It's harder to do this for a series of chained imperative steps. The talk details an example making a series of API queries. It's fantastic to handle a CRUD loop with the ADT method because it lets you write a simple specification and collect the errors at the end, while reserving exception handlers for truly exceptional cases like network faults. F# is incredibly good at this kind of flow.
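As a hedged illustration of that flow, with invented placeholder steps rather than the talk's actual code:

type User = { Id: int; Name: string }

let fetchUser (id: int) : Result<User, string> =
    if id > 0 then Ok { Id = id; Name = "example" } else Error "user not found"

let validate (u: User) : Result<User, string> =
    if u.Name <> "" then Ok u else Error "name is empty"

let save (u: User) : Result<User, string> =
    Ok u  // pretend this writes to the database

// Result.bind threads the happy path; the first Error short-circuits the rest.
let handleRequest (id: int) : Result<User, string> =
    fetchUser id
    |> Result.bind validate
    |> Result.bind save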
> ADTs and FP techniques provide only syntactic sugar to avoid this spaghetti in the source code. But due to that they just encourage having the worst mess of control actually happen on the machine.
You've said this multiple times, but it's just not true. The F# code is just a function that calls functions. F#, ML and Haskell compilers all aggressively inline these calls. You pay the exact same costs for validating an error exists in the Golang style, in practice. Indeed, lots of people generate Go from liskov-style templates JUST to get ADTs to make error handling less painful. It's gaining popularity in Swift and Rust as well. Folks there aren't trying to impress people with their FP knowledge, they're cribbing a technique that works and scales well.
> My example does not have a single if statement (yes, you might argue there are if statements hidden in the for loops). I clearly stated that it's often possible to choose this approach.
Your example, by your own admission, doesn't address error-handling flows. You need conditional logic to work with operations that don't always succeed. You say, "Structure it so you never have them." I think a lot of FP people would agree with this (in fact, there is a really cool functional pearl about doing exactly this at scale: https://github.com/matt-noonan/gdp-paper/releases/download/j...) but they'd think about it less as a matter of assurances and "thinking very hard" and more as using methods to make inconsistent code provably false at compile time.
Very few languages can actually encode formal notions of correctness right now. Recently, languages and tools like Idris, Agda, Coq, and TLA+ have come up for formal correctness, where you can actually have your compiler prove your invariants and their implications.
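Even without full dependent types, a lighter-weight version of "make inconsistent code impossible" is available in plain F#; a hedged sketch using a private constructor, with invented names:

// The constructor is private, so a NonEmptyList can only be obtained
// through ofList, which establishes the invariant exactly once.
type NonEmptyList<'a> = private NonEmptyList of 'a list

module NonEmptyList =
    let ofList (xs: 'a list) : NonEmptyList<'a> option =
        match xs with
        | [] -> None
        | _ -> Some (NonEmptyList xs)

    // head cannot fail: emptiness was ruled out at construction time.
    let head (NonEmptyList xs) = List.head xs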
I'm sorry, I ignored the second part of your question. I was responding to:
> What is an example of exception that happens many times per second?
I made a distinction between exceptions and errors because exceptions are a specific mechanism for propagating errors. I wasn't trying to be pedantic, I was trying to point out that the answer to your first question is more obvious if you substitute the word "error" for "exception".
The answer is simply: Because you're probably the one generating the errors and you're going to generate a lot of them.
With regards to the second question:
> And where it could not be easily avoided to step on the error path at all?
I'm not sure how to interpret this question, but I'll throw out an answer anyway. You don't need to take a Railway Oriented Approach. There are other ways. Your reply was an example.
For a streaming scenario, passing a function with the signature ('a -> Result<'b,'e>) to a Stream.map is pretty trivial. It may not be the most performant or scalable, but it's easy to do and easy to reason about. However, one could also easily create two streams as you suggested with a partition-type function with the signature (Stream<'a> -> ('a -> Result<'b,'e>) -> (Stream<'b> * Stream<'e>)).
You're free to use the Result<> type as much or as little as it makes sense. How one should best solve the problem depends on the scope of the problem and what you need to do downstream, right?
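A hedged sketch of that partition over plain sequences, with seq standing in for whatever Stream type is in play:

let partitionResults (f: 'a -> Result<'b, 'e>) (xs: seq<'a>) : seq<'b> * seq<'e> =
    // cache so f runs once per element even though we project twice
    let results = xs |> Seq.map f |> Seq.cache
    let oks = results |> Seq.choose (function Ok b -> Some b | Error _ -> None)
    let errs = results |> Seq.choose (function Error e -> Some e | Ok _ -> None)
    oks, errs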
> I wasn't trying to be pedantic, I was trying to point out...
I agree!
> but it's easy to do and easy to reason about
I agree it's an obvious/intuitive approach to take, just like OOP. However, from a computational and software maintainability standpoint, I think it rarely pays out.
> How one should best solve the problem depends on the scope of the problem and what you need to do downstream
That's right; however, ADTs very rarely occur naturally in what I do (I've dabbled in quite a few areas).
I've found them to occur naturally in parsers (because languages have forks in their syntaxes), but even there I've had great gains by avoiding ASTs altogether and making multiple streams, supported by an ADT stream of simple data that does not contain real data but only "joins" the other ones (it contains only a type flag and a pointer into one of the other streams).
I think ADTs are the right choice whenever the ordering between a set of things with heterogeneous types is very important. If that's not the case, code can be vastly simplified by sorting the data by type and processing them as separate bins, separately and independently.
There is no performance penalty associated with try/catch. At least in JVM languages, try is free (it's a compile-time phenomenon) and throw is also free (mere jumps); it's only the creation of the exception that is expensive, and this can be avoided by pre-allocating an exception (at the cost of no stack-trace info).
Developers are better off in the end with proper exceptions because these sorts of Result objects don't scale. Haskell ultimately embraces proper exceptions, and I suspect Rust will get there too eventually. F# actually has proper exceptions, which makes you wonder why the author is reinventing the wheel badly.
There are different ways to implement exceptions with different performance trade-offs.
The mechanism used by CPython (and IIRC many C++ implementations) involves a per-thread "exception occurred" flag which is checked after every function return (this makes sense performance-wise when essentially every function contains an implicit try/finally in the form of a bunch of Py_DECREF calls or destructors of locally scoped variables).
The same idea can be implemented by returning instances of a special "ExceptionThrown" class and checking for that, which you can in fact do with "if argument is ExceptionThrown, return argument" in the prologue of every function, which is exactly the thing described in the article. (Early versions of dfsch did exactly this.)
Only when you know that in most cases you don't need to do significant cleanup on stack unwind (i.e. you have a tracing GC that directly scans the stack, or you don't allocate that much) does it start to make sense to implement stack unwinding as non-local jumps.
A typical implementation of that maintains a linked list of currently active try/catch/finally sites at runtime (typically as part of the activation records on the control stack). An extension of this idea is that such sites do not necessarily have to unwind the stack before running the recovery code (e.g. Windows' SEH). And then you can have two such lists, one for handling errors and the other for cleanup during stack unwinding (e.g. a CL-style condition system).
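Incidentally, that "check in the prologue of every function" strategy is exactly what the article's bind writes once instead of per function; a minimal F# sketch:

let bind (f: 'a -> Result<'b, 'e>) (x: Result<'a, 'e>) : Result<'b, 'e> =
    match x with
    | Error e -> Error e   // "if argument is ExceptionThrown, return argument"
    | Ok v -> f v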
So the author isn't reinventing the wheel. I should have explained that railway programming isn't supposed to replace try/catch; it's typically best suited for scenarios where errors are not exceptional. It works best when errors occur very regularly, are part of the logic of the system, and where you want to take advantage of F#'s exhaustive pattern matching to ensure developers account for all the scenarios.
In this scenario, performance matters because you want the creation of the error (these aren't exceptions) to be relatively cheap because it is a regular part of the system.
In these scenarios, using exceptions would be abusing exceptions, because exceptions are typically supposed to be exceptional.
For a web server under high load, exceptions are extremely expensive because of the cost of building the stack trace, as mentioned below. The rule of thumb is to reserve exceptions for exceptional circumstances (OOM, unable to connect to the DB, etc.) and to use a pattern like the one in this article for un-exceptional exceptions such as required-field validation errors. Having done this cleanup a few times, it makes a huge difference.
I've had a guy come in on an F# project that used exceptions in a relatively data intensive workflow and rewrite it to use Results like this.
The amount of boxing and unboxing as you go through the various switches can ultimately make this much less efficient than the exception-based flow. The code ended up something like 8 times as slow, and topped out the CPU on beastly boxes, which was surprising to me.
Performance myths like this are really interesting. A feature is usually not inherently responsible for bad performance; the way it is used is. Usage of these features can amplify the effects of bad software architecture, and sometimes a feature just gets elected as the culprit while the real problem lies elsewhere.
If a little boxing and unboxing in a functional language takes 7 times as long as the rest of the code, something must be very very wrong. It's hard to believe that.
In this concrete case: a data-intensive workflow should really have no, or only a few, error situations. The performance of exceptions should not matter at all. I would guess that reworking the code to avoid exceptions also included reworking the program structure.
I don't believe the boxing/unboxing performance claim either. I built a large project in this style (in F# specifically) and the efficiency is not a problem. It's possible said person built a custom implementation that had issues, but if you use ILSpy to decompile how union types end up compiled, you'll see it comes down to a boolean check against an integer followed by a pointer dereference.
OCaml has a very fast exception throwing/handling mechanism, which apparently is why some applications built in OCaml make heavy use of exceptions for control flow (which, as has been mentioned by others, is a nightmare to reason about if you push it too far). The appropriate strategy when porting such code to F#, for example (a highly similar language to OCaml), is to switch to unions in the return type.
The approach advocated by Scott is a general pattern around which you build an entire application. It does marvels when building a system surrounded by others that you cannot trust, and databases with brittle consistency (as most 15-plus-year-old line-of-business databases eventually become).
As for some of the other suggestions advocated in the comments, you have to note that composability becomes a factor. If you are dealing with a collection of entries, each of which might fail or succeed, the railway-oriented programming approach scales to handling collections of errors/successes gracefully, which so many error-handling strategies fail to do.
I have tried designing an application around Haskell's error monads in the past, and it was an absolute disaster. There was so much typing. The types became really hard to understand and the program structure became really rigid. I don't know that it can't be done better, but I've heard prominent Haskell guys say that combining errors (or any monads at all) sucks in practice.
And I've got zero problems with just handling errors procedurally. I realize that errors simply combine very badly: If there are many possible error kinds the best you can do in the end is usually to just quit. That's not what they promised on the tin :-)
So in the end I just write procedural code and I'm careful not to get into situations where many different kinds of errors can happen. I select a few error cases that I care enough about to handle. Because handling even a single kind of error means a lot of complexity on top of a software project.
Yes, I agree it can become unwieldy if it is used throughout an application. I think the sweet spot is to keep the pattern at the application boundary. You can use business objects the way Scott describes them for the internals without threading Either/Result monads throughout your application. Moving calls to services and databases to the application boundary also helps keep the internals (where most of the business logic will reside) clean of Either/Result. For the few things left that could fail in the internals, stick to the exception mechanism and avoid catching. I'm a big believer that bugs should make the application crash.
You're describing a highly sub-optimal solution if there was performance degradation, and lots of boxing/unboxing is a code smell.
The biggest exception penalty is paid only once, when the exception code needs to be JITed, along with more minor performance issues. Dollars to donuts, any CLR language will be faster using not-exceptions, so a refactoring to model the exception path would have to do a fair bit of work in unrelated areas to be slower.
I think you misunderstood his argument. He's stating that favouring a Result-type monad (as opposed to using exceptions that may never occur) leads to boxing of values inside a tag of a union type. Threading exception handling where exceptions are never thrown in practice typically costs little.
However, this is really different from the boxing of value types in a situation where you don't have access to generic collections, for instance, and as such we're talking about a different cost. Unless your entire application is passing integers or floats around (and not proper objects/records), it's highly unlikely the pattern is causing the situation he described.
This just isn't true at all. Exceptions use a totally different mechanism from railway-oriented programming, have very different constraints, and different performance characteristics.
Please don't entertain these silly posts. At best, it only grants legitimacy to a crassly presented idea.
I'm wondering: why not make every returned value implicitly of type Either with Left some generic type?
In most cases, I'm not interested in the types of exception that can be thrown. And when I need the type of an exception, a runtime check suffices in almost all cases.
Let's not make exception handling more cumbersome than it needs to be, and only invoke verbose mechanisms when necessary.
> I'm wondering: why not make every returned value implicitly of type Either with Left some generic type?
Because then you lose the ability to have code that doesn't error. If you explicitly mark those parts of your code that can error, you can then start to decouple those parts that can error from those that will never error. E.g. you can separate the business logic which figures out which database query you need to run (which can't error) from the code that actually performs database queries (which contains timeouts, retry logic etc.), and then you can test those two things separately and make sure you've covered all the cases. Whereas with pervasive/implicit exceptions, every line of your business logic might throw an exception (or might not throw an exception now, but later be refactored to throw one), so you need to test a combinatorial number of cases (or, more likely, your tests won't cover some code paths).
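A hedged sketch of that separation; the Query type and both functions are invented for illustration:

type Query = { Sql: string; Param: int }

// Pure and total: deciding which query to run cannot error.
let queryForUser (userId: int) : Query =
    { Sql = "SELECT * FROM users WHERE id = @id"; Param = userId }

// Only the execution side carries a Result, because only it can fail.
let runQuery (q: Query) : Result<string list, string> =
    try
        Ok []  // pretend this actually talks to the database
    with ex ->
        Error ex.Message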
Wrote a reply to a deleted comment, figured it was worth keeping to expand on my point:
> - what if the tables / schema / etc doesn't exist?
Then figuring out which query to run won't fail, executing the query will fail; the whole point was to separate the two.
> - what if the logic which selects the DB query returns no query?
You make that impossible, if that's not a valid answer. (If it is a valid answer, then you use a return type that represents that - Maybe/Option - and then you're forced to handle it properly.) Most programming languages don't even permit a function to "return no value", so that's already not a problem. (Some programming languages permit a function to return "null" or loop forever; don't use those languages.)
> - what if the logic attempts to pull a value from an out of bounds array (or similar fail)? Sure you would check the bounds before querying the array but why did you get to a situation where you tried to pull an out of bounds index in the first place?
So you don't get into that situation: you only use array indices that are valid by construction, you don't give yourself any way to construct an invalid index. (Ensuring that indices into a specific array are a different type from indices into any other array is a useful first step. Of course, in practice you probably avoid using arrays and indices at all; if you use higher-level operations like map and reduce then there aren't any indices to worry about).
(Very occasionally there may be cases where you can't avoid having to do something that might error, in which case what you want to do is fail-fast, erlang style. But if you're doing what I'm advocating, this will be your only option: the point of all the above is that we make invalid states unrepresentable, so when we get into an invalid state it's simply impossible to continue. And really the vast majority of the time - every time actually, in my experience - you can find a way to write the code so that it can't error, if you actually try).
Realistically, almost all but the simplest code can produce an error, starting from out-of-memory errors.
You can break up code according to many criteria, but doing it based on whether the code throws an error or not seems not very practical, since you can break up code only once.
Further you could have smart test tools that can figure out what exceptions can be thrown in what circumstances. You don't need to bother the programmer with them.
> Realistically, almost all but the simplest code can produce an error, starting from out-of-memory errors.
If you care about handling out-of-memory errors then you absolutely do want to distinguish between code that allocates and code that does not. The vast majority of application code is content to treat out-of-memory as an impossible condition and fail-fast in the case where it does happen. So sure, technically the distinction is between code that can produce errors that you want to handle and code that cannot produce errors that you want to handle, but the point goes through all the same.
> You can break up code according to many criteria, but doing it based on whether the code throws an error or not seems not very practical, since you can break up code only once.
Nonsense. You can, and should, decompose your code along multiple axes, including any effects, of which error handling is indeed only one.
> Further you could have smart test tools that can figure out what exceptions can be thrown in what circumstances. You don't need to bother the programmer with them.
"smart test tools" would amount to an ad-hoc, informally specified, bug-ridden implementation of half a type system. The distinction between code that can error and code that cannot should be as lightweight as possible, but it mustn't be completely elided, because the programmer needs to be able to see it when they're working on the code. The "=" versus "<-" distinction in Haskell-style "do notation" is the best compromise I've seen: it's minimally intrusive, but it is visible.
Yes, I understand. But by "implicitly" I really meant that. You don't see the type, only the compiler does. The programmer only sees the non-error part of the type (the "Right" part of Either).
> Either is a sum type. What if I don't want that.
Product types work as well, if you want. But unless your return type is a monoid (or, like, maybe a semilattice?), you're going to end up with sum types somewhere.
The vast, vast majority of software engineers and students work with sum types every day and they're totally fine with them. Most languages implicitly sum many values with null.
These sound like exceptions rather than errors to me. The difference being: exceptions truly are exceptional and not anticipated, like running out of memory. How do you even handle that? You'd probably let it crash and restart (the Erlang model). Errors are something you anticipate can happen, such as getting a 4xx-5xx back from the server, where you know what to do in those cases.
Then it's fair to say that you definitely can't get an error by adding two numbers.
I think that this uses a very specific model which can be represented with railroads, but also any other 1-way system. For example, you could say it's like hanging from a glider where you travel in a happy path, or can fall down at any point and keep travelling on foot on a failure path. This analogy is not much better but I feel like there's a better analogy out there than railroads.
IMO the railway analogy works well, because it describes the right dimensionality. It's sort of a 1.5D problem with a directed graph, with a chain of events plus a fallback chain. A railway captures that perfectly while giving people a clear picture they can understand, assuming they imagine a train that can only move in one direction and decides its path based on switches on the track. With something like a glider you could ask questions like "but what if it flies off to the East" or "what about vertical air movements that carry it up or down" and so on...
I agree that's a problem with the glider image, but that's the issue I had with railroads as well: there's much more freedom than what is shown here; railroads can go multiple ways, join in various orders, etc. What I mean is, one line change in a given direction does not imply subsequent ones will be in the same direction.
What railroads evoke is that there's a very limited number of paths you can take. A glider calls to mind just the opposite - complete freedom in where you fly to.
It is time someone wrote an essay arguing for non-oriented programming (NOP?) and requirements driven development (RDD?). In my darker moments, I tend to see some of these trends in the same light as certain diet fads...
This style of programming has existed for about as long as Python has, and has seen substantial improvements both in classical environments (ML & Haskell) and experimental environments (Idris and Agda).
It's also increasingly popular in languages with less exclusively academic DNA. Swift's community is considering adopting a moral equivalent. Rust uses a very Haskell-ish variant. Golang uses a related variety cribbed from Common Lisp called "product-type errors" (as opposed to "sum-type errors" used here).
If this is a fad, then so is Python, Golang, and Java.
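The product-versus-sum distinction can be sketched as two F# type abbreviations (the names are invented):

// Product-type errors, Go style: both slots always exist, and checking
// the error slot is a convention the compiler does not enforce.
type ProductError<'a, 'e> = 'a * 'e option

// Sum-type errors, as in the article: exactly one case exists, and
// exhaustive matching forces you to handle both.
type SumError<'a, 'e> = Result<'a, 'e>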
Common Lisp multiple values are not used for error handling as far as idiomatic code is concerned. They also work differently, being symmetrical to optional parameters.
It's not common, but it's doable and Golang copied the approach for its error handling, which is what I was referring to more directly.
But, people absolutely use MVB in cases where an ADT would work. Especially in FFIs where CL's restarts would make the underlying bindings way more complex or underperform. I can go find some examples when I'm off hours, if you like.
I'm surprised there's not a single mention of JavaScript promises in the post. This is basically what they are, and their syntax naturally imposes this technique as a solution to most use-cases.
Here [0] is an example of how I've used this technique in functional JavaScript code. Please don't judge too hard as this was my first JS library 2+ years ago... certainly lots of problems in there (although it has been running untouched in production since then!)
What they are is monads - the post just avoids using the term so as not to scare people off. Monads generalise the railway-oriented style between error-handling, async, and many other things. That's their whole point, and that's why they (and functional programming generally) were a big deal years before JavaScript even existed.
One thing that is nice about this site in particular, is that it goes out of its way to avoid the kind of FP jargon that confuses the hell out of most people. Sometimes the analogies strain a little bit, but at least for me, it's a thousand times more approachable than descending into math-babble.
I find the "FP jargon" and "math-babble" to be annoying, but not nearly annoying as the sort of infantilization sites like this engage in. It's a common trope to "dumb down" complex topics into cartoonish explanations, but in reality it only makes the content more confusing, while simultaneously insulting its readers by treating them like children.
No, I don't need a "cartoon intro to DNS over HTTP" to understand basic networking concepts. Nor do I need a story about "Dr Frankenfunctor and the Monadster" to understand functional programming concepts. Even when I was a teenager reading about OOP, I didn't want to read about the "Car class inheriting from the Vehicle class" or the "Dog class inheriting from the Animal class."
When reading technical content, I always prefer prose that treats its readers as professionals, rather than five-year-olds incapable of understanding complexity without resorting to vacuous and incomplete analogies. The problem is that complex topics are complex. By their very nature they rarely fit into convoluted analogies, and would be better explained by simple use cases and examples.
Yeah, this website doesn't dumb things down. That page is an exception really. It just deliberately avoids using the common FP words that don't really have any meaning to 99% of programmers.
As this huge, detailed, functional programming resource demonstrates, it's perfectly possible to write clear and detailed functional programming guides without that jargon.
This criticism doesn't really apply to the site under discussion...
F# for Fun and Profit has oodles of pragmatic, every-day help and information, in addition to a repository of relatively 'esoteric' FP issues (particularly on .NET), for work-a-day coders. It's a deeply accessible guide and does an outstanding job of reframing techniques unknown to OOP devs in terms they understand, while leaving deeper mathematical issues to their appropriate forums.
I'm also gonna take issue with calling FP terms "math-babble". It's just called "math" ;)
A promise covers the async case, but railway as presented works in sync functions too.
Promises also sorta recursively build up and tear down (IIRC), while railway is a "pass along" situation: each function that generates an error calls the next function with the error, which then simply passes it along (or handles it).
Here, each of the functions can return a result (async or not) or reject with Promise.reject or by throwing. If any of the functions rejects, the rest of the functions in this "happy path" are not called.
Each .then replaces the promise before it, instead of wrapping the source promise around it. At each step you have a resolved value; with a monad you'd have to .flatMap, because mapping a monad to a monad nests it, which you then have to flatMap to extract the boxed values. Compare rxjs streams, which are monads, to promises, which are functors. Monads are functors but not vice versa.
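The nesting being described, translated to a hedged F# example with Option: map nests, bind (the flatMap above) flattens.

let parse (s: string) : int option =
    match System.Int32.TryParse s with
    | true, n -> Some n
    | _ -> None

// mapping with a function that itself returns an Option nests the type...
let nested : int option option = Some "42" |> Option.map parse

// ...while bind flattens as it goes.
let flat : int option = Some "42" |> Option.bind parse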
Not sure what you mean by "build up and tear down" but in what way do promises not pass errors to the next function in the chain?
Consider code like this:
const p1 = Promise.resolve('hello'); // some starting promise, so the snippet runs
p1
  .then((value) => {
    console.log(value);
    return Promise.reject('oh, no!');
  })
  .catch((e) => {
    console.log(e);
    // Handle the error if you want, and keep going
    return Promise.resolve();
  })
  .then(() => {
    console.log('This still gets executed');
  });
How is this not amenable to the railway pattern? You define how you handle the error within a function in the promise chain. Error handlers are fully composable with success handlers, which seems to be the important element necessary for implementing the railway pattern.
That's not quite true. `then` is a weird hybrid of `map` and `bind`. If its callback returns a Promise it acts like `bind` (you don't get a nested Promise), but otherwise it acts like `map`.
And to be pedantic, Promise is neither a functor nor monad because `then` doesn't follow the necessary laws.
I know you're being facetious, but given that Bash functions have return values and also process stdin/stdout, they could be adapted to implement the railroad pattern, yes?
It's not quite the same thing ... the Either type is just a simple value which you can think of as having a boolean saying "is this a successful or unsuccessful result?", and, depending on the success, attaching different data. E.g. the result of parsing JSON from a string would be "successful: true + the parsed object" or "successful: false + the error message".
The railway oriented approach is then looking at neat ways of working with these kind of results in the place of try/catch when dealing with errors.
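A hedged F# sketch of exactly that tagged value, using Result; System.Text.Json is just a convenient parser to demonstrate with:

open System.Text.Json

// Ok carries the parsed document; Error carries the message. One value,
// one tag, and no try/catch needed at the call sites.
let parseJson (s: string) : Result<JsonDocument, string> =
    try
        Ok (JsonDocument.Parse s)
    with ex ->
        Error ex.Message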