I've read the proposals for it. It certainly looks good, yet exception handling looked good 30 years ago, too. I haven't used sum types myself, and often it takes years to discern whether things are really good ideas or not.
What I personally use is the "poisoning" technique. This involves marking an object as being in an error state, much like a floating point value can be in a NaN state. Any operation on a poisoned object produces another poisoned object, until eventually this is dealt with at some point in the program.
I've had satisfactory success with this technique. It does have a lot of parallels with the sum type method.
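To make the idea concrete, here is a minimal C sketch of a poisonable value; the pvalue_t type and its names are made up for illustration, not from any particular codebase:

#include <stdbool.h>

/* Hypothetical poisonable value: either a valid double or a "poisoned" one. */
typedef struct {
    bool poisoned;
    double value;
} pvalue_t;

/* Any operation on a poisoned input yields another poisoned value. */
pvalue_t pvalue_add(pvalue_t a, pvalue_t b)
{
    if (a.poisoned || b.poisoned)
        return (pvalue_t){ .poisoned = true, .value = 0.0 };
    return (pvalue_t){ .poisoned = false, .value = a.value + b.value };
}

The caller then checks `.poisoned` once, at the point where the error is actually dealt with, rather than after every operation.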
I'm experimenting with a solution in C3 that has this behaviour. I don't have a sum type as such, but the binding acts as one. I call the binding a "failable".
int! a = getMayError();
// a is now either an int, or contains an error value.
// foo(a) is only conditionally invoked.
int! b = foo(a);
// The above works as if it was written:
// int! b = "if a has error" ? "the error of a" : foo("real value of a");
// A single if-catch with return will implicitly unwrap:
if (catch err = b) {
    /* handle errors */
    return;
}
/* b is unwrapped implicitly from here on and is treated as int */
if (try int x = a) {
    /* conditionally execute if a isn't an error */
}
I think this more or less formalizes the poison technique, but external to the call (that is, "foo" does not need to know about "failables" or return one; the call is simply skipped on the caller side). Here are some more examples: http://www.c3-lang.org/errorhandling/
I'd be interested in hearing what you think about this (experimental) solution Walter.
So you’ve built in monadic bind for the Either monad into the language:
Right x >>= f = f x     -- normal case
Left y  >>= f = Left y  -- error propagation case
(The slogan is “monadic bind is an overload for the semicolon”.)
I don’t expect this knowledge will dramatically change what you’re doing, but now that you know that’s what some people call it, you have one more place to steal ideas from :)
No, I'm quite aware of this. It's a restricted, implicit variant of it. But note also that it's not the type but the binding, which makes it slightly different from using a `Result`.
Hm. OK. I tried writing a response several times but I still feel confused. Can you explain what you mean by “not the type but the binding”? Note that I know the Haskell but not the Rust (guessing from the “Result” name) way of working in this style.
(Not necessarily relevant or correct thoughts:
- Your language still seems to mark potentially-failed values in the type system, even if it writes them T! rather than Either Error T or Result<T, Error>;
- The way Haskell’s do-notation [apparently implemented as a macro package in Rust] is centred around name binding seems very close to what you’re doing, although it [being monadic, not applicative] insists on sequencing everything, so fails the whole block immediately once an error value occurs;
- Of course, transparently morphing a T-or-error into a T after a check for an error either needs to be built into the language or requires a much stronger type system; Haskell circumvents this by saying that x <- ... either gives you a genuine T or returns failure immediately, which is indeed not quite what you’re doing.)
What I mean by saying "it's a binding" is that it is a property of the variable (or return channel of a function) rather than a real sum type. Consequently it does not participate in any type conversions and you cannot pass something "of type int!" because the type does not exist.
Here is an example:
int! x = ...
int*! y = &x;
int**! z = &y;

// If `!` had been part of the type, the above would instead have read:
// int!* y = &x;
// int!** z = &y;

// As a binding,
// int*! y = &x;
// means
// int*! y = "if x is err" ? "error of x"
//                         : "the address holding the int of x"
This also means that `int!` can never be the type of a parameter, nor the type of a member inside a struct or union.
The underlying implementation is basically that for a variable `int! x` what is actually stored is:
// int! x;
int x$real;
ErrCode x$err;

// int*! y;
int* y$real;
ErrCode y$err;

// y = &x;
if (x$err) {
    y$err = x$err;
} else {
    y$real = &x$real;
}

int z;
// y = &z;
y$err = 0;
y$real = &z;
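To tie this back to the point about the call being skipped on the caller side, here is a rough C-style sketch (not actual C3 output; the underscore names just stand in for the $-mangled ones above, and `foo` is assumed to take and return a plain int) of how `int! b = foo(a);` could lower:

typedef int ErrCode;

int foo(int v); /* plain function: knows nothing about failables */

void example(int a_real, ErrCode a_err)
{
    int b_real = 0;
    ErrCode b_err;

    // int! b = foo(a);
    if (a_err) {
        b_err = a_err;        /* call skipped: a's error just propagates to b */
    } else {
        b_real = foo(a_real); /* normal case: foo only ever sees a real int */
        b_err = 0;
    }

    (void)b_real;
    (void)b_err;
}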
The resulting semantics are different from what they would be if `int!` had been something like
struct IntErr {
    bool is_err_tag;
    union {
        int value;
        ErrCode error;
    };
};
That is how a Result-based solution would work. In such a solution:
int! x = ...;
int!* y = &x; // Ok
int z = ...;
y = &z;       // <- Type error!
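For comparison, here is roughly what that Result-style representation looks like when written out in plain C (illustrative only; the commented-out assignment is the line that would be rejected):

#include <stdbool.h>

typedef int ErrCode;

/* Result-style representation: tag and payload travel together as one type. */
typedef struct {
    bool is_err_tag;
    union {
        int value;
        ErrCode error;
    };
} IntErr;

int main(void)
{
    IntErr x = { .is_err_tag = false, .value = 42 };
    IntErr *y = &x;   /* Ok: pointer to the whole tagged value */
    int z = 7;
    /* y = &z; */     /* Type error: an int* is not an IntErr* */
    (void)y;
    (void)z;
    return 0;
}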
You should give sum types a serious try, btw. They're unambiguously good, and have been in use for at least the last 40 years. To me, not having them is an immediate disqualifier for a modern static language (along with some basic form of pattern matching, which goes hand in hand with them).
So that sounds like the way invalid floating point operations give NaN, and then the NaN propagates everywhere. I've always found this super annoying because it's often hard to figure out where the NaN comes from. Does your solution differ from this in a way that's less annoying?
The FPU does not include the source in the NaN, but that doesn't mean your own objects can't.
What I do is have the error reported at the source, and then return the poisoned object. A better way would possibly be to put the error message in the poisoned object and report the error somewhere up the call stack.
typedef struct
{
    err_t error;
    int error_line;
    char *error_msg;
    ...
    ...
} thing_t;

// set out of range error
thing->error = THING_ERROR_OOR;
thing->error_line = __LINE__;
thing->error_msg = "outofrange";
You can grep on 'outofrange' and find where the error was set.
I originally started doing that to mark 'bad' analog readings in process control equipment. I wrote my filters and control loops to be able to 'eat' occasional bad readings without barfing. Worked very well.
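Not the original code, of course, but a C sketch of what a filter that "eats" bad readings might look like (reading_t, filter_t and the smoothing factor are made-up illustrations):

#include <stdbool.h>

/* Hypothetical analog reading that can be flagged as bad (poisoned). */
typedef struct {
    bool bad;
    double value;
} reading_t;

/* Simple exponential-smoothing filter state. */
typedef struct {
    double avg;
    bool primed;
} filter_t;

/* A bad reading is simply skipped; the filter holds its last output
 * instead of letting the poisoned value corrupt its state. */
double filter_update(filter_t *f, reading_t r)
{
    if (!r.bad) {
        if (!f->primed) {
            f->avg = r.value;
            f->primed = true;
        } else {
            f->avg = 0.9 * f->avg + 0.1 * r.value;
        }
    }
    return f->avg;
}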
> What are the return values from a poisoned object’s methods?
That's up to you. You can do it as:
1. return a poisoned value
2. return a safe value, like `0` for its size
3. treat it as a programming bug, and assert fail
4. I know `null` is hated, but it is the ultimate poisoned value. Try to call a method on it, and you are rewarded with a seg fault, which can be considered an assert fail.
5. design your poisoned object to be a real object, with working methods and all. It can be the same as the object's default initialized state.
In other words, it's necessary to think about what the poisoned state means for your use case. I use all those methods as appropriate.
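As an illustration of options 2 and 3 from the list above, a small C sketch with a made-up container type:

#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical poisonable container. */
typedef struct {
    bool poisoned;
    size_t len;
} list_t;

/* Option 2: return a safe value, like 0 for its size. */
size_t list_size_safe(const list_t *l)
{
    return l->poisoned ? 0 : l->len;
}

/* Option 3: treat calling into a poisoned object as a programming bug. */
size_t list_size_strict(const list_t *l)
{
    assert(!l->poisoned);
    return l->len;
}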