
Considering how difficult it is to handle errors in C++, I am not sure you can say it is just as safe as "any" language. Exceptions can be used, except that they cannot propagate out of destructors, they should not propagate out of constructors, and there is no standardized way to retry the operation that threw the exception.

"you need to find a good tutorial that doesn't go low level before you need it."

In other words, don't program in C++ at all. The default numeric type in C++ is fixed-width (int or floating point), the default string type is a primitive pointer (const char *), and you still see low-level pointer types tossed around as iterators (e.g. for the standard vector class).




The iterator for a vector is a class in every mainstream implementation... although quite a thin wrapper around a pointer, it is still a class (the standard would also permit a plain pointer here).

The difficulty of handling errors in C++ generally comes from bad API design left over from the C days. Exceptions used correctly make error handling exceptionally easy.


Your criticisms would be well taken if they were actually true:

>Exceptions can be used, except that they cannot propagate out of destructors

Exceptions are perfectly well allowed to propagate out of destructors. A destructor throwing an exception is only considered ill-advised because destructors are called during stack unwinding when another exception is propagating, and throwing an exception during another exception causes program termination. And object destruction is largely incapable of truly failing anyway: there will be no state remaining to be left inconsistent, because the object is going away -- unless the destructor is interacting with an object not being destroyed, in which case the error actually lies with the other object, and information about the error can be stored with that object or in an error queue and handled in due course after the stack unwinding is complete.

Moreover, what's your solution to the problem? If the destructor is doing nothing but deallocating memory (and calling other destructors that only deallocate memory) then it never needs to throw and there is no problem. If it's doing something else then it's either doing something that another language wouldn't allow because it has no destructors (in which case you can refrain from doing it in C++ as easily as you can switch to one of those languages), or the other language has an equivalent to destructors and then will have the same issue with throwing during exception propagation. Do you see some solution to having it both ways?

>they should not propagate out of constructors

Rubbish. Constructors don't even have return values. Throwing exceptions is the canonical way of indicating construction failure.

The primary benefit of non-throwing constructors is that they allow certain performance optimizations. For example, if a constructor throws during a std::vector resize operation, the vector class will undo everything that had been done during the resize so as to leave the vector in a consistent state. This is easy with copy constructors: just destroy the copies and keep the originals. But if vector used the move constructor to move the existing elements to the newly allocated internal array, the originals would no longer be valid. So vector's resize will only use the move constructor instead of the copy constructor if the move constructor is declared noexcept (or the compiler can statically determine that it doesn't throw), since the alternative would violate the vector's ability to maintain a consistent state when a move constructor throws an exception.
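
To see the optimization in action, here's a minimal sketch (widget is a made-up class; the noexcept-driven selection between copy and move during reallocation is what std::move_if_noexcept does under the hood):

  #include <iostream>
  #include <vector>

  struct widget {
	widget() = default;
	widget(const widget&) { std::cout << "copy\n"; }
	// remove noexcept here and vector falls back to copying on reallocation
	widget(widget&&) noexcept { std::cout << "move\n"; }
  };

  int main() {
	std::vector<widget> v(2);
	v.reserve(v.capacity() + 1); // force reallocation: prints "move" twice
  }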

>there is no standardized way to retry the operation that threw the exception.

How about catching the exception, addressing it, and retrying the operation in the try block?
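
Something like this, say (a minimal sketch; do_work is a made-up stand-in for any operation that can fail transiently):

  #include <cstdio>
  #include <stdexcept>

  void do_work() { // hypothetical: fails the first two times
	static int calls = 0;
	if(++calls < 3) throw std::runtime_error("transient failure");
  }

  int main() {
	for(int attempt = 1; ; ++attempt) {
		try {
			do_work();
			std::puts("succeeded");
			break;
		} catch(const std::runtime_error&) {
			if(attempt == 3) throw; // give up after three tries
			// otherwise address the cause here, then loop to retry
		}
	}
  }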

What do other languages do that you feel is superior?

> The default numeric type in C++ is fixed-width (int or floating point)

Why do you believe this to be a serious limitation? The range of a 64-bit type is more than sufficient for the overwhelming majority of applications and for the few remaining with specialized needs (you know who you are, cryptographers and mathematicians), arbitrary precision libraries are readily available.

> the default string type is a primitive pointer (const char *)

The default string type is std::string, which can even be used with the bulk of the C library through the member function string::c_str(), which provides a "safe" temporary const char array for passing to C library functions.
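
For example (a minimal sketch; the pointer from c_str() stays valid only until the string is modified or destroyed):

  #include <cstdio>
  #include <string>

  int main() {
	std::string path = "/tmp/example.txt";
	// c_str() yields a NUL-terminated const char* for C APIs
	std::FILE* f = std::fopen(path.c_str(), "r");
	if(f) std::fclose(f);
  }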


"Exceptions are perfectly well allowed to propagate out of destructors"

...and the default behavior is program termination.

"destructors are called during stack unwinding when another exception is propagating, and throwing an exception during another exception causes program termination"

That was true in C++98. In C++11, destructors are implicitly noexcept, so the default behavior is to call std::terminate if an exception propagates out of a destructor at all. The programmer must explicitly override this (by declaring the destructor noexcept(false)) to get the C++98-style behavior, under which a destructor exception thrown during stack unwinding still terminates the program (not obviously better or worse than unconditional termination).
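
To make the C++11 rule concrete, a minimal sketch (noisy is a made-up class; noexcept(false) is what opts the destructor back in to propagation):

  #include <cstdio>

  struct noisy {
	// without noexcept(false), throwing here calls std::terminate in C++11
	~noisy() noexcept(false) { throw 42; }
  };

  int main() {
	try {
		noisy n;
	} // n is destroyed as the block exits; the exception propagates
	catch(int e) {
		std::printf("caught %d\n", e);
	}
  }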

This does not need to be a problem. If the stack were unwound after the catch block exits (or if it explicitly unwinds the stack), there would be no double exception faults. This need not be inefficient -- it need not cost any more than exceptions do right now -- and it has already been implemented in other languages (e.g. Common Lisp).

"what's your solution to the problem?"

See above.

"If the destructor is doing nothing but deallocating memory (and calling other destructors that only deallocate memory) then it never needs to throw and there is no problem"

Except that destructors can free other resources, and some of those other resources might throw exceptions when they are freed. A destructor might, for example, close a file; if the file cannot be synchronized when it is closed, an exception should be thrown (but the C++ standard actually requires that such an exception be ignored; after all, the programmer's job is to manually close files, right?).
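
Hence the usual advice to close explicitly if you care about the error; a minimal sketch (whether close() actually fails here depends on the platform and the file system):

  #include <fstream>
  #include <iostream>

  int main() {
	std::ofstream f("out.txt");
	f.exceptions(std::ofstream::failbit | std::ofstream::badbit);
	f << "data";
	try {
		f.close(); // an explicit close can still report failure
	} catch(const std::ios_base::failure& e) {
		std::cerr << "close failed: " << e.what() << '\n';
	}
	// had we let the destructor close the file instead,
	// any error would have been silently swallowed
  }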

"If it's doing something else then it's either doing something that another language wouldn't allow because it has no destructors (in which case you can refrain from doing it in C++ as easily as you can switch to one of those languages), or the other language has an equivalent to destructors and then will have the same issue with throwing during exception propagation. Do you see some solution to having it both ways?"

See above! There should not be any program code that cannot safely throw an exception to indicate an error, and the solution is to change the order of stack unwinding and catch block execution. C++ is not the only language with a double-exception problem; Java and Python have this problem also (e.g. in Java, if you have try...finally without any catch and the finally block throws an exception). The other benefit of this approach, aside from solving the double exception problem, is that it makes exceptions more useful by making "restarts" possible: the catch block can resume execution from some point defined by the thrower of the exception, which helps a lot with encapsulation (if e.g. the error can be corrected, but only the client code can correct it; for example, writing to an external disk that was disconnected before the operation could complete).

The only complication is that the catch block needs a stack of its own, and that the catch block needs to have a pointer to the stack frame it should return to if the rest of the stack should be unwound. These are not major challenges, certainly not by comparison with the other challenges C++ compiler writers need to deal with.

"Rubbish. Constructors don't even have return values. Throwing exceptions is the canonical way of indicating construction failure."

Oh, not having return values justifies using exceptions? Destructors don't have return values. How do destructors indicate failures?

Constructor exceptions are bad because they can cause program termination when objects are constructed in the global scope. Global objects are generally bad, of course, since their construction order is undefined, but they are allowed and they are sometimes used by the standard (e.g. cin, cout). This is not a hard one to solve: give programmers a way to catch exceptions that are thrown before main is called or after main exits.

"How about catching the exception, addressing it, and retrying the operation in the try block?"

Except that forces the client code to understand how to retry an operation, which breaks encapsulation and makes error recovery more complicated (if the exception is thrown in the middle of writing some record, you now need a way to roll back the write, figure out where the incomplete record begins, or figure out where the last write operation failed).

"What do other languages do that you feel is superior?"

Common Lisp's "conditions" system, for one, or any language that supports continuations.

"The range of a 64-bit type is more than sufficient for the overwhelming majority of applications"

Except that integer overflow vulnerabilities are common and problematic, and occur with 64-bit integer types. Saying that 64 bits is enough for any application is kind of like saying that 640k is enough RAM for any application. If you want to use integer arithmetic, use the integers -- which means arbitrary width, or else an exception being thrown if you try to represent an integer that is too big.

"for the few remaining with specialized needs (you know who you are, cryptographers and mathematicians), arbitrary precision libraries are readily available."

Having code that does not do unexpected things is not exactly a "specialized need." It is pretty easy for a programmer to forget that the sum of two positive numbers could actually be negative, or that the product of two positive numbers could be positive but less than the two factors. Most people think of integer arithmetic in terms of integers, not two's complement, even when they are programming in a language with fixed-width arithmetic -- and it takes extra mental effort to remember that you are not really dealing with integers (mental effort that could be used for other things).
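
The check itself can be cheap if the compiler helps; a minimal sketch (assuming GCC or Clang, since __builtin_add_overflow is a compiler extension, not standard C++):

  #include <cstdint>
  #include <cstdio>

  int main() {
	int64_t a = INT64_MAX, b = 1, sum;
	// returns true when the mathematical result does not fit,
	// instead of silently wrapping around to a negative value
	if(__builtin_add_overflow(a, b, &sum))
		std::puts("overflow detected");
	else
		std::printf("%lld\n", (long long)sum);
  }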

Sure, you can use a library -- but then you need to be explicit about wanting arbitrary precision, and you don't get any automatic optimization from your compiler (e.g. if your compiler can prove that some value will always be in a particular range, it might remove an arbitrary precision operation and replace it with a fixed-width operation without creating a risk of an overflow).

"The default string type is std::string"

Unless you declare a string constant, in which case you get a pointer to an array of characters. C++11 conveniently adds yet another way to get a pointer to an array of characters, "raw" string literals (at least it is appropriately named), on top of the typical double-quoted string and the C++03 wide-character string literal notation.

Even with std::string, bounds checking is not guaranteed. operator[] won't check bounds. String iterators can be incremented past the bounds of the string without any exceptions being thrown. You can even try to dereference the placeholder iterator returned by string::end(), with conveniently undefined behavior (would any definition actually make sense for some application or platform?). You have at(), of course, which will do bounds checking but which breaks the [] syntax you get with C-style strings, arrays, maps, and other types (at least you have the same interface as std::vector, which might be useful).
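
To illustrate the asymmetry, a minimal sketch:

  #include <iostream>
  #include <stdexcept>
  #include <string>

  int main() {
	std::string s = "abc";
	// s[10] would be undefined behavior: no check, no exception
	try {
		char c = s.at(10); // at() checks bounds and throws
		(void)c;
	} catch(const std::out_of_range& e) {
		std::cout << "caught: " << e.what() << '\n';
	}
  }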

This is the sort of brittleness that characterizes C++ and that makes reliable code hard to write. Everything looks great, until you step over the poorly-indicated boundary that separates "good for C++" and "you'll regret this later."


>...and the default behavior is program termination.

Because it's generally a bad idea. But you indicated it couldn't be done, which is different.

>This does not need to be a problem. If the stack were unwound after the catch block exits (or if it explicitly unwinds the stack), there would be no double exception faults.

But it would bring all kinds of new trouble:

  #include <mutex>
  #include <stdexcept>

  void lock_and_throw(std::mutex& m)
  {
	std::lock_guard<std::mutex> lock(m);
	throw std::runtime_error("failed while holding the lock");
  }
  class mutex_wrapper
  {
  private:
	std::mutex *m_ptr;
  public:
	mutex_wrapper() {
		m_ptr = new std::mutex;
		try {
			lock_and_throw(*m_ptr);
		} catch(...) {
			// undo incomplete construction as destructor will not be called
			delete m_ptr;
			throw;
		}
	}
	~mutex_wrapper() { delete m_ptr; }
  };
If you unwind the stack after the catch block, the lock_guard destructor inside of lock_and_throw() gets executed after the mutex has already been deleted. So either the mutex_wrapper constructor can't delete m_ptr from the catch block (and then where is it supposed to do it when it wants to re-throw the exception?) or lock_and_throw can't assume that an otherwise valid argument passed to it won't be deleted out from under its stack variables' destructors whenever any exception is thrown.

And if you run the catch block and then go back and unwind the stack, it still leaves the question of what happens when a destructor throws during stack unwinding. Do you call the same catch block again (and then have to check for possible double free etc.), or do you let an exception that was surrounded by catch(...) escape out to the calling function just because it was already called once? Neither seems pleasant.

>Constructor exceptions are bad because they can cause program termination when objects are constructed in the global scope. Global objects are generally bad, of course, since their construction order is undefined, but they are allowed and they are sometimes used by the standard (e.g. cin, cout). This is not a hard one to solve: give programmers a way to catch exceptions that are thrown before main is called or after main exits.

That's a fair suggestion for how to improve the language, but to go from there to arguing that constructors shouldn't throw exceptions doesn't follow. The conclusion should be instead that global variables shouldn't be instantiated using constructors that throw exceptions unless program termination is the desired result when such an exception occurs -- which it actually is in most cases, because having uninitialized global variables floating around is unwise to say the least.

Also, if it should ever become an actual problem in a specific instance, there is always the possibility of initializing the object with a different, non-throwing constructor and then having main reinitialize it on startup and catch any exceptions.

>Saying that 64 bits is enough for any application is kind of like saying that 640k is enough RAM for any application.

I didn't say any application. There are obviously some applications that require arbitrary precision. But most applications never need to count as high as 9,223,372,036,854,775,807. It's a pretty big number. If you counted a billion objects a second you still wouldn't get there for hundreds of years. So why pay for it constantly if you so rarely need it?

>Sure, you can use a library -- but then you need to be explicit about wanting arbitrary precision, and you don't get any automatic optimization from your compiler (e.g. if your compiler can prove that some value will always be in a particular range, it might remove an arbitrary precision operation and replace it with a fixed-width operation without creating a risk of an overflow).

The cost of defaulting to arbitrary precision, in the cases where the compiler can't optimize it out, is guaranteed to be worse than the cost of the compiler optimizing less often when you've specifically asked for arbitrary precision. Moreover, who says the compiler can't optimize out the arbitrary precision library calls in cases where the values are known at compile time just because the library isn't an official part of the language? The calls are likely to be short enough to be inlined and then if all the values are static the optimizer has a good shot at figuring it out.

>Unless you declare a string constant, in which case you get a pointer to an array of characters.

String literals as const char arrays are mostly harmless because the compiler ensures that they're "right" -- if they're const then you can't accidentally go writing past the end because you can't accidentally write at all, and the compiler puts the '\0' at the end itself so that silly functions that keep reading until they find one will find it before running out of bounds. And likewise for c_str() from std::string.

Moreover, bounds checking is pretty easy if you want it:

  template<class T, class E = typename T::value_type>
  class safe : public T
  {
  public:
	E& operator[](size_t index) {
		if(index < this->size())
			return T::operator[](index);
		else
			throw std::out_of_range("index out of bounds");
	}
	// and so on for other common accessor functions
  };

  safe< std::basic_string<char> > s;
  safe< std::vector<int> > v;
  // etc.
But naturally then you have to pay for it in performance.


"If you unwind the stack after the catch block, the lock_guard destructor inside of lock_and_throw() gets executed after the mutex has already been deleted. So either the mutex_wrapper constructor can't delete m_ptr from the catch block (and then where is it supposed to do it when it wants to re-throw the exception?) or lock_and_throw can't assume that an otherwise valid argument passed to it won't be deleted out from under its stack variables' destructors whenever any exception is thrown."

That sounds like the sort of memory management problem that smart pointers are supposed to solve. It looks like a smart pointer would be exactly the sort of thing you would want here: when and if the stack is ultimately unwound, the lock_guard object will be destroyed first, then the smart pointer (because unwinding a constructor will cause the destructors of member objects to be invoked; note that your code throws the exception out of the constructor, and so the stack would have to be unwound at a higher level catch). The problem is not with unwinding the stack at the end of the catch block (which would not even be reached in your example, because of the throw in the catch block); the problem is that you explicitly deleted the pointer before the stack would have been unwound.

"And if you run the catch block and then go back and unwind the stack, it still leaves the question of what happens when a destructor throws during stack unwinding. Do you call the same catch block again (and then have to check for possible double free etc.), or do you let an exception that was surrounded by catch(…) escape out to the calling function just because it was already called once? Neither seems pleasant."

Really? It sounds like the second option would be what you would want: if the stack was unwound implicitly after the body of the catch block had executed, and unwinding the stack caused an exception to be thrown, then the exception was thrown out of the catch block. How is that unpleasant?

"The conclusion should be instead that global variables shouldn't be instantiated using constructors that throw exceptions unless program termination is the desired result when such an exception occurs -- which it actually is in most cases, because having uninitialized global variables floating around is unwise to say the least."

Except that a failure to initialize should at least be reported to the user. Maybe you could not get a network connection, or you could not allocate memory, or there is a missing file -- whatever it is, the user should know, and the thrower of the exception should not be responsible for telling the user. If you have restarts, you get something better -- you get the chance to try the operation again, which might be good if you have a long start-up process.

"Also, if it should ever become an actual problem in a specific instance, there is always the possibility of initializing the object with a different, non-throwing constructor and then having main reinitialize it on startup and catch any exceptions."

In other words, all classes should provide a non-throwing constructor and an initialization routine, because any object might be constructed in the global scope.

"There are obviously some applications that require arbitrary precision. But most applications never need to count as high as 2,305,843,009,213,693,952."

It is not just about counting high. If an application adds or multiplies two numbers, there is a chance of an overflow. If the application is reading one of the operands from the user, that overflow could be a security problem -- such problems are frequently reported.

"So why pay for it constantly if you so rarely need it?"

You don't have to pay for it constantly; you can have fixed-width types as something the programmer explicitly requests, or as something the compiler generates as an optimization. The real question is, why should the default type be the least safe, and why should programmers have to work harder to get a natural and safe abstraction?

"who says the compiler can't optimize out the arbitrary precision library calls in cases where the values are known at compile time just because the library isn't an official part of the language? The calls are likely to be short enough to be inlined and then if all the values are static the optimizer has a good shot at figuring it out."

Can you name a C++ compiler that does this?

"String literals as const char arrays are mostly harmless because the compiler ensures that they're "right" -- if they're const then you can't accidentally go writing past the end because you can't accidentally write at all"

You can accidentally read past the end, and you can accidentally print what you read. That can cause a lot of problems. There is no requirement that people use the standard library to iterate through a string.

"Moreover, bounds checking is pretty easy if you want it"

Once again, the programmer has to do extra work just to get something safe, because the default semantics are unsafe. If bounds checking is so easy, why not make it the default, and have unchecked access be an option for cases where speed matters? You already have at() and operator[] -- all that is needed is to switch which one of those does the bounds check.


>That sounds like the sort of memory management problem that smart pointers are supposed to solve.

Using a smart pointer would solve that specific problem, but then you're de facto mandating the use of smart pointers in every code block that an exception could be thrown through, which is every code block.

And what if mutex_wrapper is a smart pointer class? Do I now need to use a smart pointer to implement my smart pointer? Turtles all the way down? Or take operator new, which isn't an object at all so doesn't have a destructor, but still has to deallocate the memory it allocated if the constructor it calls throws, so it would need a "special" smart pointer object to use internally that only deallocates but doesn't call the destructor. It's not just calling destructors -- it's any cleanup operations because you can't do any destructive cleanup from a catch block anymore. So you end up requiring atomic RAII, and then none of the code actually implementing RAII can safely throw. It feels like just moving the problem: Now instead of not being able to throw from a destructor, you can't throw from code between resource allocation and turning on the destructor. Which is very close to saying you can't throw through a constructor.

>Really? It sounds like the second option would be what you would want: if the stack was unwound implicitly after the body of the catch block had executed, and unwinding the stack caused an exception to be thrown, then the exception was thrown out of the catch block. How is that unpleasant?

  void foo()
  {
	// destructor_always_throws: a hypothetical type whose destructor throws
	std::vector<destructor_always_throws> v;
	v.reserve(1000);
	for(size_t i = 0; i < 1000; ++i)
		v.emplace_back();
  }
Now I've got a thousand destructors that are going to throw one after the other as soon as the function returns. The nearest catch block will catch the first one, then go back to unwinding the stack where the next one throws. The next nearest catch block will catch that one and then go back to unwinding the stack again. Pretty sure I'm going to run out of catch blocks eventually, but I'd like to be able to catch all the destructor exceptions somehow and not end up with program termination, since that was supposed to be the whole idea.

>Except that a failure to initialize should at least be reported to the user.

Which it is:

  $ ./a.out
  terminate called after throwing an instance of 'std::bad_alloc'
    what():  std::bad_alloc
Granted the message could be less cryptic, but we already know how to fix it. Don't call constructors that throw from global scope. It's pretty much the same thing as saying don't throw out of main(), because you get the same thing.

>In other words, all classes should provide a non-throwing constructor and an initialization routine, because any object might be constructed in the global scope.

Not any object, just the ones that have already been used that way thoughtlessly and need to be fixed without impacting existing code using the object.

If you just need a global in new code (and you're insistent on a global), make it a pointer or smart pointer and then use operator new in main to initialize it.
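
A minimal sketch of that pattern (config is a made-up class standing in for anything whose constructor can throw):

  #include <cstdio>
  #include <memory>

  struct config {
	config() { /* may throw */ }
  };

  // null until main() runs, so no throwing constructor at static init time
  static std::unique_ptr<config> g_config;

  int main() {
	try {
		g_config.reset(new config); // any exception is catchable here
	} catch(const std::exception& e) {
		std::fprintf(stderr, "startup failed: %s\n", e.what());
		return 1;
	}
  }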

>If the application is reading one of the operands from the user, that overflow could be a security problem -- such problems are frequently reported.

All user input needs to be validated. The language doesn't matter. If you're using a language that provides arbitrary precision, the user can instead provide you with a number which is a hundred gigabytes long and will take a hundred years for your computer to multiply. "If num > threshold then reject" is going to be necessary one way or another.

>The real question is, why should the default type be the least safe, and why should programmers have to work harder to get a natural and safe abstraction?

Because it's faster. The language is old, being fast was important then, and sometimes it still is today. But given that both types are available, isn't your complaint more with the textbooks that teach people to use the faster, less safe versions rather than the slower, safer versions by default? Or do you just not like that the trade-off between runtime checking and performance is even allowed?

>You can accidentally read past the end, and you can accidentally print what you read. That can cause a lot of problems. There is no requirement that people use the standard library to iterate through a string.

There is no requirement that people don't write code that says "uid = 0" where it should say "uid == 0" either. People who write bad code write bad code. I understand that languages can and should help you avoid doing things like that, but at some point, when everybody says "don't do that" and you do it anyway, you get what you get. Most languages allow you to call C libraries from them, and if you call them wrong you get the same unsafe behavior. Does that make all those languages too unsafe to use too?


"a smart pointer would solve that specific problem, but then you're de facto mandating the use of smart pointers in every code block that an exception could be thrown through, which is every code block."

Which is already what the standard does to closures: you are basically forced to use smart pointers and capture by value if you are returning a closure from a function.
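
A minimal sketch of that pattern (make_counter is made up, but the capture-by-value idiom is the standard one):

  #include <functional>
  #include <memory>

  std::function<int()> make_counter() {
	// the shared_ptr captured by value keeps the state alive after
	// this function returns; capturing a local by reference would dangle
	auto n = std::make_shared<int>(0);
	return [n]() { return ++*n; };
  }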

"Or take operator new, which isn't an object at all so doesn't have a destructor, but still has to deallocate the memory it allocated if the constructor it calls throws"

So have the catch block set a flag and save a copy of the exception; after the catch block, check the flag, and if it is set, free the memory and rethrow the saved exception. Or just give programmers a way to explicitly unwind the stack at any point in a catch block. Or give programmers a Lisp-style "restarts" system, and create a standard restart for freeing resources, so that resources will only be freed if no recovery is possible (and so that constructor exceptions can be recovered from without having to reallocate resources).
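
That flag-and-copy workaround is expressible today with std::exception_ptr; a minimal sketch (construct_in_place is hypothetical):

  #include <exception>
  #include <new>

  void* allocate_and_construct() {
	void* mem = ::operator new(64);
	std::exception_ptr saved;
	try {
		// construct_in_place(mem); // hypothetical, may throw
	} catch(...) {
		saved = std::current_exception(); // save a copy of the exception
	}
	if(saved) {
		::operator delete(mem); // free first, rethrow after the catch block
		std::rethrow_exception(saved);
	}
	return mem;
  }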

The difference is in what errors can be handled: handling constructor exceptions well would require a bit more care if the stack were not unwound until the catch executes, but right now destructor exceptions cannot be handled at all -- the default behavior is simply program termination.

"Pretty sure I'm going to run out of catch blocks eventually, but I'd like to be able to catch all the destructor exceptions somehow"

As opposed to the current situation, where your program would terminate without ever reaching a catch block? This sounds like another case where restarts would be handy: stack unwinding could set a restart (or the vector destructor, since the objects are being destroyed there), so that one catch block could keep invoking a restart and then handle each exception until no objects remain. I can imagine cases where it would be better to ignore some errors until a particular resource is freed than to have a program quit or to allow a low-priority error to prevent that resource from being freed.

So if your point is, "Catching before the stack has been unwound necessitates a restart system," I can agree to that, especially since catching before unwinding makes restarts possible.

"Granted the message could be less cryptic,"

It could also be completely useless. What if there is no terminal? What if the user is only presented with a GUI? The default exception handler has no way to know what sort of user interface a program will have, so it has no reliable way to present errors to users, let alone to allow users to correct errors.

"we already know how to fix it. Don't call constructors that throw from global scope."

Which is basically saying that all classes need a non-throwing constructor, or that you should never have a global object (not necessarily a bad idea, but people still create global objects sometimes). A better idea, which I think we agree on, would be to give programmers a way to handle exceptions outside of main.

"All user input needs to be validated"

OK, sure. Except that people do not always validate input, which is how we wind up with bugs. Input validation adds complexity to code, and like all things that involve extra programmer work, it is likely to be forgotten or done incorrectly somewhere.

"The language doesn't matter"

Sure it does, because the language decides whether forgetting to validate some input will result in the program terminating (from an exception) or the program having a vulnerability (because it will use bad input). If there is no bounds checking on arrays, failing to validate input that is used as an array index is a vulnerability. If there is no error signalled when an integer overflows, failing to validate input that is used in integer arithmetic is a vulnerability.

I think experience has shown that it is easy for programmers to forget about validating and sanitizing input, and that it is easy for programmers to validate or sanitize input incorrectly. SQL injection attacks can be prevented by either (a) sanitizing all inputs or (b) using parameterized queries; it is hard to argue that (a) is a superior solution to (b), because there are fewer things to forget or get wrong with (b).

"the user can instead provide you with a number which is a hundred gigabytes long and will take a hundred years for your computer to multiply"

Sure, and then they can trigger a denial of service attack. And then the admin will see that something strange is happening, kill the process, and take some appropriate action, and that will be that. Denial of service attacks are a problem, sure, but it is almost always worse for a program to leak secret data or to give an untrusted user the ability to execute arbitrary commands -- especially when the user might do so without immediately alerting the system administrator to the problem (which spinning the CPU will probably do). It is also worth noting that a vulnerability that allows a remote attacker to run arbitrary code on a machine gives the attacker the ability to mount a denial of service attack (the attacker could just use their access to spin the CPU); denial of service does not imply the ability to control a machine or read (or modify) sensitive information.

"isn't your complaint more with the textbooks that teach people to use the faster, less safe versions rather than the slower, safer versions by default?"

It's not just the textbooks; it is the language that encourages this. It is harder to use arbitrary precision types than to use fixed-width types in C++, because the default numeric types are all fixed width. That, in a nutshell, is the problem: C++ encourages programmers to write unsafe code by making safe code much harder to write. It is not just about numeric types: bounds checking is harder than unchecked array access, it is easier to use a primitive pointer type, it is easier to use a C-style cast, etc.

It would not have been hard to say that "int" is arbitrary precision, and to force programmers to use things like "uint64_t" if they want fixed-width (and perhaps to have a c_int type for programmers who need to deal with C functions that return the C "int" type). It would result in slower code for programmers who were not paying attention, but that is usually going to be better than unsafe code (can you name a case where speed is more important than correctness or safety?). Even something as simple as ensuring that any integer overflow causes an exception to be thrown unless the programmer explicitly disables that check (e.g. introduce an unsafe_arithmetic{} block to the language) would go a long way without forcing anyone to sacrifice speed.

"People who write bad code write bad code"

It's not just bad code; people can forget things when they are on a tight deadline. It happens, and languages should be designed with that in mind.

"at some point, when everybody says "don't do that" and you do it anyway, you get what you get"

I think there is a lesson from C++98 that is relevant here. Everyone says not to use C-style casts, and to use static_cast, dynamic_cast, or const_cast instead (or reinterpret_cast if for some reason it makes sense to do so), yet you still see people using C-style casts. It is just less difficult to use a C-style cast: less typing, less mental effort (there is no need to choose the correct cast), fewer confusing messages from the compiler, etc. Likewise, people continue to use operator[] in lieu of at()/iterators, because it is less effort.
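
For comparison (a minimal sketch):

  int main() {
	double d = 3.9;
	int a = (int)d;              // C-style: terse, compiles almost anything
	int b = static_cast<int>(d); // names the conversion you actually meant
	const int c = 0;
	// (int&)c = 1;              // a C-style cast silently casts away const
	(void)a; (void)b; (void)c;
  }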

Blaming programmers for writing bad code when that is the easiest thing for them to do is the wrong approach (but unfortunately, it is an approach that seems common among C and C++ programmers). The right approach is to make writing bad code harder, and to make writing good code easier.

"Most languages allow you to call C libraries from them, and if you call them wrong you get the same unsafe behavior. Does that make all those languages too unsafe to use too?"

No, because most languages with an FFI do not require you to use the FFI, nor do they encourage you to do so. The two FFIs I am most familiar with are JNI and SBCL's FFI, and both of those require some effort to use at all. One cannot carelessly invoke an FFI in most languages; usually, a programmer must be very explicit about using the FFI, and it is often the case that special functions are needed to deal with unsafe C types. You could be a fairly successful Java programmer without ever touching the FFI; likewise with Lisp, Haskell, Python, and other high-level languages.

I am actually a big fan of languages with FFIs, because sometimes high-level programs must do low-level things, but most of the time a high-level program is only doing high-level things. FFIs help to isolate low-level code, making it easier to debug low-level problems and allowing programmers to work on high-level logic without having to worry about low-level issues.

It is worth noting that there is nothing special about C. You can take high-level languages and retool their FFIs for some other low-level language, and the FFIs would be just as useful. You usually see C because most OSes expose a C API, and so low-level code for those systems is usually written in C. Were you to use Haskell on a system that exposes a SPARK API, you would want an FFI for SPARK, and your FFI would be less of a liability (since SPARK is a much safer language than C). So no, I do not think you can argue that having an FFI that allows calling code written in an unsafe language makes a high-level language unsafe; if a high-level program is unsafe because of its use of C via an FFI, the problem is still C (and the problem is still solved by either not using C or by only using a well-defined subset of C).


>Which is already what the standard does to closures: you are basically forced to use smart pointers and capture by value if you are returning a closure from a function.

Well, you have to make sure somehow that the thing a pointer is pointing to will still be there when the closure gets executed, sure. But the change you're asking for would have a much wider impact. Anywhere you have a dynamically allocated pointer, or really anything that needs destruction whatsoever, without an already-associated destructor would become unsafe for exceptions. Which is commonly the case in constructors. It's basically this pattern (which is extremely common) that would become prohibited:

  class foo
  {
  public:
	foo() {
		do_X();
		try {
			may_throw_exception();
		} catch(...) {
			undo_X();
			throw;
		}
	}
	~foo() { undo_X(); }
  };
What you would need to do is exclude exceptions from passing through any constructor that has an associated destructor, because the destructor wouldn't be called if the constructor throws and the catch block couldn't safely destroy the resources before the stack is unwound.

>So have the catch block set a flag and save a copy of the exception; after the catch block, check the flag, and if it is set, free the memory and rethrow the saved exception.

Obviously it can be worked around, but can you see how quickly it becomes a headache? And now you're adding code and complexity to operator new, which is a good candidate for the most frequently called function in any given program.

>Or just give programmers a way to explicitly unwind the stack at any point in a catch block.

In other words, the existing functionality is good and necessary for some circumstances, but you want something different in addition to it.

It seems like you're looking for something like this:

  void foo::bar()
  {
	connect_network_drives();
	try {
		do_some_stuff();
	} catch(file_write_exception) {
		check_and_reconnect_network_drives();
		resume;
	}  catch(fatal_file_write_exception) {
		// epic fail, maybe network is dead
		// handle serious error, maybe terminate etc.
	}
  }
  void do_some_stuff()
  {
	// …
	file.write(stuff);
	if(file.write_failed()) {
		throw file_write_exception();
		on resume {
			file.write_stuff(stuff); // try again
			// no resume this time if error not fixed
			if(file.write_failed())
				throw fatal_file_write_exception(); 
		}
	}
  }
But if that's what you want, why do you need special language support, instead of just doing something like this?

  // (this class could be a template for multiple different kinds of errors)
  class file_write_error
  {
  private:
	static thread_local std::vector< std::function<void()> > handlers;
  public:
	file_write_error(std::function<void()> handler) {
		handlers.push_back(std::move(handler));
	}
	~file_write_error() {
		handlers.pop_back();
	}
	static void occurred() {
		// execute most recently registered handler
		if(!handlers.empty())
			handlers.back()();
	}
  };
  // (the static member still needs an out-of-class definition:
  // thread_local std::vector< std::function<void()> > file_write_error::handlers;)
  // (and then poison operator new for file_write_error
  // so it can only be allocated on the stack
  // and destructors run in reverse construction order)
  void foo::bar()
  {
	connect_network_drives();
	try {
		file_write_error handler([this]() {
			check_and_reconnect_network_drives();
		});
		do_some_stuff();
	} catch(fatal_file_write_exception) {
		// epic fail, maybe network is dead
		// handle serious error, maybe terminate etc.
	}
  }
  void do_some_stuff()
  {
	// …
	file.write(stuff);
	if(file.write_failed()) {
		file_write_error::occurred(); // handle error
		file.write_stuff(stuff); // try again
		if(file.write_failed()) 
			throw fatal_file_write_exception();
	}
  }
It seems like you're just looking for an error callback that gets called to try to fix a problem before throwing a fatal stack-unwinding exception is necessary. And it's not a bad idea; maybe more people should do that. But doesn't the language already provide what is necessary to accomplish that? Are we just arguing about syntax?

You could easily use that sort of error handler in a destructor as an alternative to exceptions as it is now. There is nothing the error handler can't do that a catch block could do before stack unwinding. And if you call such a thing and it fails, the two alternatives of either ignoring the error or terminating the program are all you really have left anyway, because if there was anything else to do then you could have done it in the destructor or in the error handler. (I can imagine that an error handler may benefit from being able to call the next one up in the hierarchy if any, analogous to 'throw' from a catch block, but that could be accomplished with minimal changes to the above.)


"In other words, the existing functionality is good and necessary for some circumstances, but you want something different in addition to it."

The problem with the existing approach is that the safety of throwing an exception from a destructor depends on the context in which the destructor is invoked, and there is no workaround for that. What I am proposing would ensure that the safety of throwing an exception does not depend on why the throwing function was called; edge cases where this might be unsafe could either be worked around (perhaps in a headache-inducing way, but nothing close to the headaches associated with destructors having no way to signal errors), or a language feature could be added to solve those problems.

"It seems like you're just looking for an error callback that gets called to try to fix a problem before throwing a fatal stack-unwinding exception is necessary. And it's not a bad idea, maybe more people should do that. But doesn't the language already provides what is necessary to accomplish that? Are we just arguing about syntax?"

Well, if we are arguing about programming languages, what is wrong with arguing about syntax? The language does provide enough features to manually implement restarts -- but if that is your argument, why do you even bother with C++? C gives you everything you need to implement any C++ feature; assembly language gives you all you need to implement any C feature. We use high-level languages because our productivity is greatly enhanced by having things automated.

Take exception handling itself as an example. We do not actually need the compiler to set it up for us -- we already have setjmp/longjmp, which are enough to manually create an exception handling system. The problem is that the programmer would be responsible for setting up everything related to exceptions -- stack unwinding, catching exceptions, etc. Nobody complains about exceptions being a language feature instead of something programmers implement by hand -- so why not add another useful language feature?
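
A minimal sketch of that hand-rolled approach (note that longjmp skips destructors entirely, which is exactly the bookkeeping the language feature automates):

  #include <csetjmp>
  #include <cstdio>

  static std::jmp_buf handler; // one hand-rolled "catch" site

  void may_fail(bool fail) {
	if(fail)
		std::longjmp(handler, 1); // "throw": jump back to the setjmp site
  }

  int main() {
	if(setjmp(handler) == 0) {
		may_fail(true);
		std::puts("not reached");
	} else {
		std::puts("caught"); // no destructors ran on the way here
	}
  }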

Rather than callbacks, what you really want for restarts is continuations. One way a Lisp compiler might implement the Lisp "conditions" system would be something like this: convert the program to continuation passing style; each function takes two continuations, one that returns from the function (normal return) and one that is invoked when an exception is raised. Each function passes an exception handler continuation to the functions it calls; when a function declares an exception handler, it modifies the exception handler continuation to include its handler (the continuation would need to distinguish between exception types; this can be done any number of ways), which would include the previous exception handler continuation (so that exceptions can be propagated to higher levels). Exception handler continuations will take two continuations: one to unwind the stack (which is used to "return" from the handler), and the restarts continuation that is used for invoking restarts. When an exception is thrown, the thrower passes a restart continuation (or a continuation that throws some exception if no restarts are available), and then the handler continuation will either invoke the appropriate handler or it will invoke the handler continuation from the next higher level.

Complicated? Yes, and what I described above is just the simplified version. It should, however, be done automatically by the compiler, and the programmer should not even be aware that those extra continuation arguments are being inserted. The stack unwinding continuation could potentially be exposed to the programmer, and for convenience it could be set up to take a continuation as an argument -- either the rest of the handler, or else the "return from handler" continuation that exits the handler, so that the programmer could perform some error handling after the stack is unwound (e.g. the clean-up code), although this could potentially be accomplished using restarts (but that might be less "pretty").

Perhaps continuations should be suggested for C++14; they are an almost logical follow-up to the introduction of closures in C++11.



