A fair observation. In the case of handling HTML input, I'd suggest the problem lies not with individual developers but with the industry. If the industry cared about making its software products as robust as possible, such a relaxed format should never have been allowed to exist.
I can't think of any avionics application that might handle HTML, but avionics systems do have their own formats to deal with. ARINC 661, for example, is a cockpit display standard that includes an XML format for describing graphical display elements:
http://en.wikipedia.org/wiki/ARINC_661
Of course, all uses of ARINC 661 data are thoroughly tested. I'm not sure I would go so far as to describe it as a "toy problem", but it certainly does intentionally limit the problem domain to exactly what needs to be dealt with. Malformed ARINC 661 data would simply be discarded rather than displayed on a best-effort basis, because a best-effort rendering would be unacceptable; the fault would lie with whoever sent the malformed data.
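To make the "discard, don't guess" idea concrete, here is a rough sketch of a strict parser that rejects a whole message at the first deviation. It is only an illustration: the `Layer`, `Label` and `Gauge` names, their attributes, and the `parse_strict` function are all made up for this example and are not taken from ARINC 661.

```python
import xml.etree.ElementTree as ET

# Hypothetical schema for illustration only; these are NOT real ARINC 661 names.
# Each allowed widget maps to the exact set of attributes it must carry.
ALLOWED_WIDGETS = {
    "Label": {"id", "text"},
    "Gauge": {"id", "min", "max"},
}


def parse_strict(payload):
    """Return a list of (tag, attributes) pairs, or None if anything is off."""
    try:
        root = ET.fromstring(payload)
    except ET.ParseError:
        return None  # not well-formed XML: discard, do not attempt a repair

    if root.tag != "Layer":
        return None  # unexpected root element: discard

    widgets = []
    for child in root:
        required = ALLOWED_WIDGETS.get(child.tag)
        # Unknown widget, or missing/extra attributes: discard the whole message.
        if required is None or set(child.attrib) != required:
            return None
        widgets.append((child.tag, dict(child.attrib)))
    return widgets
```

The contrast with a typical HTML parser is that nothing here tries to recover: an unclosed tag, an unknown element, or a missing attribute all lead to the same outcome, and the sender is the one who has to fix it.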
Anyway, you're quite right: without a precise and unambiguous format definition, you can only go so far down the path of robustness.