Protecting against bad actors, or assuming model outputs can always be filtered or policed, isn't always going to be possible. Self-driving cars and autonomous robots are a case in point. How do you harden a pedestrian or cyclist against the possibility of being hit by a driverless car? And when real-time control is called for, how much filtering can you do (and how much use would it be anyway, when the filter is likely less capable than the system it is meant to be policing)?
The latest v12 of Tesla's self-driving is now apparently using neural nets for driving the car (i.e. decision making), which had been hard-coded C++ up to v11, as well as for the vision system. Presumably the nets have been trained to make life-or-death decisions based on Tesla/human values we are not privy to (given choice of driving into large tree, or cyclist, or group of school kids, which do you do?), which is a problem in and of itself, but who knows how the resulting system will behave in situations it was not trained on.
> given choice of driving into large tree, or cyclist, or group of school kids, which do you do?
None of the above. Keep the wheel straight for maximum traction and brake as hard as possible. Fancy last-second maneuvering just wastes traction you could have spent braking.
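The traction argument can be made concrete with the usual "friction circle" idealization, where the combined braking and cornering acceleration cannot exceed mu * g. The sketch below is only that simplified model; the friction coefficient, speed, and function names are illustrative assumptions, not anything taken from a real vehicle or from Tesla's software.

```python
import math

G = 9.81  # gravitational acceleration, m/s^2


def max_braking_decel(mu, lateral_accel):
    """Braking deceleration left over once some grip is spent on steering.

    Assumes an idealized friction circle:
        braking^2 + lateral^2 <= (mu * G)^2
    """
    limit = mu * G
    if abs(lateral_accel) >= limit:
        return 0.0  # all available grip is already used for cornering
    return math.sqrt(limit ** 2 - lateral_accel ** 2)


def stopping_distance(speed, decel):
    """Distance to stop from `speed` (m/s) at constant `decel` (m/s^2)."""
    return speed ** 2 / (2.0 * decel)


if __name__ == "__main__":
    mu = 0.9      # assumed dry-asphalt friction coefficient
    speed = 20.0  # m/s, roughly 72 km/h
    for lateral in (0.0, 4.0, 7.0):  # m/s^2 of grip spent on swerving
        decel = max_braking_decel(mu, lateral)
        dist = stopping_distance(speed, decel)
        print(f"lateral={lateral:.1f} m/s^2 -> braking={decel:.2f} m/s^2, "
              f"stop in {dist:.1f} m")
```

Under this model, any grip spent on swerving lengthens the straight-line stopping distance, which is the trade-off described above. It says nothing about cases where steering would miss the obstacle entirely.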
Well, who knows how they've chosen to train it, or what the failure modes of that training are ...
If there are no good choices as to what to hit, then hard braking does seem to be a good idea in general (although there may be exceptions). At the same time, a human is likely to also try to steer: I think most people would, perhaps subconsciously, steer to avoid a human even if that meant hitting a tree, but would probably do the opposite if it was, say, a deer.