Apparently they have been surprised at how few photons these sensors need in order to see. They are skipping the traditional computer-vision step on rendered images and going from photons to car control in as few layers as possible.
It's not an event camera, so it's very much taking images, which are then being processed by computer vision algorithms.
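To make that distinction concrete: a frame sensor delivers dense images at a fixed rate, while an event camera emits a sparse, asynchronous stream of per-pixel brightness-change events. A rough sketch of the two output shapes (types and numbers are illustrative, not any particular sensor's format):

```python
from dataclasses import dataclass
from typing import List
import numpy as np

# Frame camera: a dense H x W intensity image at a fixed rate (e.g. 30 fps).
frame = np.zeros((720, 1280), dtype=np.uint16)  # every pixel sampled each exposure

# Event camera: asynchronous per-pixel events, emitted only where brightness changes.
@dataclass
class Event:
    x: int          # pixel column
    y: int          # pixel row
    t_us: int       # timestamp in microseconds
    polarity: int   # +1 brightness increase, -1 decrease

events: List[Event] = [
    Event(x=512, y=300, t_us=1_000_023, polarity=+1),
    Event(x=513, y=300, t_us=1_000_041, polarity=-1),
]
```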
Event cameras seem more viable than CMOS sensors for autonomous-vehicle applications in the absence of LIDAR. CMOS dynamic range and response aren't as good as the human eye's. LIDAR+CMOS is considerably better in many ways, not least because LIDAR measures depth directly instead of inferring it.
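For what it's worth, sensor dynamic range is usually quoted in dB while eyes and photography use stops, so a quick conversion helps when comparing claims. The figures in the comments below the function are ballpark assumptions for illustration, not measured specs:

```python
import math

def db_to_stops(db: float) -> float:
    """Convert a dynamic-range figure in decibels to photographic stops.

    20 dB is one decade (10x) of intensity; one stop is a doubling (2x),
    i.e. roughly 6.02 dB per stop.
    """
    return db / (20 * math.log10(2))

# Ballpark figures (assumptions, for illustration only):
print(db_to_stops(70))    # ~11.6 stops: a basic non-HDR CMOS sensor
print(db_to_stops(120))   # ~19.9 stops: a modern automotive HDR sensor
# The human eye is often credited with ~10-14 stops at a fixed adaptation
# level, and 20+ stops once slow adaptation is allowed.
```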
Next time you’re facing blinding direct sunlight, pull out your iPhone and take a picture or video. It handles the scene easily, and it has to do far more post-processing to produce a compelling JPEG/HEIC for humans. Tesla can just dump the sensor data from short and long exposures straight into the neural net.
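If you take "dump short and long exposures straight into the neural net" literally, the simplest version is stacking the raw exposures as input channels and letting the first conv layers learn the fusion instead of tone-mapping to a display image first. A minimal PyTorch sketch, with layer sizes and normalization purely my own assumptions rather than anything Tesla has described:

```python
import torch
import torch.nn as nn

class MultiExposureFrontend(nn.Module):
    """Toy front-end: fuse short/long raw exposures with learned convolutions
    rather than tone-mapping them into a display JPEG first."""
    def __init__(self, n_exposures: int = 2):
        super().__init__()
        self.fuse = nn.Sequential(
            nn.Conv2d(n_exposures, 16, kernel_size=3, padding=1),
            nn.ReLU(),
            nn.Conv2d(16, 32, kernel_size=3, stride=2, padding=1),
            nn.ReLU(),
        )

    def forward(self, exposures: torch.Tensor) -> torch.Tensor:
        # exposures: (batch, n_exposures, H, W) raw linear intensities in [0, 1];
        # no demosaicing or tone-mapping assumed here.
        return self.fuse(exposures)

# Usage: a short and a long exposure of the same scene, stacked as channels.
short_exp = torch.rand(1, 1, 128, 256)
long_exp = torch.rand(1, 1, 128, 256)
features = MultiExposureFrontend()(torch.cat([short_exp, long_exp], dim=1))
print(features.shape)  # torch.Size([1, 32, 64, 128])
```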
Humans can also decide they want a better look at something and move their head (or block out the sun with a hand), which cameras generally can't do.