Pure vision, currently. The last we heard about LiDAR was the problem of combining LiDAR data with vision in the NN. Which sensor should take priority in the NN? For example, LiDAR might recognise the shape of a sign, but vision will know whether it's a stop sign or not.
I'd be super interested to see if someone has successfully combined RGB data with LiDAR data.
This is basic university robotics. Sensor fusion! There are a whole host of techniques for dynamically updating confidence between two or more sensors estimating the same values. The Kalman filter is the standard approach (which, for reference, was used in the Apollo missions 50 years ago, and has been developed a lot since then) — see the toy sketch below.
And yes, it is commonly applied with vision models. There are a host of combined RGB/LiDAR, structured-light, and depth-camera setups in the labs students are working in at my local university, and there have been for at least six years.
For reference, I, a computer science undergrad, learned this kind of sensor fusion theory in an elective class that was just an excuse for a soon-to-retire professor to play with Lego robots.
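To make the "dynamically updating confidence" point concrete, here's a minimal toy sketch of Kalman-style fusion of a camera depth estimate and a LiDAR return measuring the same distance. The noise variances and constant-distance setup are made-up illustration values, not anything from a real autonomy stack:

```python
import numpy as np

# Toy 1D Kalman filter fusing two noisy range sensors (a camera depth
# estimate and a LiDAR return) into one distance estimate. Noise values
# are invented for illustration; real systems tune these from sensor
# datasheets or calibration.

x = 0.0          # state estimate: distance to an object (metres)
P = 1e3          # estimate variance (start very uncertain)
Q = 0.01         # process noise: how much the true distance drifts per step

R_CAMERA = 4.0   # camera depth is noisy (variance, m^2) -- assumed value
R_LIDAR = 0.05   # LiDAR is much more precise -- assumed value

def update(x, P, z, R):
    """Standard scalar Kalman measurement update."""
    K = P / (P + R)          # Kalman gain: how much to trust this sensor
    x = x + K * (z - x)      # blend the measurement into the estimate
    P = (1 - K) * P          # uncertainty shrinks after every update
    return x, P

rng = np.random.default_rng(0)
true_distance = 20.0

for step in range(50):
    # Predict: the true distance may drift a little between measurements
    P += Q

    # Each sensor reports the same quantity with its own noise
    z_cam = true_distance + rng.normal(0, np.sqrt(R_CAMERA))
    z_lidar = true_distance + rng.normal(0, np.sqrt(R_LIDAR))

    # Fuse both via sequential updates, each weighted by its own variance
    x, P = update(x, P, z_cam, R_CAMERA)
    x, P = update(x, P, z_lidar, R_LIDAR)

print(f"fused estimate: {x:.3f} m (true {true_distance} m), variance {P:.4f}")
```

Note the filter never "chooses" a sensor: each measurement is weighted by its variance, so the precise LiDAR return dominates the fused estimate without the camera data being thrown away.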
You can know more about sensor fusion than Elon does by reading a literal blog post.
The NN shouldn't have to "choose" one or the other. It's a classic "Not even wrong" question!