The similarity is in the problem to be solved, not the details of the compute pipeline. Depth must be inferred somehow, rather than measured by actively interacting with the surface, as it is in LiDAR.
If you really wanted to make this argument, you wouldn't even bother inferring depth, since that's not what humans do, at least not directly. If you're actually trying to obtain a depth map as part of your pipeline, LiDAR (or LiDAR + vision feeding into a denser depth prediction model) would always be the better strategy, cost aside, since determining depth from images is an ill-posed problem.
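To make the "ill-posed" point concrete, here's a minimal sketch (my own illustration, not from the thread) of the classic scale ambiguity: under an idealized pinhole camera, scaling an entire scene by any factor produces exactly the same image, so no monocular image can distinguish the two scenes without extra assumptions.

```python
def project(point, f=1.0):
    """Pinhole projection of a 3D point (X, Y, Z) to image coords (u, v)."""
    x, y, z = point
    return (f * x / z, f * y / z)

# A toy "scene" of two 3D points in camera coordinates.
scene = [(1.0, 2.0, 5.0), (-0.5, 0.25, 2.0)]

# Scale the whole scene by k. A power-of-two k keeps the floating-point
# arithmetic exact, so strict equality below is safe.
k = 4.0
scaled = [(k * x, k * y, k * z) for (x, y, z) in scene]

# Every point projects to the same pixel despite having 4x the depth:
for p, q in zip(scene, scaled):
    assert project(p) == project(q)
```

This is why purely monocular pipelines lean on learned priors (object sizes, ground planes) to pin down scale, while LiDAR measures range directly.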
My claim is that humans use their eyes as a primary input for driving. I don't think it's controversial. We don't let eyeless people drive. Eyes do not shoot out lasers.