The reasoning I've heard for this argument goes like this:

People drive reasonably well using vision primarily and with imperfect visibility of their environment.

Computer learning networks can classify imagery at least as accurately as humans and sometimes more so.

A computer using imagery that is well classified from an array of visual sensors with near perfect visibility should be able to drive as well, or better, than a human driver.

The execution strategy appears to be to run classification and command prediction all the time, and while the human is in control consider it supervised learning.

The argument against LIDAR is just this in reverse, humans don't need LIDAR to drive, why should computers?

LIDAR is an engineering solution to the problem of creating a representation of the 3D space around the vehicle. It is a stand in for the less well understood human ability to do the same just by looking around. As a result if the "looking around" solution being proposed by NVidia and Tesla meets the engineering requirement, I don't see any reason that the car should have LIDAR.




Driving has almost nothing to do with image classification. *

Humans implicitly perform SLAM (simultaneous localization and mapping). What do I mean? Look around your room. Close your eyes. Visualize the room. As a human, you've built a rough 3D model of the room. And if you keep your eyes open and walk through the room, that map is pretty fine-grained/detailed too, and humans can keep track of where they are in the map.

The state of the art in visual SLAM (visual SLAM = SLAM from just images, nothing else) is not deep learning. It's actually still linear-algebra/geometric/keyframe-based traditional computer vision (including variants that incorporate GPS/accelerometer info). There are all sorts of limitations, but the biggest is that the current algos don't work when the environment is moving (!!!).
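
For the curious, a minimal sketch of the classical (non-deep-learning) machinery involved: two-frame pose recovery with OpenCV. The camera intrinsics here are made up, and a real SLAM system adds keyframes, mapping, and loop closure on top - and still assumes a mostly static scene, which is exactly the limitation above.

    import cv2
    import numpy as np

    K = np.array([[700.0, 0.0, 320.0],
                  [0.0, 700.0, 240.0],
                  [0.0, 0.0, 1.0]])  # assumed camera intrinsics

    def relative_pose(img1, img2):
        # Detect and match ORB features between two grayscale frames
        orb = cv2.ORB_create(2000)
        kp1, des1 = orb.detectAndCompute(img1, None)
        kp2, des2 = orb.detectAndCompute(img2, None)
        matches = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True).match(des1, des2)
        pts1 = np.float32([kp1[m.queryIdx].pt for m in matches])
        pts2 = np.float32([kp2[m.trainIdx].pt for m in matches])
        # Robustly estimate the essential matrix, then recover camera motion
        E, mask = cv2.findEssentialMat(pts1, pts2, K, method=cv2.RANSAC)
        _, R, t, _ = cv2.recoverPose(E, pts1, pts2, K, mask=mask)
        return R, t  # rotation and unit-scale translation between frames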

SLAM from LIDAR is solved. That's why people use LIDAR.

You might argue that perfect SLAM is overkill for driving. And I agree. Humans rely on being able to do lots of things that are "theoretically overkill" for any given task - and maybe that's exactly why, so far, humans can drive and computers can't.

* It bears noting that even in domains like image segmentation, humans still do better than neural nets. (Segmentation groups the pixels in an image into different categories - this is still a caricature of "vision," but far more representative of real vision than simply giving a global label to an image.)


Let me make sure I get this:

When a Tesla (or other non-LIDAR) vehicle is driving, it is not continuously building a 3D model of its environment. Instead, it is matching patterns on the road, and "understanding" based on what it sees in an otherwise flat image.

Whereas LIDAR vehicles use the LIDAR technology to develop a map of the world around them, for additional understanding?


It's unclear if Tesla is building a 3D model or not. For the cameras, at least looking at Mobileye's demos, it appears not. But Tesla is also using radar - it's unclear whether they're using the radar for just collision detection, or for building a full 3D map.

LIDAR gives you a 3D map of the surroundings, yes.


No. They're both building a map and trying to place themselves in it. Using LIDAR makes this much easier to do.


It's not clear whether Tesla has a local map. They've never shown pictures of one, unlike Google. The Mobileye unit definitely does not; you can buy a Mobileye aftermarket unit and watch it put rectangles around things that look like cars and people. Tesla's radar probably just returns targets and range, like most first-generation automotive radars. Tesla's system may be entirely reactive.

Google builds maps; they do path planning, so they have to.


> People drive reasonably well using vision primarily

This is not accurate, however. Other important senses in use include proprioception, hearing, and tactile feedback from the wheels. In addition to vision and the improved dynamic range of the eyes, there is the important fact that human vision integrates a world model into expectations. Human vision also models time and motion, which helps manage where to focus attention. Humans can additionally predict other agents and other things about the world based on intuitive physics. This is why they can get by without the huge array of sensors and cars cannot. Humans make up for the lack of sensors by being able to use the poor-quality data more effectively.

To put this in perspective, an estimated 8.75 megabits per second pass through the human retina, but only on the order of 100 bits are estimated to reach conscious attention.

> Computer learning networks can classify imagery at least as accurately as humans and sometimes more so.

This is true but only in a limited sense. For example, when I feed the image on the right (of a car in a swimming pool) from http://icml.cc/2015/invited/LeonBottouICML2015.pdf#page=58 (a deck worth reading; the accompanying talk is worth finding too) into ResNet, I get as top results:

0.2947; screen, CRT screen

golfcart, golf cart

boathouse

amphibian, amphibious vehicle

For LeNet it's:

0.5422; amphibian, amphibious vehicle

jeep, landrover

wreck

speedboat
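
(For anyone who wants to poke at this themselves, a minimal sketch using Keras' bundled pretrained ResNet50; the filename is a hypothetical stand-in for whatever image you want to test:)

    import numpy as np
    from keras.applications.resnet50 import ResNet50, preprocess_input, decode_predictions
    from keras.preprocessing import image

    model = ResNet50(weights="imagenet")

    img = image.load_img("car_in_pool.jpg", target_size=(224, 224))  # hypothetical file
    x = preprocess_input(np.expand_dims(image.img_to_array(img), axis=0))

    # Print the top-4 ImageNet labels with their probabilities
    for _, label, prob in decode_predictions(model.predict(x), top=4)[0]:
        print("%.4f  %s" % (prob, label))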

The key difference is that learning in animals occurs by breaking things down in terms of modular concepts, so even when things are not recognized, new things can be labeled as a composition of smaller nearby concepts. Machines cannot yet do this well at all, and certainly not as flexibly. Things such as lighting and shading do not move animals as much in the concept space.

> The execution strategy appears to be to run classification and command prediction all the time, and while the human is in control consider it supervised learning.

This strategy will not learn from accidents, because the supervisory signal there will usually be far from optimal.


I commend you. Very few people actually try feeding images into these things to show how bad they are.

For a real shock at how poor performance is, try feeding frames of video in. (Video is different because it generally doesn't have carefully framed, well-exposed, in-focus content.)

Edit: examples of ResNet applied to video by a colleague http://blog.piekniewski.info/2016/08/12/how-close-are-we-to-...


Very informative quote:

So where do the reports of superhuman abilities come from? Well since there are many breeds of dogs in the ImageNet, an average human (like me) will not be able to distinguish half of them (say Staffordshire bullterrier from Irish terrier or English foxhound - yes there are real categories in ImageNet, believe it or not). The network which was "trained to death" on this dataset will obviously be better at that aspect. In all practical aspects an average human (even a child) is orders of magnitude better at understanding/describing scenes than the best deep nets (as of late 2015) trained on ImageNet.


>> This is not accurate, however. Other important senses in use include proprioception, hearing, and tactile feedback from the wheels. In addition to vision and the improved dynamic range of the eyes, there is the important fact that human vision integrates a world model into expectations. Human vision also models time and motion, which helps manage where to focus attention. Humans can additionally predict other agents and other things about the world based on intuitive physics. This is why they can get by without the huge array of sensors and cars cannot. Humans make up for the lack of sensors by being able to use the poor-quality data more effectively.

Yes, but all those things you listed are very imperfect and can increase the risk of mistakes. For example, someone two lanes over honking their horn can cause a momentary distraction for you ("are they honking at me?"), causing you to not notice a cyclist cutting in front of you. And so can a dancing clown on the sidewalk.


Very much so and that's why we don't want our cars to perfectly replicate humans. More sensors plus a limited and narrow AI is better than just vision and a smarter AI.


> This is true but only in a limited sense.

Isn't what we need very limited anyway? What the cars need is recognition of obstacles and the type of the obstacle in a very limited range. Basically when something is on the road, it doesn't matter whether it's a moose or a deer - you slow down and avoid, or brake depending on the environment.

"what is in this picture" classifiers don't seem like a good algorithm to use in that case. Object detection / feature extraction seems to be much closer.


Eh. Is it clouds or a truck, or a shadow on the road or a moose? A tree beside the road or a cyclist? Hard to tell without really good classifiers.

I guess that with a lidar or some kind of 3d perception, you can relax the demands on the classifier a bit and ignore really flat things and the sky...


Object detection is considered a more difficult problem than image classification in the vision community. So I'm not sure what the point is there.


> The key difference is that learning in animals occurs by breaking things down in terms of modular concepts, so even when things are not recognized, new things can be labeled as a composition of smaller nearby concepts. Machines cannot yet do this well at all, and certainly not as flexibly.

Actually, that's pretty much what deep learning is doing. For instance: https://papers.nips.cc/paper/5027-zero-shot-learning-through...

That paper was from a few years ago, I think the state of the art is better now, but it's trying to do exactly what you're talking about. More broadly, what you're talking about falls under the umbrella of transfer learning (that is, a model's ability to learn helpful information about task Y by training on related task X, preferably by learning and sharing useful features.)
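
(Roughly, the trick in that paper is to map images into a word-embedding space and label them with the nearest word vector, which lets the model name classes it has no training images for. A toy sketch with made-up 2-d "embeddings":)

    import numpy as np

    word_vecs = {                       # stand-ins for real word embeddings
        "cat":   np.array([0.9, 0.1]),
        "lion":  np.array([0.8, 0.3]),  # no training images of lions needed
        "truck": np.array([0.1, 0.9]),
    }

    def zero_shot_label(img_embedding):
        # img_embedding would come from a net trained to map images onto word vectors
        return min(word_vecs, key=lambda w: np.linalg.norm(word_vecs[w] - img_embedding))

    print(zero_shot_label(np.array([0.85, 0.25])))  # -> "lion"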


I'm talking about learning modularly. Children can categorize things they've never seen before by inventing labels on the spot; they're not limited to selecting from a preexisting set (e.g. a lion is a "big cat"). They can recognize novelty and ask questions if nothing they know quite fits.

As a human, you are able to learn the general concept of a leg and understand it, even in a context you have never seen before, applied to an object you've never seen or never seen used in that way before. Everything you learn is also learned as part of a set of relations, with each individual concept modified in a precise manner as you learn something new about any of them. A big part of human intelligence, from the simple naming of things to the highest levels of science, is taking parts of things you know and putting them together in novel ways.

In neural nets, this: https://arxiv.org/abs/1511.02799 is in line with what I mean.


Well, not with traditional feedforward networks (LeNet, etc.). You can't run the classifier and find tires, then wheels, and then a car; but you do get composition of features.


> not with traditional feedforward networks (LeNet, etc.)

I'd argue they are implicitly doing this.

> You can't run the classifier and find tires, then wheels, and then a car;

Why can't you run a classifier for tires, one for wheels, one for cars, then combine their outputs for a final classifier, maybe based on a decision tree? You can train all the networks at the same time, and it will give you probability distributions for all 4 outputs (tires, wheels, cars, blended). What am I missing?
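
(To make the idea concrete, a toy sketch with random numbers standing in for the three trained part-classifiers; everything here is hypothetical:)

    import numpy as np
    from sklearn.tree import DecisionTreeClassifier

    rng = np.random.RandomState(0)
    # Stand-ins for the three part-classifiers' output probabilities
    p_tire, p_wheel, p_car = rng.rand(3, 200)

    X = np.column_stack([p_tire, p_wheel, p_car])
    y = (0.5 * p_tire + 0.3 * p_wheel + 0.2 * p_car > 0.5).astype(int)  # toy labels

    # The "blended" fourth output: a decision tree over the three probabilities
    blender = DecisionTreeClassifier(max_depth=3).fit(X, y)
    print(blender.predict_proba(X[:5]))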


That would just be your opinion. It has not been shown. It's still an open research question what neural nets are actually learning in their intermediate layers.

You're going to need large amounts of fine-grained labeled data for each category. You've also just manually determined some sort of (brittle) object ontology. What if there are only 3 tires? What if there are four tires on the road but no car? All sorts of edge cases, and all you've done is train a classifier for cars, not actually solved driving in any meaningful way.


Doesn't scale. You don't have N brains to compose every representation.


Agreed -- If you get away from traditional feedforward networks by adding recurrence throughout, then at least there is some chance of learning scale-free features and compositionality.


Why not shoot for better-than-human performance and "cheat" any way possible along the way? To paraphrase a quote I can't remember by whom: do we care if a submarine "swims"?

Besides, even with lidar, the problem is hard enough.


I agree; being better than human is a sales point that I expect to see in brochures. One of the ways I expect that to play out is self-driving transport cars for high-value targets like world leaders and drug lords. "This car will respond faster, and more accurately, to get you to safety before a human driver even knew there was a problem."

That said, John stated that without LIDAR you couldn't adequately meet the environmental challenges and achieve good self driving.

Specifically "I'm still not happy with self-driving on vision alone ... There are too many hard cases for vision." which boils down to a disbelief on the imaging processing side of the pipeline where NVidia has been attacking using GPU type architectures to extract image information rather than generate it.

One way to evaluate how far the image processing pipeline has come is to look at research on how well it can classify images. And in that space, in the research, it is doing better than humans [1]. As I've said elsewhere I think LIDAR was a crutch that worked well to cover for weaknesses in classifying images, but I recognize that the crutch may no longer be needed (certainly Tesla and Nvidia are trying to make that case).

[1] http://www.eetimes.com/document.asp?doc_id=1325712


There is only a little evidence that current image classification models outperform humans. The 5.1% number is just the number from one grad student who went through some ImageNet images himself - i.e., a single blog post (http://karpathy.github.io/2014/09/02/what-i-learned-from-com...). There really hasn't been a concerted effort to see how humans actually do on ImageNet.

I am willing to believe that a group of humans trained on ImageNet and without time constraints would be able to outperform the state of the art neural net models.


> One way to evaluate how far the image processing pipeline has come is to look at research on how well it can classify images. And in that space, in the research, it is doing better than humans.

Well, computer vision is doing better than humans on the ImageNet dataset; that does not mean it is better than humans on a driving image dataset.


> One way to evaluate how far the image processing pipeline has come is to look at research on how well it can classify images. And in that space, in the research, it is doing better than humans [1]. As I've said elsewhere I think LIDAR was a crutch that worked well to cover for weaknesses in classifying images, but I recognize that the crutch may no longer be needed (certainly Tesla and Nvidia are trying to make that case).

I have worked on systems for automatic target detection and classification. Systems have a long way to go before they reach human-level accuracy in classification. Even on "artificial" images like radar return maps humans are still slightly better, and many systems pass the underlying processed data to the human operator for final review. Tracking and classification in a dynamic environment is hard. It's even harder when trying to rely on passive sensors to do it.


ML seems pretty bad at classifying things it hasn't seen before though. There are quite a few examples where an input outside the training data resulted in misclassification.

Humans may not always see a white truck in a snowstorm, but is computer vision going to see it either? Or will it pattern match the few visible parts as something else entirely? Or dismiss the truck entirely as noise?


I don't disagree, both humans and ML are bad at classifying things they haven't seen before[1]. However that reasoning doesn't disqualify either vision only auto driving systems or machine learning.

Both statements are true:

"Computer driven cars may crash, even fatally, when they encounter a situation that they do not recognize." and

"People driving cars may crash, even fatally, when they encounter a situation that they do not recognize."

The success criterion for self-driving cars is that they can drive at least as well, in the common case, as the set of human drivers who are defined to be "good" drivers. And self-driving is not invalidated by a computer's mishandling of an event that a good driver would also mishandle.

I expect that self driving systems will be differentiated by how well they handle the unusual cases so a Mercedes system might do better in an unusual situation than a Chevy system. And all of this discussion is orthogonal to LIDAR :-).

[1] http://puzzlephotos.blogspot.com/


There is a difference though - humans understand the surrounding state, computer vision is not quite there. It can recognize things, and in NVIDIA's case directly generates steering commands without going through the intermediate step of building a model.

Humans build models of the world, and such models allow us to predict the future to a little extent, and explain the reasons behind a situation. Humans can intuit the intentions of other drivers and the behavior of other objects. AI can't do that quite as well.


Also, making eye contact and being waved through. Humans are excellent at reading cues such as this.


You've hit on a key insight. Predicting the future (even by a bit) turns out to be a very powerful learning signal for building models of the world.

It won't work on a traditional feedforward neural network but if you have feedback everywhere it appears to work.
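
A minimal sketch of the idea - train a recurrent net whose only "label" is the next frame, so prediction error itself is the learning signal - assuming PyTorch, with random tensors standing in for real video:

    import torch
    import torch.nn as nn

    frame_dim, hidden = 64, 128
    rnn = nn.GRU(frame_dim, hidden, batch_first=True)
    head = nn.Linear(hidden, frame_dim)
    opt = torch.optim.Adam(list(rnn.parameters()) + list(head.parameters()), lr=1e-3)

    frames = torch.randn(8, 30, frame_dim)  # batch of toy 30-frame clips
    inputs, targets = frames[:, :-1], frames[:, 1:]

    for _ in range(100):
        h, _ = rnn(inputs)
        loss = nn.functional.mse_loss(head(h), targets)  # prediction error is the signal
        opt.zero_grad()
        loss.backward()
        opt.step()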


> The success criterion for self-driving cars is that they can drive at least as well, in the common case, as the set of human drivers who are defined to be "good" drivers. And self-driving is not invalidated by a computer's mishandling of an event that a good driver would also mishandle.

This comes with one important caveat: these are the engineering criteria. The criteria of public perception, unfortunately, may not allow for a computer driver that makes the same mistakes that a human "would have", because people tend to mis-estimate what they or another human "would have" done.


But the computer vision systems can be endlessly improved and merge experience from millions of cars, while human drivers accumulate experience from a single driver, age, and are eventually replaced by younger, inexperienced drivers.

Soon enough these systems will have data from encounters with far more varied situations than any single human will ever be physically able to encounter in a lifetime.


Not sure I see how that would work when there is no 3G signal. If a computer on-board a vehicle sees something it does not recognize when it's not connected to the Tesla HQ, what should it do? And even if it is connected, uploading video over 3G is too slow for the real-time classification needs. Right?


The notion isn't that they phone home to immediately ask about unknown data, but that they periodically feed back any unknown data (or, in the worst case, it gets extracted from a blackbox if a car has an accident), and receive revised models to process the world with.

No, this won't necessarily save you if you see some impossible scenario like a boat cruising down the highway toward you, but it does mean that the list of conditions the model can't respond to reasonably will rapidly trend toward "fewer than most human drivers", in the ideal case.

This will, invariably, have some unfortunate bumps when people discover real-world conditions that, for whatever reason, the model doesn't remotely have responses for (I wonder if they've trained it on e.g. an enormous wall of water, or tornadoes?), but that's why you don't claim it's an always-on self-driving system (e.g. you have to be ready to take over at any point), and arguably the error rate is still going to be lower than most humans to start with.


You need humans to label those "millions of experiences." The bottleneck is not raw video; without human labels, the data is useless.


No, you don't. You need to process enough of it to see how the majority of human drivers act in situations where the automated system currently would react substantially differently.


You've just described end-to-end learning (e.g. the data is raw video/camera/radar/sensor data, the labels are steering angles, etc.).

No autonomous vehicle manufacturer uses end-to-end learning. The only one to claim to use it was Comma.ai, and we all know how that went.

All the autonomous car companies will manually label the camera images - e.g. given an image, draw boxes around where all the cars are.


Learning how to label camera images and how to respond to already detected features are separate issues. You can make the choice of whether to apply unsupervised training to either one separately.

This blog post from Tesla [1] explicitly claims that they are using unsupervised learning as one of their strategies to determine appropriate system responses to specific detected objects in specific locations:

> This is where fleet learning comes in handy. Initially, the vehicle fleet will take no action except to note the position of road signs, bridges and other stationary objects, mapping the world according to radar. The car computer will then silently compare when it would have braked to the driver action and upload that to the Tesla database. If several cars drive safely past a given radar object, whether Autopilot is turned on or off, then that object is added to the geocoded whitelist.

> When the data shows that false braking events would be rare, the car will begin mild braking using radar, even if the camera doesn't notice the object ahead. As the system confidence level rises, the braking force will gradually increase to full strength when it is approximately 99.99% certain of a collision. This may not always prevent a collision entirely, but the impact speed will be dramatically reduced to the point where there are unlikely to be serious injuries to the vehicle occupants.

[1] https://www.tesla.com/blog/upgrading-autopilot-seeing-world-...
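
The whitelisting logic described in the quote reduces to something like this toy sketch. All names here are hypothetical - Tesla hasn't published its implementation.

    from collections import defaultdict

    safe_passes = defaultdict(int)  # geocoded radar object -> count of safe drive-bys

    def record_pass(object_id, driver_braked):
        # Each car reports whether a human drove past the radar return without braking
        if not driver_braked:
            safe_passes[object_id] += 1

    def should_brake_for(object_id, threshold=50):
        # Brake only for radar returns the fleet has not yet whitelisted
        return safe_passes[object_id] < threshold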


Classification is not the right metric to use here. Lidar doesn't classify the objects it's looking at; it just tells you the direction and distance.

Cameras can also gauge distance pretty effectively from parallax. Either using multiple cameras, or from the motion of the vehicle itself, or both. From this it should be possible to gauge where obstacles are and drive safely.
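
A minimal sketch of depth-from-parallax with a calibrated stereo pair, using OpenCV's block matcher; the filenames and calibration numbers are made up:

    import cv2

    left = cv2.imread("left.png", cv2.IMREAD_GRAYSCALE)
    right = cv2.imread("right.png", cv2.IMREAD_GRAYSCALE)

    matcher = cv2.StereoBM_create(numDisparities=64, blockSize=15)
    disparity = matcher.compute(left, right).astype("float32") / 16.0  # fixed-point -> pixels

    focal_px, baseline_m = 700.0, 0.54  # assumed calibration
    depth_m = focal_px * baseline_m / disparity  # valid where disparity > 0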

But NNs give the possibility of gathering much more information from recognizing objects. Information that Lidar systems don't have.


> But NNs give the possibility of gathering much more information from recognizing objects. Information that Lidar systems don't have.

Wouldn't you feed both the depth map from the lidar and imagery from the cameras into the neural network? I imagine that a variety of different sensors as input would make it easier to do classification. As an analogy, someone who has lost their sense of smell might have a harder time telling the difference between a clean sock and a dirty sock than I would.

Please let me know if I'm wrong here, but I assume that the depth information that can be derived from parallax is not a superset of what you get from lidar (I'm thinking about low light, glare, objects with complicated geometries, similar-colored objects obscuring each other, etc).


You could do that. The advantage of using cameras alone is that they are cheaper and simpler, and don't stop functioning during bad weather.


You can use a model that gives you its classification uncertainty - Bayesian SegNet, for example [1]. We may also adapt legislation for how vehicles should look, as we did to make human driving easier (e.g. tail and side lights).

1: https://arxiv.org/abs/1511.02680
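
The core trick in Bayesian SegNet is Monte Carlo dropout: keep dropout active at test time and read uncertainty from the spread of repeated passes. A minimal PyTorch-style sketch, where `model` is any segmentation net containing dropout layers:

    import torch

    def mc_dropout_predict(model, image, passes=20):
        model.train()  # keeps dropout active (in practice, enable only the dropout modules)
        with torch.no_grad():
            probs = torch.stack([model(image).softmax(dim=1) for _ in range(passes)])
        mean = probs.mean(dim=0)                   # averaged per-pixel class probabilities
        uncertainty = probs.var(dim=0).sum(dim=1)  # high variance = low confidence
        return mean, uncertainty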


> Humans may not always see a white truck in a snowstorm, but is computer vision going to see it either?

So you put a bunch of such situations in your training and test datasets. At some point you've covered enough cases to extrapolate to the rest.

Good testing is going to hunt for these blind spots and fix them. Fact is that it's already safer than humans, even with all its hidden imperfections.
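
One cheap way to hunt for such blind spots is to synthetically degrade test images and watch where the classifier breaks. A toy sketch; the degradation model is made up:

    import numpy as np

    def degrade(img, contrast=0.3, snow_prob=0.05, seed=0):
        # Wash out contrast toward white and sprinkle random "snow" specks
        rng = np.random.RandomState(seed)
        washed = img.astype(np.float32) * contrast + (1 - contrast) * 255
        mask = rng.rand(*img.shape[:2]) < snow_prob
        washed[mask] = 255
        return washed.astype(np.uint8)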


What if that point is 20 years from now? What if every time Ford/GM/Toyota substantially changes the look of their cars, your classifier no longer recognizes them because all your data only has the old models in it. That's what people are driving at. Simply collecting more data is not enough to solve this problem.


At a certain point it's just about recognising an object which shares broad characteristics with a car rather than its aesthetics. E.g. it moves at the speed a car moves at, it's in the road, it's overtaking in the right-hand lane. I would expect any autonomous car to be able to fail over to "this object is likely a vehicle I haven't seen before" given a strange car-like object being detected.


Great. Now the problem you've posed is no longer image classification. It's more like video classification or zero-shot classification! (Neither of which is close to solved.)


It doesn't seem like zero-shot classification to me. It still seems like image classification. You said:

> What if every time Ford/GM/Toyota substantially changes the look of their cars, your classifier no longer recognizes them

My answer was probably incomplete, but I took the above to mean that cosmetic changes to vehicles mean that classifiers no longer identify them as cars, and this detrimentally modifies the behaviour of the car.

Whilst it's trivial to envisage a scenario where your problem is solved systemically (sufficient training data for a new chassis released in advance or something), it seems like it would be possible to train based on "things we expect to see from any car".

As far as I know, that's how all of the existing methods operate. They seem to have a hierarchy for decision-making:

0. Is there an object around me which I need to consider? If not, continue to monitor for one whilst operating the vehicle within the parameters of road signs and conditions.

1. Is this an object which has a predictable path based on either its signals or the expected behaviour of a car on this part of the road / operating within the parameters set by the road signs I can see?

2. Is this an object which is operating safely despite not falling into category 1?

3. Is this an object which I need to take action to avoid?

Which is to say that it ought to be possible to "fool" a Tesla with a non-car object behaving in a similar fashion to a car. The Tesla sees an object, not a car.
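
Encoded as a toy rule cascade, purely to illustrate the hierarchy (real planners are nothing this simple, and the attributes here are hypothetical):

    from dataclasses import dataclass

    @dataclass
    class TrackedObject:
        follows_signals: bool
        on_expected_path: bool
        behaving_safely: bool

    def assess(obj):
        if obj is None:
            return "monitor and keep driving"           # 0: nothing to consider
        if obj.follows_signals or obj.on_expected_path:
            return "predictable: plan around it"        # 1: predictable path
        if obj.behaving_safely:
            return "unconventional but safe: watch it"  # 2: operating safely anyway
        return "take action to avoid"                   # 3: must avoid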


"it moves at the speed a car moves at, it's in the road, it's overtaking on the right hand lane" is video classification, which is not solved. In fact, at least how you described it (you could probably change the problem statement to avoid this), this would involve an ML model that must learn a model of physics - also unsolved.

You've just specified a manually hardcoded set of decision rules. This is not machine learning, and is incredibly brittle.


I think we're talking across one another.

I had thought that in your original post you were agnostic about the methodology for identifying a car, but were remarking that, in a world where it's possible to do it using whatever form of classification, it would be possible to 'stump' any reliable model by modifying the appearance of a car. I'm observing that any model for classification almost certainly would not rely on aesthetics.

> You've just specified a manually hardcoded set of decision rules. This is not machine learning, and is incredibly brittle.

I'm pointing this out to illustrate that the technology already deployed to solve this problem does not get confused by aesthetics.


I was talking about deep learning. The comment I was replying to was making the specific problem seem as if it were easy. Certainly there may one day be a classification technique that does what you say it will do. But you may as well have said there will one day be a perfect classification technique that will just perfectly output steering angles, end thread. What use is there in conjecturing about perfect unknown classification techniques? Not to mention that there is no guarantee such a perfect method would not rely on aesthetics. Even if the training set has more than just aesthetics (e.g. video of cars in motion), maybe this perfect classifier would just cheat and rely on aesthetics; you don't know.

So I'm pointing out the methodology you suggested is not currently feasible, or is currently widely considered by the community to be the wrong practical approach. Because theoretical solutions will not solve self driving cars.


> ... humans don't need LIDAR to drive, why should computers?

That being the case, wouldn't we be limiting self-driving technology to the same traffic-related death rates as humans? Maybe 10, 20% better, but still fundamentally close.

For self-driving cars to be truly successful, the death rates will need to be an order of magnitude better. An incremental improvement won't convince governments and the public at large to trust their lives to an algorithm running inside a black box.

To be an order of magnitude better, you'll likely need to go well beyond simply processing pixels, including LIDAR and other sensors.


>Maybe 10, 20% better, but still fundamentally close.

You are asserting that human drivers are essentially perfect, because in 80-90% of their crashes, the information necessary to avoid the collision just isn't available visually.

That seems like an incredibly optimistic view of human drivers.

Collisions happen because a driver does not look at, see, understand, or act appropriately on available visual signals. Or they are going too fast / following too closely for their actions to be effective.


A huge number of traffic deaths are due to alcohol. An autonomous system that's as safe as a sober human would improve safety by a factor of 2 or 3. Many of the other deaths are due to distraction, inattention, or slow reaction times. Get rid of those and you can probably see an order of magnitude improvement with something that is nominally "no better than a human driver."


Would you (while sober) get into a car driven by an autonomous system that was demonstrably more likely to get into a crash than the average sober, awake, healthy driver, but less likely to get into a crash than the average driver?

Honest question.

I don't think I would.


But you can't predict what would happen; even driving healthy and sober, something could happen. Assuming you're in a more developed version of the self-driving cars, machine learning has most likely come a long way since the beginning. The car/network of cars would have learned by then that the command "stay on the right side of the road" doesn't mean "stay on the right side of the road, but it's okay to hit a few cars or pedestrians." They would have learned, or have had programmed into them, that hitting cars or people is not good. Machines don't have a moral sense, and assuming they are not sentient, they don't have opinions - unlike a human, who might be a little rude to a guy he doesn't like. And my last point is that the network of cars, all communicating at once, would learn how to be safest. Done.


Good question. It would depend on the exact numbers, I'd say. I do sometimes ride in (or drive) non-autonomous cars more dangerous than the best drivers, after all.

I don't know how relevant it will be, though. I suspect that the fact that computers are always attentive, can react instantly, and follow the rules consistently will make them much safer very quickly. But we shall see!


Interesting question.

As long as you still have the manual option, it doesn't really matter. You can just get in and drive if you want to.

As the option becomes more common, obviously impaired driving becomes less common.


It matters if, for example, the car is an Uber and you aren't allowed to drive it.


Will Uber still stand when self-driving cars become common?


LIDAR is just another form of seeing - not one we are used to as people - but combined with cameras, the two would complement each other. Relying on only one is a fool's gambit.

LIDAR won't go blind from white trucks on sunny days. LIDAR won't suffer snow blindness or an inability to track in conditions where humans don't see well, like heavy rain at night. You add in visual acquisition to fine-tune what you are detecting if necessary: perhaps to read signs and tell what color the traffic light is, maybe even to see brake lights; to know a floating bag is just that and not a solid object; to see that the road is washed out, and so on.


Actually, in snow, fog, rain, and the like, humans (and cameras) can do pretty reasonably. Things like brake lights and running lights can be pretty severely distorted and you still know the approximate distance to the car in front of you.

Lidar on the other hand is a point source of light (not from the environment) and any distortion makes it less likely for said light to return to the sensor. So with stereo vision (or radar) you can get a relatively accurate distance for a car in front of you. With lidar some fraction of the returns will be bouncing off fog/snow/rain between you and the distant object.

Because of this disadvantage, lidar-based systems might suggest a slower safe speed, and risk being rear-ended by human drivers.
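
The physics intuition, as a toy Beer-Lambert sketch; the extinction coefficients are illustrative only:

    import numpy as np

    def return_fraction(range_m, extinction_per_m):
        # The pulse is attenuated on the way out and back: exp(-2 * sigma * R)
        return np.exp(-2.0 * extinction_per_m * range_m)

    for sigma in (0.001, 0.02, 0.08):  # roughly clear air, light fog, heavy fog (made up)
        print(sigma, return_fraction(50.0, sigma))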


The problem is that humans do primarily (not solely) use vision to drive, but they also have mental models of other drivers. I remember once I was at a red light, and when it turned green, I looked at the oncoming driver (far away) and thought, "that guy is too into his music," and didn't accelerate. Sure enough, he goes right through the red and slams his brakes halfway through the intersection.


One of the central tenets of autonomous vehicles should be that they are BETTER than any human driver could be. Relying on vision because, well, it works OK for humans doesn't cut it in my book.


> The argument against LIDAR is just this in reverse, humans don't need LIDAR to drive, why should computers?

That is a terrible argument: birds don't need ailerons either.


I think you meant to say that birds don't need a vertical stabilizer to change direction. Bird wings have pretty awesome ailerons built into them.


Unless self-driving vehicles drive better than humans - and LIDAR or 360° vision seems a requirement for that - they will not succeed.


An article from yesterday:

http://spectrum.ieee.org/cars-that-think/transportation/sens...

Hopefully this will make LIDAR more economically practical.



