
gnipgnip on Feb 26, 2017 | on: Self-Driving Cars Have a Bicycle Problem

There are deep CNN models that do joint recognition and detection at around 60 fps (at 300x300 input) with reasonable accuracy, around 75%.

This is arguably the most expensive part of the pipeline, but we should have at least two more architectures out from NVDA/AMD before any product comes out.