We released a C/C++ computer vision library (a Python binding is to follow) earlier this month that lets you perform real-time face recognition from IoT or Raspberry Pi devices.
The library is dependency-free and cross-platform, and should compile fine on most modern architectures with a C compiler.
Speaking as someone with limited and theoretical knowledge of computer vision, does CV usually overlap with Optical Character Recognition (OCR)?
I read through the README and don't recall any references to identifying text, so I was wondering whether, and where, a separation between OCR and CV might exist in the development process.
Yes, CV is always the first pass before character extraction. It involves a lot of image-processing routines, including morphological operations like dilation, noise removal, etc., and finally blob detection[1]. All of these preprocessing routines are implemented in SOD.
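To make the pipeline concrete, here's a minimal sketch of the dilation-then-blob-detection step described above. This is purely illustrative and does not use SOD's actual API; it implements 3x3 binary dilation and 4-connected component labeling from scratch with numpy, showing how dilation merges broken character fragments into a single blob before extraction:

```python
import numpy as np
from collections import deque

def dilate(img, iterations=1):
    """3x3 binary dilation: a pixel becomes 1 if any of its 8 neighbors is 1."""
    out = img.copy()
    for _ in range(iterations):
        padded = np.pad(out, 1)
        h, w = out.shape
        shifted = np.stack([padded[y:y + h, x:x + w]
                            for y in range(3) for x in range(3)])
        out = shifted.max(axis=0)
    return out

def label_blobs(img):
    """Connected-component (blob) labeling with 4-connectivity, via BFS."""
    labels = np.zeros_like(img, dtype=int)
    current = 0
    h, w = img.shape
    for sy in range(h):
        for sx in range(w):
            if img[sy, sx] and not labels[sy, sx]:
                current += 1
                labels[sy, sx] = current
                q = deque([(sy, sx)])
                while q:
                    y, x = q.popleft()
                    for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                        if 0 <= ny < h and 0 <= nx < w and img[ny, nx] and not labels[ny, nx]:
                            labels[ny, nx] = current
                            q.append((ny, nx))
    return labels, current

# Two separate strokes with a small gap, as often happens when a
# character's ink is broken by noise or thresholding.
img = np.zeros((7, 9), dtype=np.uint8)
img[3, 1:3] = 1   # left fragment
img[3, 5:7] = 1   # right fragment
_, n_before = label_blobs(img)
_, n_after = label_blobs(dilate(img))
print(n_before, n_after)  # prints "2 1": dilation merges the fragments into one blob
```

In a real OCR front end you would run this on a thresholded grayscale image and then feed each blob's bounding box to the character classifier.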
A bit unrelated, but I'm in need of some sort of cat deterrent that will spray water at all cats except mine. It would also be nice if humans were exempt.
I haven't done the legwork myself yet, but would this be an appropriate project to check out for this?
(As for identifying my cat, I've been thinking about alternative solutions such as an RFID necklace or something, but it is quickly getting past "quick hack" territory.)
Check out the talk "Militarizing Your Backyard with Python: Computer Vision and the Squirrel Hordes"[1] from PyCon 2012. The speaker wired a water cannon to a camera able to tell squirrels from birds:
> Has your garden been ravaged by the marauding squirrel hordes? Has your bird feeder been pillaged? Tired of shaking your fist at the neighbor children? Learn how to use Python to tap into computer vision libraries and build an automated sentry water cannon capable of soaking intruders.
A Bluetooth LE device similar to Tile[1] might be a cheap and reliable signal source in a package small enough to attach to a collar. If you had two Bluetooth receivers on either side of your camera's visual field, you could compare signal strengths during motion and positively identify your cat (assuming feline motion detection is a solved problem).
It might be easier to cheat and put a Bluetooth or similar beacon on your cat to suppress the water spray than to try to distinguish your cat from others.
One of the goals is to create a safe place for my cat, on the porch or something, while still being able to fend off neighbor cats. She is getting old and has had fights on the porch, which makes her scared to go out.
So I guess it depends on how easy it would be to limit the scope and direct the receiver so it only activates when she is on the stairs, which is probably the optimal target range for the porch (and not when she is lying just beside the receiver).
Thanks so much everyone for the answers! Can't believe I forgot about the squirrel presentation. My cat is all black and I'm not sure how I'd go about distinguishing it. A really colorful necklace perhaps, but it would have to be quite ungraceful to not be covered in fur when watched from the front/back (a second camera 90 degrees from the first could perhaps help).
In the Q&A of the squirrel presentation someone also mentioned OpenTLD, which seems to be superseded by CMT ( https://www.gnebehay.com/cmt/ ). Worth a look.
Pyimagesearch is great. I'm just a tinkerer / hobbyist in the computer vision space, but this is a site I come to time and again to get up to speed on topics that just seem harder elsewhere.
If your use case is archival (running this on many videos stored on disk) as opposed to realtime (running on a live video feed), we built a research tool called Scanner [1] for maximum performance in offline video analytics, e.g. see our face detection example [2].
Sounds like a great idea. I would be a little worried about how often I actually am not looking at the window I am typing in. Though, I suspect there are obvious patterns.
I think that is what worries me. For most purposes, this is fine. I do, however, sometimes have a transcription use case. This is fairly uncommon, but I worry that I would get bitten by it being taken away.
Does that make sense? I still think this is worth trying.
Great article. I'm working on a project to identify buildings, and it seems this technique would apply. Does anyone know of a good buildings dataset, or a pretrained network? Finding 3 million images is no easy task... Thanks!
Sorry, forgot to mention this would be for street-level recognition. I don't think aerial views would work for that application, notably because of perspective issues.
[1] https://github.com/symisc/sod