Essentially, it uses gridPositions.json to populate a Voronoi grid (toggle display of the <canvas> in your browser to see it). Your cursor position is then tested against the grid to find which cell encloses it; that cell maps back to the source JPEG, which is then displayed.
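The cell lookup itself is simple: by definition, the Voronoi cell enclosing a point is the one whose seed is nearest, so a linear nearest-neighbor scan over the seeds is enough. A minimal sketch (the `{x, y, image}` entry shape is an assumption, not necessarily the actual gridPositions.json schema):

```javascript
// Minimal sketch: the Voronoi cell enclosing a point is the one whose
// seed is nearest, so a linear nearest-neighbor scan is enough.
// The {x, y, image} entry shape is an assumption about gridPositions.json.
function findImageForCursor(seeds, cursorX, cursorY) {
  let best = null;
  let bestDist = Infinity;
  for (const seed of seeds) {
    const dx = seed.x - cursorX;
    const dy = seed.y - cursorY;
    const d = dx * dx + dy * dy; // squared distance; same ordering, no sqrt needed
    if (d < bestDist) {
      bestDist = d;
      best = seed;
    }
  }
  return best.image;
}

const seeds = [
  { x: 100, y: 100, image: "p1.jpg" },
  { x: 400, y: 250, image: "p2.jpg" },
];
console.log(findImageForCursor(seeds, 120, 90)); // "p1.jpg"
```

With only ~900 seeds, a brute-force scan per mousemove is plenty fast; the canvas Voronoi diagram is mainly useful for visualizing the cells.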
I watched your video. It does not seem to explain how the fingertip position in each photo was calculated, which it seems to me is really the heart of the magic.
Yes, I was wondering the same thing. The difficult engineering problem here is first assigning pointing locations for every photo in the database. Because he didn't explain this step, I guess he just manually assigned each location for every photo.
Not a bad guess but I'd be interested to actually know the details. Making it all by hand would be problematic for reasons beyond the sheer volume. How would one find a deep, diverse set of images of people pointing? How did the creators ensure they had images to cover all possible pixel positions within a certain proximity? It seems like it would take reviewing many more than 900 images to produce a final set of 900 that includes even coverage.
Another guess here, but it could have been crowdsourced using Amazon Mechanical Turk. Assuming a conservative $0.05 per picture, the total cost would be 901 * 0.05 = $45.05.
Yeah, this seems likely. Here's how I'd do it: create a webapp similar to this one to show to your Turkers. The difference would be that it would show an image at random, and the Turker would decide whether or not someone is pointing in the image, and if so they would click where the person was pointing. For each image, take the mean position of mouseclicks. This becomes a seed for the Voronoi diagram.
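The averaging step described above would be tiny (a sketch; the `{x, y}` click shape is an assumption):

```javascript
// Sketch of the aggregation step: average several Turkers' clicks on one
// image into a single seed point. The {x, y} click shape is an assumption.
function meanClick(clicks) {
  const sum = clicks.reduce(
    (acc, c) => ({ x: acc.x + c.x, y: acc.y + c.y }),
    { x: 0, y: 0 }
  );
  return { x: sum.x / clicks.length, y: sum.y / clicks.length };
}

console.log(meanClick([{ x: 10, y: 20 }, { x: 30, y: 40 }])); // { x: 20, y: 30 }
```

Taking the mean also smooths out individual Turkers' imprecision, though a median would be more robust to the occasional stray click.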
You can tell he/she "cheated" a bit for some of the parts of the canvas that didn't get many seeds. For example, if you move your mouse within one of the large cells in the bottom left, the image just moves to keep up with the pointer. :P
You don't need a different picture for every single pixel: if the picture is large enough, you can easily crop it and use it to cover a few thousand pixels.
They could have had help from friends. Also, the design of the application makes it very easy to add new images to the set (just throw a new entry into the JSON file containing the x and y coordinates of the finger, and the script does the rest whenever the user loads the page).
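For illustration, such an entry might look something like this (the field names here are a guess, not the file's actual schema):

```json
{ "image": "some_photo.jpg", "x": 512, "y": 340 }
```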
If anyone doesn't care to explore the DOM themselves, this will show the Voronoi diagram and make the photo on top of it transparent. For Chrome, Ctrl+Shift+J and paste it into the console.
Interesting example of how a hard problem (image recognition/understanding) can be faked for some cases by solving an easier problem (lookup table, a few hours of manual intervention.)
doesn't really work for me with Chrome on Windows 7 - an image flashes for a split-second, then it's back to "Finding pointer" (I'm not even touching my mouse)
Same here. Just flashes the picture really fast in the latest stable release of Chrome under Windows 7. Works beautifully in the latest Firefox though. Windows IE isn't supported at all apparently (I tried IE 9).
Are the developers willing to share some insight into how this works? The picture selection is right on and there seems to be a good number of pictures at that.
It looks like Chrome is firing the "mousemove" event when it shouldn't.
One way to make it work is to right-click where you want the pointer inside the square and then left-click anywhere outside the square.
Looking at the JavaScript, it looks like the error could be patched in Flasher.js at line 72: save the last event.clientX and event.clientY, then check whether the mouse coordinates have actually changed.
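A sketch of that patch (Flasher.js and line 72 come from the comment above; the code itself is illustrative, not the site's actual source):

```javascript
// Work around Chrome firing spurious "mousemove" events by ignoring any
// event whose coordinates match the previous one.
let lastX = null;
let lastY = null;

function shouldHandleMove(event) {
  if (event.clientX === lastX && event.clientY === lastY) {
    return false; // spurious event: the cursor hasn't actually moved
  }
  lastX = event.clientX;
  lastY = event.clientY;
  return true;
}

console.log(shouldHandleMove({ clientX: 5, clientY: 5 })); // true
console.log(shouldHandleMove({ clientX: 5, clientY: 5 })); // false
```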
Fun :). I did something similar with a bunch of photos of a stuffed monkey. I made a music video instead of an interactive web app, though the latter would probably not be too difficult since I wrote the software using Processing. http://sporksmith.wordpress.com/travels-of-code-monkey/
It's probably easier to automate the picture harvesting here via an intensity filter. I'm still stumped about how the finger-recognition was accomplished in the OP.
The images are adjusted. I was able to get the same picture 3 times in a row by moving the cursor slightly, but the finger was always aligned exactly the same on the cursor, even though I had moved it.
Just wasted about 20 minutes of my life playing with this. I would really like to know the technology behind it. Is there some sort of ML going on, or is this human-trained? Either way, very entertaining and scarily accurate.
Something as low tech as having a predefined map of {location : picture} would do it. Break up the screen into a grid of NxN quadrants, like 20x20. Then just find 400 pictures with a finger near each quadrant. Translate cursor to a quadrant x,y; look up picture. Done.
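That scheme fits in a few lines (a sketch; the 800x600 viewport and the image names are made up):

```javascript
// Sketch of the low-tech {location: picture} grid lookup described above.
// The 20x20 grid, 800x600 viewport, and image names are all assumptions.
const COLS = 20, ROWS = 20, WIDTH = 800, HEIGHT = 600;

// One picture per cell, e.g. pictures[row][col] === "img_3_7.jpg".
const pictures = Array.from({ length: ROWS }, (_, r) =>
  Array.from({ length: COLS }, (_, c) => `img_${r}_${c}.jpg`)
);

function pictureForCursor(cursorX, cursorY) {
  const col = Math.min(COLS - 1, Math.floor(cursorX / (WIDTH / COLS)));
  const row = Math.min(ROWS - 1, Math.floor(cursorY / (HEIGHT / ROWS)));
  return pictures[row][col];
}

console.log(pictureForCursor(0, 0));     // "img_0_0.jpg"
console.log(pictureForCursor(799, 599)); // "img_19_19.jpg"
```

A fixed grid gives square cells of equal size; the Voronoi approach is the same idea but lets cell sizes follow wherever the photos happen to point.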
Good "point" -- the right question posted to mturk/crowdflower/etc. could have knocked out that JSON file for relatively minimal time/cost. Oh, the things we humans engage in.
That's what I assumed. If I just moved my mouse a couple pixels I usually got the same pointing image back. I think they just divided it up into regions and then load the appropriate image for each region.
However, if that's all they're doing, it seems odd that it takes so long to load... Maybe some of that's just a built-in delay to keep from constantly cycling images from someone who's bumping the mouse.
The delay might be to build up suspense. The two–three seconds it spent loading I spent thinking "how could something that's essentially xeyes possibly take this long… ohhh haha very cute"
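If the pause really is an anti-thrash measure rather than suspense, a standard debounce would produce it (a sketch; the 500 ms value and handler name are made up, not taken from the site):

```javascript
// Debounce: only run the handler once the cursor has settled for delayMs,
// so jittery mousemove events don't cycle images constantly.
function debounce(fn, delayMs) {
  let timer = null;
  return function (...args) {
    clearTimeout(timer);
    timer = setTimeout(() => fn(...args), delayMs);
  };
}

// Hypothetical usage:
// window.addEventListener("mousemove", debounce(showPointingImage, 500));
```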
There are serious problems around the upper right corner. I did not get any results from there. I'm wondering whether the pictures originally point at that coordinate, or whether they are resized and translated to some offset?
You'll have to move the cursor left a bit, to get the same image, but with visible finger. The images are indeed offset so that in the upper right corner the finger actually doesn't fit in the frame.
One suggestion is to serve jQuery from your own domain instead of jquery.com, for two reasons:
1: It's one click less for those who use NoScript.
2: It will allow your site to survive jquery.com changing anything, or going down.
What is the point of this beyond showing pictures where the finger is next to my cursor? (Why is this popular?) It's not more interesting than a radius search, which you could implement with a simple R-tree.
Haha, very clever. Although I seem to be running into problems whenever I put the mouse into one of the 4 corners of the screen. It can find anything more centrally located, though (apparently?)