Well done. I have worked on animating 2D images with a hand-constructed depth map in the past, to a decent 3D effect[1]. The illusion, guided by your mouse as in the OP, seems much stronger than in the left-right strafing I did.
One point that can be worked on, as pointed out by other people here, is the occluded pixels (this is an issue when an image has sharp changes in distance from the camera). You simply can't show what the camera never saw, so you have to fake it. In the pre-baked animations I did, the occluded pixels were duplicated and then fixed by hand in Photoshop.
Perhaps the solution here would be to capture and store images from several perspectives. These could then be used to generate the depth map with whatever algorithms Lens Blur uses, and also to interpolate the occluded pixels when viewing the photo from different perspectives.
Surprisingly well faked, given that there's no real 3D information in the pictures.
There simply can't be new 'pixels' appearing that were hidden behind objects in the camera's line of sight before the effect was applied. This is evident on closer inspection (look at edges around the approximate middle of the DOF). The trompe-l'œil then falls apart a bit.
Interestingly, I did not notice this with the iOS 7 background parallax; I wonder why. Special images, or a stricter constraint on movement?
It's exactly that - a single image, and a depth map calculated from a series of shots made while moving the camera upwards. Both are bundled in the fake-bokeh image. There are obviously no extra pixels. However...
If you choose your subject wisely, the effect can be pretty believable using a simple displacement map (a rough sketch of the idea follows below).
On iOS, though, it's something different. There's no depth map there, just a flat image moving counter to your hand movements.
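To make the "simple displacement map" above concrete, here is a toy version in Python with numpy/Pillow - just an illustrative sketch, not what depthy actually ships. It assumes a 0-to-1 depth map at the same resolution as the photo; the file names and scale factor are invented.

    # Naive displacement-map parallax: shift each pixel by the virtual camera
    # offset, scaled by its depth value (no hole filling, nearest-neighbour).
    import numpy as np
    from PIL import Image

    def parallax_frame(image, depth, dx, dy, scale=20.0):
        h, w = depth.shape
        ys, xs = np.mgrid[0:h, 0:w]
        # Sample "backwards" so every output pixel gets some value; disoccluded
        # areas simply reuse nearby texture, which is the smearing people notice.
        src_x = np.clip((xs - dx * scale * depth).astype(int), 0, w - 1)
        src_y = np.clip((ys - dy * scale * depth).astype(int), 0, h - 1)
        return image[src_y, src_x]

    img = np.asarray(Image.open("photo.jpg").convert("RGB"))
    dep = np.asarray(Image.open("depth.png").convert("L")) / 255.0  # 0 = far, 1 = near (assumed)
    Image.fromarray(parallax_frame(img, dep, dx=0.7, dy=0.2)).save("frame.png")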
It sounds like there are two issues here. One is how the depth map is generated, and the other is how the resulting image file is formatted. For the former, several still images are collected while the camera is moving, which provides the parallax needed to generate the depth map. For the latter, I don't know, but it would certainly be possible to bundle both the depth map and multiple "layers" of photograph that could be used to actually expose new pixels in the background when the foreground moves.
There is an app for iOS - seene.co. The amount of movement you have to do to capture enough pixels is prohibitive for my taste. I think Google has nailed it - it's super simple.
As for storing the layers - you would only have the "from above" pixels, and only a few. That's probably why there is only Lens Blur in their app in the first place.
If you just want a small displacement effect like on depthy, then the key is no sharp edges in the depth map. We will try to tackle this in one of the upcoming updates...
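For anyone experimenting with this themselves: softening the depth map before displacing it is just a blur pass. A minimal Pillow sketch (the radius is a guess you would tune per image; not necessarily what depthy does):

    from PIL import Image, ImageFilter

    depth = Image.open("depth.png").convert("L")
    # A wide Gaussian hides hard depth discontinuities, trading halo/ghosting
    # artifacts at object edges for a gentler, smeared displacement.
    soft = depth.filter(ImageFilter.GaussianBlur(radius=8))
    soft.save("depth_soft.png")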
Now that you've explained the iOS 7 trick, it's obvious (given the environment: a background-filling image responding to physical device movement, as opposed to an image on a website).
There are more pixels captured than in a single-shot image, as you said: "a series of shots made upwards". So it captures some pixels that are hidden when the camera moves upwards, but if the simulated parallax is bigger than the original camera movement, there will still be missing pixels. This could probably be improved by making bigger movements with the camera, as with other 3D reconstruction software.
Indeed -- there are a bunch of spots where, if you look closely, you can see the foreground texture being "repeated" in the background texture when you've moved your cursor to the edge. I wonder if a better kind of pixel-filling-in algorithm (a rough sketch below) could make this more believable, or if the problem lies in the inherent inaccuracies of the depth map.
But still, as a whole it's remarkably effective. I'd like to be able to appreciate it without moving my mouse -- I wonder what kind of camera movement function, at what speed, would make for the most unobtrusive effect that would still give the perception of depth?
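On the pixel-filling-in idea, one cheap thing to try (purely a guess on my part, not what the demo does) would be to mask the pixels near strong depth discontinuities and run an off-the-shelf inpainting pass over them, e.g. with OpenCV:

    import cv2
    import numpy as np

    img = cv2.imread("photo.jpg")
    depth = cv2.imread("depth.png", cv2.IMREAD_GRAYSCALE)

    # Pixels near strong depth edges are the ones that get disoccluded when the
    # virtual camera moves, so mark a band around them as "unknown".
    edges = cv2.Canny(depth, 50, 150)
    mask = cv2.dilate(edges, np.ones((7, 7), np.uint8))

    # Telea inpainting fills the masked band from the surrounding texture,
    # giving something more plausible than repeating the foreground.
    filled = cv2.inpaint(img, mask, inpaintRadius=5, flags=cv2.INPAINT_TELEA)
    cv2.imwrite("background_filled.png", filled)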
> I'd like to be able to appreciate it without moving my mouse
Oculus Rift?
Not just different left/right angles for each eye, but you could also rotate the angles by tilting your head. That would be a spectacular way to view still photos in VR.
Given that it only takes panning the image slightly and adjusting the parallax a bit, I think that's probably doable. It would be the modern-day version of this: http://en.wikipedia.org/wiki/Stereoscopy
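Since the displacement is just a per-pixel horizontal shift, a stereo pair falls out of the same trick with opposite signs. A toy Python sketch that composes a red/cyan anaglyph from an image plus depth map (the depth convention and the scale are assumptions):

    import numpy as np
    from PIL import Image

    def shift_view(image, depth, eye_offset, scale=12.0):
        # Horizontal displacement proportional to depth (0 = far, 1 = near, assumed).
        h, w = depth.shape
        ys, xs = np.mgrid[0:h, 0:w]
        src_x = np.clip((xs - eye_offset * scale * depth).astype(int), 0, w - 1)
        return image[ys, src_x]

    img = np.asarray(Image.open("photo.jpg").convert("RGB"))
    dep = np.asarray(Image.open("depth.png").convert("L")) / 255.0

    left, right = shift_view(img, dep, -1.0), shift_view(img, dep, +1.0)
    # Red/cyan anaglyph: red channel from the left view, green and blue from the right.
    anaglyph = np.dstack([left[..., 0], right[..., 1], right[..., 2]])
    Image.fromarray(anaglyph).save("anaglyph.png")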
From my understanding there actually is 3D information reconstructed via SfM (structure from motion) from the hand jitter. Only a few points, and of questionable quality, but enough for the foreground extraction (for blurring).
Yeah, it appears to be a regular JPG with a depth map added to the metadata. Google originally uses this for refocusing, but the parallax is a neat trick (a rough extraction sketch is below).
Also, as the depth map only has a single depth value per pixel, you can get aliasing around the edges of the objects in focus, which end up with a halo around them.
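If anyone wants to poke at the metadata themselves, here's a crude sketch of how you might pull the depth map out. It assumes the layout I believe the Lens Blur files use - the depth image base64-encoded as GDepth:Data inside Google's extended XMP, spread over several APP1 segments - so details may differ between app versions:

    # depth_extract.py -- pull the depth map out of a Lens Blur JPEG (rough sketch).
    import base64
    import re
    import struct
    import sys

    EXT_XMP = b"http://ns.adobe.com/xmp/extension/\x00"

    def app1_payloads(data):
        """Yield the payload of every APP1 (0xFFE1) segment before the image data."""
        i = 2  # skip the SOI marker
        while i + 4 <= len(data) and data[i] == 0xFF:
            marker = data[i + 1]
            length = struct.unpack(">H", data[i + 2:i + 4])[0]
            if marker == 0xE1:
                yield data[i + 4:i + 2 + length]
            if marker == 0xDA:  # start of scan: no more metadata segments
                return
            i += 2 + length

    def extended_xmp(data):
        """Reassemble the extended-XMP blob from its chunks (GUID + size + offset header)."""
        chunks = []
        for payload in app1_payloads(data):
            if payload.startswith(EXT_XMP):
                body = payload[len(EXT_XMP):]
                offset = struct.unpack(">I", body[36:40])[0]  # after 32-byte GUID + 4-byte size
                chunks.append((offset, body[40:]))
        return b"".join(chunk for _, chunk in sorted(chunks))

    data = open(sys.argv[1], "rb").read()
    match = re.search(rb'GDepth:Data="([^"]+)"', extended_xmp(data))
    if match:
        # The decoded blob is an image; its actual format is given by GDepth:Mime.
        open("depthmap.png", "wb").write(base64.b64decode(match.group(1)))
        print("wrote depthmap.png")
    else:
        print("no GDepth:Data found -- the layout must be different")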
It might be helpful to separate the code that creates the parallax effect from the Angular app that displays the website & examples. This is a really cool trick and it would be nice to be able to easily reuse it. For those looking, it seems like the "magic" happens in a couple of spots (I think):
Has anyone managed to replicate the lens blur effect used in the new Android camera app? Or at least know what research paper it's based on?
Nice work! Are you doing any 'content-aware fill' style magic on the invented occluded pixels?
I wonder what else can be done with the depth data... Fog is the obvious one (a toy sketch follows below). You could potentially insert elements into the scene that know when they're behind foreground items.
I'm sure some kind of point cloud or mesh could also be derived, but I'm not sure how good it would be.
Funnily enough I nearly posted to /r/android/ earlier "I wish you could save the generated z-buffer" - it didn't occur to me to actually look!
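On the fog idea - once you have the depth as a float array, it's basically a per-pixel blend toward a fog colour. A toy sketch (exponential falloff; the constants and the 0-to-1, near-is-1 depth convention are invented):

    import numpy as np
    from PIL import Image

    img = np.asarray(Image.open("photo.jpg").convert("RGB")).astype(float)
    depth = np.asarray(Image.open("depth.png").convert("L")).astype(float) / 255.0

    # Fog thickens with distance: depth 1.0 is right at the camera (clear),
    # depth 0.0 is far away (mostly fogged).
    fog_color = np.array([200.0, 210.0, 220.0])
    density = 2.0
    fog = 1.0 - np.exp(-density * (1.0 - depth))  # 0 = clear, ~0.86 = far
    out = img * (1.0 - fog)[..., None] + fog_color * fog[..., None]
    Image.fromarray(out.astype(np.uint8)).save("foggy.png")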
As for the fill, if the depth map is of good quality and without sharp edges, there is no need for that, unless you go berserk with the scale of displacement.
Damn this is cool. Even though there's some smearing to make up for the lack of information, especially behind the railings on the deck, the effect is striking, and subtle enough to be believable at a casual glance.
This could be an effective differentiator for a real estate web site featuring property photos. Or even an online store where product photos are done in this way.
Interesting that on the shelf image, it's the slightly diffuse reflections of the shelved items in the wood grain that break the illusion for me.
No, because “blurriness” (low local contrast) may indicate something besides a particular depth.
Consider, for instance, a head-on photograph of a print of a shallow-focused photo. The region that print occupies will have plenty of variation in contrast, yet exist at a single depth. Also, consider that blurring increases both in front of and behind the plane of focus; how could we tell which depth the blurring indicated?
Something similar to what you suggest is, however, done in software autofocus, which can take repeated samples at different focal distances to clear things up. Maybe that’s something to think about, e.g. for a static subject. http://en.wikipedia.org/wiki/Autofocus#Contrast_detection
Yup, there's no simple way to recover depth from blur: there will be featureless regions where you can't tell if there was any blur -- in other words, there is no universal way to tell if a region has gone through a low-pass filter or if it is natively low frequency.
Would heuristics work well? I can think of a handful, but none really good.
I had the same thought... how this effect could enhance porn picture sites. Given the dominance of "tube" porn sites, I doubt it will be as big as it could have been when picture sites dominated. However I could still see some niche sites using it to "get a better look" or on-page to attract clicks.
As I understand it, the depth map is created by slowly moving the camera while taking images of the same objects. It probably wouldn't work as well with a moving subject (even small movements).
[1] http://fooladder.tumblr.com/