Lytro Lightfield Gallery (lytro.com)
158 points by ideamonk on May 29, 2011 | 41 comments



Could this be used for completely automated extraction of objects based on their focus range? I think it might be possible with a clever algorithm that analyzes the sharpness of all these layers. I don't know if this technique also extends to moving images, but if so, maybe it could also be used to composite them automatically, without the need for a green screen at all. Basically, you would be separating the image layers based on distance instead of chroma.
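A rough sketch of the sharpness-analysis idea (purely hypothetical; it assumes you can already render an aligned stack of refocused slices from the light field as an (n_slices, H, W) grayscale array):

    # Label each pixel with the slice in which it is sharpest, then cut out
    # everything whose sharpest slice falls in a given depth range.
    import numpy as np
    from scipy.ndimage import laplace, uniform_filter

    def depth_index_map(stack, window=9):
        # Local sharpness per slice: smoothed squared Laplacian response.
        sharpness = np.stack(
            [uniform_filter(laplace(s.astype(float)) ** 2, size=window) for s in stack]
        )
        return sharpness.argmax(axis=0)  # index of the sharpest slice, per pixel

    def layer_mask(stack, near, far):
        # Binary matte for pixels whose sharpest slice lies in [near, far).
        idx = depth_index_map(stack)
        return (idx >= near) & (idx < far)

Whether that is robust enough for clean compositing (edges, semi-transparent regions) is another question, of course.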


There is a nice talk by Levoy. He is trying to build a motion camera:

Why is Sports Photography Hard? (and what we can do about it)

http://research.microsoft.com/apps/video/dl.aspx?id=146906

So for sports photography it seems very useful. Replacing the green screen doesn't seem to make sense.

For macroscopic objects, depth reconstruction with two cameras, as in the Kinect, seems the better alternative.

Object extraction seems like a nice idea. I can't remember having read anything about it. In principle, it should be possible to recover an unsectioned stack of images.

One can then use an iterative algorithm to subtract the in-focus information of one slice of the stack from each of the other slices and end up with a deconvolved stack of sectioned images.

Then one could delete an object in the stack and recalculate a superposition of the blurred sectioned images to recover a reconstruction of the scene without that object.

This is quite complicated. Just imagine removing a wine glass from a scene: one would need to delete all the rays that went through the wine glass and re-bend them as though the glass weren't there.

One can argue that polarization and absorption effects will be very hard, or even impossible, to handle correctly.

Certainly light fields contain A LOT of potential.



This is pretty much the same technology that Adobe demoed back then. Todor Georgiev and Chintan Intwala also refer to Levoy et al. in their papers, e.g. http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.104...


The initial applications of this are interesting, but it's what comes AFTER that will be really cool.

This is the capture part of the capture and display of true 3D images.

What do I mean by 'true'? Imagine a screen that works like a window.

If you think of a window or a mirror as a display screen, you can imagine that every point on the screen is a tiny hemispherical lens; light exits the screen in all directions through these lenses. By producing light in every direction (as opposed to just perpendicular to the screen + diffusion), you could let your eye decide what to focus on. Additionally, such a system would be view-angle agnostic, so you could look from the side and see a wider 'view' into the scene (and this works for n viewers at once).

Such a display would be complex to implement, but even if you had one you'd need image capture such as Lytro is providing to make it work.

Exciting times!



Hacked a Python script together to parse this and generated an HTML file from the data. Still not sure when to change the frame; the whole thing is off by a few pixels.


Should I post it? I'm not sure about the legal side. I'm from Germany, where reverse engineering is allowed in some cases; not sure if this is covered here.

So here is the parser script. The example.html is very messy, sorry for that. https://gist.github.com/997861


This is HN! Post the darn thing!!


Surely those are just standard depth-maps as used every day in VFX for compositing? Each pixel has a z-depth as to how far it is from the camera, and using standard compositing software (Nuke) you can blur just a slice (depth-wise) of the image based on the z ranges?
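For what it's worth, the basic z-slice trick is easy to sketch outside of Nuke too (a minimal, hypothetical version, assuming img is an (H, W, 3) float image and z an (H, W) depth map in the same units as the range):

    # Keep pixels inside [z_near, z_far] sharp, blur everything else.
    import numpy as np
    from scipy.ndimage import gaussian_filter

    def blur_outside_range(img, z, z_near, z_far, sigma=5.0):
        blurred = np.stack(
            [gaussian_filter(img[..., c], sigma) for c in range(img.shape[-1])], axis=-1
        )
        in_focus = ((z >= z_near) & (z <= z_far)).astype(float)[..., None]
        return in_focus * img + (1.0 - in_focus) * blurred

A real defocus would scale the blur radius with distance from the focal plane rather than using a single sigma, but the idea is the same.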


Presumably the data format is based on their light field viewer: http://lightfield.stanford.edu/aperture.swf?lightfield=data/...

(note this downloads 12 MB of data)

You can look at the source: http://lightfield.stanford.edu/aperture.html


Anyone know anything else about the company? Founders, investors, etc? The only thing I could dig up is that Manu Kumar has Lytro's Twitter account on his "portfolio" list [1] and that the domain was, interestingly, created in 2003. Formerly known as "Refocus Imaging".

EDIT: And a few job listings [2] [3].

[1]: https://twitter.com/#!/ManuKumar/portfolio/members

[2]: http://www.indeed.com/q-Lytro-l-Mountain-View,-CA-jobs.html

[3]: http://www.jobnum.com/Manufacturing-jobs/296905.html


Ren Ng, a former student of Marc Levoy's (at Stanford), started the company. Ng was widely believed to be one of the rising superstars in computer graphics/computational imaging when he decided to leave academia and start Refocus Imaging.

The company seems to be doing well, and recently changed names to Lytro in order to not be pigeonholed into refocusing applications only.


A bit off-topic, but do you think that's the right approach? A niche field like that can be really good for a company, and if you want to broach another field, don't you think it's a good idea to start a separate brand and image for that field instead of generalizing away from your niche?

McDonald's bought Chipotle but didn't rename it "McDonald's", nor did they rename McDonald's "Chipotle". McDonald's is for hamburgers and Chipotle is for burritos, even if the money ends up in the hands of the same people.

It is almost never beneficial to merge existing brands (unless one of the brands has a horrible reputation), so why should someone generalize away from a successful niche application and lose the associated branding instead of just starting a new brand for the new field?


I don't know -- I'm a computer vision researcher, not a marketer ;-)


And thank God for both those facts. Too many of one and not enough of the other.


Couldn't you do something like this with Kinect? Depth information is all one needs to get a plausible "software focus" effect. Not sure how effective it would be outdoors though.


You could hack up something very roughly like it with Kinect, but it wouldn't work as well; it'd be a neat hack, not actually usable for anything serious.

Kinect's image capture is very low resolution (even by today's cellphone camera standards), and it doesn't give you depth information at per-pixel resolution. Even ignoring those issues, in addition to depth information you also need a source image that is critically sharp across the entire viewing range (you can't selectively focus in software what was captured out of focus by a standard digital sensor), which means using a very small aperture (large f-stop value). So it would be very difficult to capture anything but still-life images, because the small aperture means a long exposure time, and thus motion blur if anything moves. Granted, this is already less of an issue with Kinect because its sensor is so tiny that out-of-focus areas are not much of a concern, but the cost of that is that the image resolution is also atrocious.

Once you get up to usable sensor resolutions, if you're already limited to taking long exposures of still-life images on a tripod, you might as well skip the IR depth perception and just take a series of wider-aperture pictures at different focus planes, focus-stack the results, and analyze the blur levels across the series to work out the relative depths of the in-focus parts of each source image. At least that way you can use a DSLR and get quality photos.

Neither of these is a true replacement for what they are doing here, though.


For the Kinect to work, you need a surface on which the infrared spot pattern can be imaged. For most of the things we look at, the Kinect will be sufficient.

However, one captures a lot more information with the light field camera. For example, transparent things like smoke, fog and glass, and things with unusual optical properties like polished steel or tiger's eye with its chatoyancy, will be captured by this camera.

This gives the photographer tremendously more artistic space. Just imagine photographing a close up of an eye with the Kinect technology.

One can argue that the light field camera will retain an image quality one could never achieve with Kinect-based systems (without a lot of Photoshopping).


Their method captures a light field instantaneously at the expense of spatial resolution. They place a microlens array where the film would be, followed by a sensor. Each microlens forms a disc on the sensor that encodes the angular light distribution. Developed in Marc Levoy's group: http://graphics.stanford.edu/papers/lfcamera/
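The refocusing step itself can be sketched with plain shift-and-add (not their actual implementation; this assumes the raw data has already been unpacked into sub-aperture views lf[v, u, y, x], and that alpha is the relative focal depth, with alpha = 1 leaving the original focal plane):

    # Shift each sub-aperture view in proportion to its angular offset, then average.
    import numpy as np
    from scipy.ndimage import shift

    def refocus(lf, alpha):
        n_v, n_u, H, W = lf.shape
        cv, cu = (n_v - 1) / 2.0, (n_u - 1) / 2.0
        out = np.zeros((H, W))
        for v in range(n_v):
            for u in range(n_u):
                dy = (v - cv) * (1.0 - 1.0 / alpha)
                dx = (u - cu) * (1.0 - 1.0 / alpha)
                out += shift(lf[v, u], (dy, dx), order=1)
        return out / (n_v * n_u)

Restricting the loop to a subset of (u, v) gives a smaller synthetic aperture, i.e. more depth of field, from the same data.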

I guess Ren Ng is behind the company as he was first author: http://graphics.stanford.edu/~renng/

This method works nicely for photography, where all the dimensions involved are much bigger than the wavelength. I work on something like that for fluorescence microscopes, and I can tell you it is much harder when you have to consider wave optics.

Here is a related talk: http://www.youtube.com/watch?v=THzykL_BLLI


The idea has been around since the turn of the century. Lippmann proposed it [1]. But the first guy to BUILD a light field camera was a Russian in 1911 [2]. He built a light field camera AND display using a copper plate drilled with 1200 pinholes, and reconstructed the light field of a lamp.

BTW, your comment illustrates that much of "computational photography" is just re-application of tricky imaging tech from other fields. Nothing wrong with that, but something to keep in mind.

[1] http://www.tgeorgiev.net/Lippmann/index.html

[2] http://www.futurepicture.org/?p=34


Cool! I had only heard of Lippmann's holograms before. And now I would really like to see the 3D pinhole camera!


Of course they're not really holograms - they deal on the "pencil" ray level and not the wave optics level - even so... seeing the first 3D reconstruction of a light field (the apparent image of the lamp) must have been totally thrilling.

Film still has many advantages for plenoptic stuff. It's a large, single-use sensor.


I was referring to Lippmann holograms, or interference colour photography ([1]).

His technique from 1891 preceded the first "real" photographic film and is based on a standing wave pattern in the photographic emulsion.

This hologram only works from one viewing angle but it can be reconstructed with white light.

The holograms needed too much integration time to be of practical use. And apparently they are still very colourful.

[1] http://holography.co.uk/archives/Bjelkhagen/Belkhagen.pdf


If you're interested in this and you have an iPhone, you could try this app: http://sites.google.com/site/marclevoy/

It captures video while you move the iPhone camera and combines the frames into an image that looks like it was captured with a large aperture.

I don't have an iPhone and have never seen this app in real life, though.


My coworker went to Google I/O and got a Galaxy Tab 10.1 with an Android app that matches that description. I can't take another look or get its name over the weekend, though.

It also had a feature that automatically saved the frames of a video in which a face was smiling.


Tried it just now and it is merely a "technology demo" rather than a proper app. Having played with it for 20 minutes I wasn't able to make a single image that was worth saving, but still it was a buck well spent.


What they are doing is simply grabbing frames from DSLR video: a short 1-2 second recording with manual focus shifting from one subject to the other, saving a number of frames ripped out of that short clip.


This wouldn't work with the picture of the bird flying http://lytro.com/gallery/content/lytro_50_00087.php


On the other hand, it does look like they're interpolating between discrete frames. In their fourth example (http://lytro.com/gallery/content/lytro_50_00090.php), take a look at the piece of confetti to the top left of the middle head when the focus slider is about 1/3 of the way up from the bottom. It has a sharp edge, but there's also a halo around it from a frame where it's out of focus.


The images are created by interpolation from an image consisting of discs. The raw data would look like this: http://www.eps.hw.ac.uk/~pf21/projects/page1/assets/good-fnu...
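Unpacking such a raw image is straightforward in the idealized case (hypothetical sketch: it assumes a square microlens grid perfectly aligned with the pixel grid and exactly p x p pixels under each lens, whereas real raw files need calibration for lens centres, rotation and packing):

    # Turn a raw plenoptic image of discs into a 4D light field lf[v, u, y, x].
    import numpy as np

    def raw_to_lightfield(raw, p):
        H, W = raw.shape
        ny, nx = H // p, W // p
        lf = raw[: ny * p, : nx * p].reshape(ny, p, nx, p)
        # Angular coordinates (position under each lens) first,
        # spatial coordinates (which lens) last.
        return lf.transpose(1, 3, 0, 2)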


Kind of interesting. Somebody has to do it, I guess. But it's not as flashy as they think - refocus the picture? Or just focus right the first time, I guess. Zoom? Enough megapixels and what else? Nothing, I suppose.

We've seen some really interesting stuff on HN about tracking through crowds, reconstructing images from fragments, etc. If these folks can do anything like that, they aren't showing it.


It's extremely interesting, for two groups of reasons: (1) creative possibilities (playing with depth of field is my favorite part of photography); (2) market applications.

Regarding (2), if I understand the paper properly, this should allow for a massive increase in lens quality while also allowing for lenses to be much smaller. Both are worth tons of money... together, it's massive. As a guy who carries around a $1900 lens that weighs 2.5 lbs, because of the creative options it gives me that no other lens can, this appeals to me greatly!


Big lenses are always better, as they collect more light. If one uses a 10-megapixel sensor in a light field camera, one will have to reduce the resolution of the output image by roughly a factor of 10 in each direction (depending on what microlenses one chooses).
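To make the trade-off concrete (rough, assumed numbers only, with 10 x 10 pixels behind each microlens):

    sensor_px = (3650, 2740)                      # ~10 megapixels
    p = 10                                        # pixels per microlens, per axis (assumed)
    out = (sensor_px[0] // p, sensor_px[1] // p)  # (365, 274)
    print(out[0] * out[1] / 1e6)                  # ~0.1 megapixels in the output image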


Before I read the paper, I was adamant about "more light is better." But read the paper:

"We show that a linear increase in the resolution of images under each microlens results in a linear increase in the sharpness of the refocused photographs. This property allows us to extend the depth of field of the camera without reducing the aperture, enabling shorter exposures and lower image noise."

You're right that you still need good, small sensors to enable good, small lenses, but my ultimate point is that digital camera sensors scale with advances in silicon. Lens technology is much, much slower to advance. The more of this we can do in software (and thus, silicon) the better.


Related to this, and presumably not common knowledge, is that there was recently a breakthrough in camera CMOS sensor technology. A few companies now offer scientific cameras (price tag around 10,000 USD, for example http://www.andor.com/neo_scmos) that allow readout at 560 MHz with 1 electron per pixel of read noise, as opposed to 6 electrons per pixel in the best CCD chips at 10 MHz.

This means one can use the CMOS in low-light conditions and at extremely fast frame rates (the above camera delivers 2560 x 2160 at 100 fps). You will actually see the Poisson noise of the photons.

Unfortunately, the few company representatives I spoke with don't seem too eager to bring these sensors to mobile phones.


Sounds like you guys should be writing the marketing prose for those guys.


Refocusing the picture after the fact isn't just about being able to focus "right" after the fact, so it's not fair to compare it with just "focusing right the first time". With normal cameras, when you focus, you lose information, and it is simply not possible to do what they do in their demos, namely, to capture a continuous range of focal planes. (With several cameras or more than one lens, you can capture multiple discrete focal planes, but not a continuous range like this.) This is only possible because they're capturing 3-dimensional information about where each object is.

Granted, their demo isn't impressive, and they're underutilizing their technology - honestly I can't think of a better demo either - but don't be misled. This light field camera is capturing far more information. Meaningful information. I wonder if it's possible to, say, create 3D models of objects in these images? That would probably be more "computational camerawork". What's impressive is that it could be done after the fact.


How about macro photography? When you get really close to something, your depth of field shrinks. This led to the invention of focus stacking. If instead of having to take four pictures you can now take only one, you can capture incredible things.

https://secure.wikimedia.org/wikipedia/en/wiki/Focus_stackin...
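Focus stacking itself is easy to sketch (assuming stack is an aligned (n_slices, H, W) array; a real implementation would also handle colour and registration):

    # Classic focus stacking: for every pixel, keep the value from its sharpest slice.
    import numpy as np
    from scipy.ndimage import laplace, uniform_filter

    def focus_stack(stack, window=9):
        sharpness = np.stack(
            [uniform_filter(laplace(s.astype(float)) ** 2, size=window) for s in stack]
        )
        best = sharpness.argmax(axis=0)                        # sharpest slice per pixel
        return np.take_along_axis(stack, best[None], axis=0)[0]

The point of the light field camera is that all those slices come out of a single exposure instead of four separate shots.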


This is indeed an interesting topic. Especially with high numerical aperture objectives, incredible things are possible.

One can put a diffractive optical element in front of the sensor and obtain 25 instantaneous images, each at a different depth.

http://waf.eps.hw.ac.uk/research/live_cell_imaging.html

Couple this with high-resolution techniques and you're at the current research front, possibly able to answer the question:

What happens in a synapse?

It is theoretically possible to image at 40 nm resolution at 100 fps and therefore see the transport of vesicles.

One can expect important discoveries on how the brain works using these techniques.


Marc Levoy (and his group) do all these things. However, if one wants to track through crowds, one needs a much bigger aperture, so one would combine many small cameras into an array, as opposed to a microlens array in front of the sensor.



