Restoration of defocused and blurred images (2012) (yuzhikov.com)
215 points by lobo_tuerto on Oct 18, 2015 | 46 comments



Many aspects of CV and signal processing in general produce almost magical results. It's a little less magical when you think back to stats examples of fitting a curve to some signal + (known) noise process. For linear models, the noise term can get really bad before simple regression stops working. Least squares in one form or another finds lots of use in image processing.
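
To make that concrete, here's a minimal numpy sketch (my own toy, not from the article): ordinary least squares pulling a cubic back out of heavily noisy samples.

    import numpy as np

    # Toy version of the stats picture: fit a cubic to samples buried in noise.
    # Ordinary least squares still recovers the coefficients reasonably well.
    rng = np.random.default_rng(0)
    x = np.linspace(-1, 1, 500)
    true_coeffs = [2.0, -1.0, 0.5, 0.3]                # cubic ... constant term
    y_noisy = np.polyval(true_coeffs, x) + 1.0 * rng.standard_normal(x.size)

    fit = np.polyfit(x, y_noisy, deg=3)                # least-squares polynomial fit
    print("true:", true_coeffs)
    print("fit: ", np.round(fit, 2))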

Images are just functions, and almost all of the mathematical techniques that you would use to analyze one domain hold for the other. Certain perturbations such as camera jitter are easier to deal with as "undoing" them is tantamount to assuming some underlying regularity/structure on the signal and filtering accordingly. Others, such as removing an occlusion, are harder. Humans do it well thanks to our power of inference, learned from the litany of visual examples that we take in over the course of a lifetime. It's not trivial getting an algorithm to visualize what should have been behind the person who photobombed your insta, but we do it somewhat naturally.

For occlusions and finding relationships between partially overlapping scenes, really interesting things are happening with deep learning. For noisy images, techniques continue to improve. Compressed sensing and signal recovery are an active area of research that has already paid huge dividends in many fields, especially medical imaging. I can't wait to see what becomes possible in the next five years. And, as has been noted - this article is dated. There are already more powerful techniques using deep learning for image super-resolution and deblurring.


How does an upscaler in your TV work? How does it create information out of nothing?

Imagine that you are training a deep neural network on a huge number of movies. You have access to all of the lovely Hollywood movies. You downscale them to 480p, and then train a deep neural network to upscale them again, perhaps working on 16x16 blocks of the image.
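
A minimal sketch of that training loop, using PyTorch with random tensors standing in for real frames (the architecture and numbers here are purely illustrative):

    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    # Toy version of the setup above: downscale "high-res" patches, then train
    # a small conv net to predict the high-res patch back from the low-res one.
    class TinyUpscaler(nn.Module):
        def __init__(self, scale=2):
            super().__init__()
            self.scale = scale
            self.body = nn.Sequential(
                nn.Conv2d(3, 32, 3, padding=1), nn.ReLU(),
                nn.Conv2d(32, 32, 3, padding=1), nn.ReLU(),
                nn.Conv2d(32, 3, 3, padding=1),
            )

        def forward(self, lr):
            up = F.interpolate(lr, scale_factor=self.scale, mode="bilinear",
                               align_corners=False)
            return up + self.body(up)          # learn a residual on top of bilinear

    model = TinyUpscaler()
    opt = torch.optim.Adam(model.parameters(), lr=1e-3)

    for _ in range(200):
        hr = torch.rand(8, 3, 32, 32)          # stand-in for high-res blocks
        lr_patch = F.avg_pool2d(hr, 2)         # crude downscale, like going to 480p
        loss = F.mse_loss(model(lr_patch), hr)
        opt.zero_grad()
        loss.backward()
        opt.step()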

It works amazingly well, and looks like magic.

Maybe the pores on an actor's face weren't visible in the 480p downscale, but your model can learn to reproduce them faithfully.

Sony has access to billions of movie frames at extremely high resolutions. Their engineers are definitely using this large amount of information to create statistical filters which upscale your non-HD content, or maybe your HD content to 4K. These filters work better than the deterministic methods in this article. Why? Because the filters know much more about the distribution of the source (the distribution of values of each individual pixel). They have exact information where one would otherwise have to assume it (the author of the article assumed that something in the source - be it noise or something else - behaves according to a Gaussian). If you know how to find the proper distribution, instead of assuming it, you can move closer to the information-theoretic limits.
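
Stripped down to its simplest form, such a "statistical filter" can even be a purely linear map fitted by least squares. A toy numpy sketch, with a random array standing in for Sony's frame library (so the learned filter itself is meaningless here; the mechanics are the point):

    import numpy as np

    # Learn, by least squares, a linear map from a low-res patch to the
    # high-res pixels it came from, using pairs harvested from high-res material.
    rng = np.random.default_rng(0)
    scale, patch = 2, 4                                  # 4x4 LR patch -> 8x8 HR patch

    hr = rng.random((256, 256))                          # stand-in for a 4K frame
    lo = hr.reshape(128, 2, 128, 2).mean(axis=(1, 3))    # 2x box downscale

    X, Y = [], []
    for _ in range(2000):
        i = rng.integers(0, lo.shape[0] - patch)
        j = rng.integers(0, lo.shape[1] - patch)
        X.append(lo[i:i + patch, j:j + patch].ravel())
        Y.append(hr[i * scale:(i + patch) * scale,
                    j * scale:(j + patch) * scale].ravel())

    W, *_ = np.linalg.lstsq(np.array(X), np.array(Y), rcond=None)  # 16 -> 64 map
    upscaled = (X[0] @ W).reshape(patch * scale, patch * scale)    # apply to one patch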

Just imagine how fast these filters can be if you put them on an FPGA. It also explains why TVs sometimes cost more than $2k.

If you knew that your images would only contain car registration plates, you could definitely learn a filter that would be very precise at reconstructing the image when zoomed. You'd now find CSI zooming a little bit more realistic :D


> you could definitely learn a filter that would be very precise in reconstructing the image when zoomed

Yes, your result would be a very clear image of one possible license plate. An algorithm may be able to do slightly better than a squinting human, but ultimately you can't retrieve destroyed information.


Of course you can't retrieve destroyed information.

But information is not destroyed by perfectly unpredictable (uniform) distributions of noise.

> Yes, your result would be a very clear image of one possible license plate.

Fortunately there are methods of evaluating how well your statistical filter works; if it's precise enough, you'd be fine with the nondeterministic nature of your filter. Or even better, you could generate all of the highly probable licence plates, instead of having only the one given by your deterministic algorithm.


From what we know about the software on these TVs, upscaling there works using whatever code the cut-rate programmer could 1) google and 2) efficiently integrate into the product with minimal fuss.


Not really. Upscaling is hardware, not some ridiculously slow software solution.

https://community.sony.co.uk/t5/blog-news-from-sony/inside-4...

Upscaling is very much state-of-the-art technology, not some layman's solution.

There are firms that specifically targeted upscaling as their product and made millions with their state-of-the-art tech. Currently upscaling is on the rise again with 4K TVs. Back in the day they made some incredible chip solutions and sold them at a premium to Sony, Samsung and the like. Sony realized they could, with all of their resources (a super-HD movie database), make incredible upscalers.

Just imagine that Sony stored the whole movie The Walk (distributed by Sony Pictures) in your TV in 4K resolution; the moment this movie is displayed on your screen through some lower-resolution source, they find it in the database and display the 4K content. Of course, that's highly inefficient and memory intensive, so instead they use statistical models to efficiently store movie material and fast chips to quickly approximate the real upscale.

If the sample (the number of movies) is large enough, this will then work well on all movie content.


Note that this works for out of focus photos, not for enlarging tiny details of in focus photos.

They may look similarly blurry, but are mathematically very different.


How's that? Can't you think of a pixel as the average value of its subpixels (= blur)?


In focus-blurred images, the blur spreads the information about certain pixels throughout a region of the image. (Objects appear larger and fainter because of this when they are blurred. It's why you can shoot through a mesh-wire fence with a wide lens aperture, for example, and not see the fence.) This 'spread out' information is being used to reconstruct an approximation of the original image.

However, when images are downscaled, all the information from the 'subpixels' is collapsed within the reduced-resolution image pixel and replaced with a single RGB value. There is no 'spread out' information from those pixels. The 'blur' only applies to a region with sharp edges that exactly coincides with the unit of information. So none of the data from the subpixels remains. To reconstruct the pixels that were within it, you essentially have to guess based on the context of the surrounding pixels.


If the information that belongs in one pixel is spread out over 50x50 pixels, according to a known mathematical formula, you can reconstruct it pretty well.

If you just have one pixel with the average of what it covers, then that is all you have.


You can! The problem in that case is that it's not a convolution (blur) per se. Each pixel is an average of its own subpixels, P(1) = (S1(1)+S2(1)+...+Sn(1))/n, P(2) = (S1(2)+...+Sn(2))/n -- but as you can see, no subpixel contributes to more than one pixel, unlike in a convolution, where neighbouring output pixels share inputs. That is, in the deconvolution case there are roughly as many equations as unknowns, whereas in the upscaling case you're creating many unknowns out of a single observed value.

To estimate those subpixels you're going to be forced to make additional assumptions; if you assume they are independent you would simply estimate Sk(j) = P(j). The traditional (easiest) assumption is that the image is somewhat "bandlimited" -- it does not have many variations (frequencies) faster than once per pixel (so no variation at or finer than the subpixel scale). If this were the case, you could reconstruct the subpixels perfectly [1], save for some noise. But this is not always the case (and fails spectacularly when you have edges), resulting in upscale blur. So simple linear upscaling algorithms work by reaching a compromise between blur and edge enhancement.
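
To make the band-limited point concrete, here's a 1-D numpy sketch (my own toy, using plain decimation and ignoring the box-average prefilter): a low-frequency signal survives 4x downsampling and ideal sinc resampling essentially perfectly, while a step edge does not.

    import numpy as np

    N, f = 256, 4
    t = np.arange(N) / N
    smooth = np.sin(2 * np.pi * 3 * t) + 0.5 * np.cos(2 * np.pi * 7 * t)  # low freqs only
    edge = (t > 0.5).astype(float)                                        # a sharp edge

    def upsample_fourier(x_lo, f):
        """Ideal sinc interpolation via FFT zero-padding (Nyquist-Shannon [1])."""
        X = np.fft.rfft(x_lo)
        X_hi = np.zeros(len(x_lo) * f // 2 + 1, dtype=complex)
        X_hi[:len(X)] = X
        return np.fft.irfft(X_hi, n=len(x_lo) * f) * f

    for name, x in [("band-limited", smooth), ("step edge", edge)]:
        rec = upsample_fourier(x[::f], f)       # keep one sample per "pixel", resample
        print(name, "max error:", float(np.max(np.abs(rec - x))))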

If you want to do better though, you have to use non-linear kernels and have good underlying models for your image content. A promising approach is to use machine learning/NNs: http://engineering.flipboard.com/2015/05/scaling-convnets/

[1] https://en.wikipedia.org/wiki/Nyquist%E2%80%93Shannon_sampli...


"Subpixels" are an artifact of the display type you're using. Although there may be individual colour components of an image pixel, they aren't restricted to a region of that pixel. And in most digital images - those made with a colour filter array, as opposed to a multi-chip, multi-shot, or Foveon-type sensor - the colours at each individual pixel are just an educated guess based on the surrounding pixel values; the RGB values are not captured separately, and attempting to increase resolution depending on which part of your screen is lit up by how much in order to render the colour that should appear at the corresponding pixel can only introduce artifacts. (If green is brighter than red in a particular pixel, it only means that green is brighter than red, not that the detail is laterally displaced.)


I'm the author of this article (and of SmartDeblur, http://smartdeblur.net/). Thank you for posting!

Also, you can look at the second part, which describes practical issues and their solutions: http://yuzhikov.com/articles/BlurredImagesRestoration2.htm And if you have any questions, feel free to ask me.


Wow, that's very impressive, and very un-intuitive (to me) that it's possible at all.

So, who is the first to dig out some pictures that were 'redacted before publication' and can be de-obfuscated this way?


Law enforcement have large libraries of images of child sexual abuse. Sometimes the abusers appear in the photo but blur their faces. There's probably some work happening to identify those abusers.

Here's one famous example https://en.wikipedia.org/wiki/Christopher_Paul_Neil

There's a bunch of other image processing stuff that can be done. Identifying location from sparse clues is important. Identifying wallpaper patterns or Coca-Cola bottle labels gives clues.


> Wow, that's very impressive, and very un-intuitive (to me) that it's possible at all.

Indeed, there is no free lunch. This is only possible because the restoration algorithms are making some rather strong assumptions about the original picture. For example, they assume that every blurry area in the blurry picture has a corresponding sharp edge in the original picture.

It works fairly well on pictures with a lot of sharp edges such as the ones in the article: buildings, text, etc.

I'm guessing that it wouldn't work as well on natural landscapes or human faces, because they don't usually have edges like this.

For example, in this restored picture,

http://hsto.org/storage2/9d8/554/c15/9d8554c156e63a213797502...

the sky in this picture contains edges that do not exist in the original picture. The algorithm is trying to find edges where there are none. This is a very visible impact of this assumption.


These really de-focused images are interesting for the article, but if you can take a slightly out-of-focus image and make it laser-sharp then it is extremely useful for all kinds of photo and video applications. It's really tough to fix a shot that was slightly off when you're going for professional quality, where you can't tell it was sharpened.


Highly agreed! I would be really curious to see if someone could take a slightly defocused picture and run it through this to see what it would produce / if it would introduce a bunch of undesirable effects or not.


I have tried the referenced implementation (SmartDeblur) on a few (similar) slightly de-focused images. From memory, I got artefacts without any noticeable improvement. I don't think I even saved the results. YMMV and I would almost certainly try it on photographs with a larger blur than mine (approximately a disc kernel with a radius of 0.8px). Since then I've been working on other methods with mixed results.

I've found a few things out. Firstly, many articles mention some form of Richardson-Lucy with regularization as if deconvolution with a known point spread function were a solved problem. The regularization can indeed produce better results, but usually introduces new parameters. I've found that, with real photographs, I have had far better results using less statistically sound methods. This is especially true as I've found that RL, more often than not, begins to diverge quite early.
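
For anyone who wants to reproduce that baseline, the unregularised version is a few lines with scikit-image on a synthetic example (known PSF, mild noise; names here are just for the sketch):

    import numpy as np
    from scipy.signal import fftconvolve
    from skimage import data, restoration

    # Blur a known image with a known PSF, add a little noise, then run plain
    # (unregularised) Richardson-Lucy. More iterations sharpen more but also
    # amplify noise and ringing, which is where regularised variants come in.
    rng = np.random.default_rng(0)
    image = data.camera() / 255.0
    psf = np.ones((5, 5)) / 25.0                        # 5x5 box blur as the known PSF
    blurred = fftconvolve(image, psf, mode="same")
    blurred = np.clip(blurred + 0.001 * rng.standard_normal(blurred.shape), 0, 1)

    restored = restoration.richardson_lucy(blurred, psf, 30)   # 30 RL iterations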

The reality is that there are papers surrounding deconvolution and blind deconvolution that get apparently good results and work poorly in practice. It is only once implemented that the flaws of the algorithms proposed are exposed. This is compounded by the fact that we have a wonderful error metric with this inverse problem: the euclidean distance between 1. the solution convolved with the point-spread function used for deconvolution; and 2. the original (blurred) image. With many algorithms, lowering the euclidean distance decreases the perceived quality by introducing more artefacts. Nonetheless, I have only seen a handful of papers that include this metric when considering a proposed regularisation scheme, and this is one of the only times when this particular metric makes any sense in computer vision -- if your deconvolution is working, and your PSF is correct (which it is in many cases as the blur is synthetic) then the euclidean distance is the measure you want.

I have since switched to least squares using L-BFGS-B for fitting, which has further lowered the Euclidean distance and produced the best, most natural-looking results by far on real photographic images. Unfortunately, estimating the PSF is difficult for small out-of-focus blurs. Optics applied to an ideal lens would suggest a disc kernel, but at this scale other factors are playing a part.
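
For a rough idea of what that least-squares formulation can look like (a sketch of my own, not the code described above), here is SciPy's L-BFGS-B minimising the reblur error mentioned earlier, with the pixel values box-constrained to [0, 1]; it assumes an odd-sized, centred, known PSF and ignores boundary effects:

    import numpy as np
    from scipy.optimize import minimize
    from scipy.signal import fftconvolve

    def deconvolve_lsq(blurred, psf, iters=100):
        """Find x minimising ||conv(x, psf) - blurred||^2 with x in [0, 1]."""
        shape = blurred.shape
        psf_flip = psf[::-1, ::-1]                    # adjoint of the convolution

        def objective(x_flat):
            x = x_flat.reshape(shape)
            residual = fftconvolve(x, psf, mode="same") - blurred
            grad = 2.0 * fftconvolve(residual, psf_flip, mode="same")
            return float(np.sum(residual ** 2)), grad.ravel()

        res = minimize(objective, blurred.ravel(), jac=True, method="L-BFGS-B",
                       bounds=[(0.0, 1.0)] * blurred.size,
                       options={"maxiter": iters})
        return res.x.reshape(shape)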

I'll probably do a writeup of this when I'm done -- though I don't know when that will be. I'll also be putting any code on Github in due course. In the meantime if anybody has any pointers or is interested in further details or source code please grab my email address from my profile.


It's neat that this is possible. It was demonstrated in the 1960s, but nobody could afford the CPU time back then.

The intensity range of the image limits how much deblurring you can do. If the range of intensities is small, after blurring, round-off error will lose information. Also, if the sensor is nonlinear or the image is from photographic film, gamma correction needs to be performed before deblurring.
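
A minimal sketch of that linearisation step (a plain power-law approximation; real sRGB uses a piecewise curve):

    import numpy as np

    # Linearise before deconvolving, re-encode afterwards.
    def to_linear(img8, gamma=2.2):
        return (img8.astype(np.float64) / 255.0) ** gamma

    def to_gamma(linear, gamma=2.2):
        return np.clip(linear, 0.0, 1.0) ** (1.0 / gamma) * 255.0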


I've used Marziliano's[1] blur metric to reject images with motion blur recently. It is very fast and quite accurate in distinguishing blurred and non-blurred images.

[1] http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.7.9...
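
For the curious, a much-simplified sketch of the idea behind that metric (based on my reading of the paper, not the commenter's code): find strong vertical edges, measure how many pixels each edge takes to ramp between its local extrema, and average.

    import numpy as np
    from scipy import ndimage

    def blur_metric(gray, edge_thresh=0.2):
        """Average width (in pixels) of vertical edges, measured horizontally as
        the distance between the local luminance extrema straddling each edge
        pixel. Larger values roughly mean a blurrier image."""
        gray = gray.astype(float)
        gx = ndimage.sobel(gray, axis=1)                   # responds to vertical edges
        edges = np.abs(gx) > edge_thresh * np.abs(gx).max()

        widths = []
        for r, c in zip(*np.nonzero(edges)):
            row, s = gray[r], np.sign(gx[r, c])
            left = c
            while left > 0 and (row[left - 1] - row[left]) * s < 0:
                left -= 1                                  # walk to the local extremum
            right = c
            while right < len(row) - 1 and (row[right + 1] - row[right]) * s > 0:
                right += 1
            widths.append(right - left)
        return float(np.mean(widths)) if widths else 0.0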


That looks really useful. Could you provide a link to your/an implementation?


Note: 2012.

As some people have mentioned in the comments already, there's been great work since on blur estimation at SIGGRAPH (particularly for motion).


Wondering what would happen if you applied this after enhancing (upsampling) an image three times with something like the magic kernel shown here:

http://www.johncostella.com/magic/


FYI: the magic kernel isn't taken all that seriously by people in the field of signal processing (almost my field). I'm always surprised that it keeps popping up.

One of the many articles on the internet explaining why the magic kernel isn't really that magic:

http://cbloomrants.blogspot.com/2011/03/03-24-11-image-filte...



Strangely enough, the author only mentions total variation denoising in passing as a feature of SmartDeblur. I would say this method is one of the most common, especially when your image has sharp transitions and lots of solid regions of color (e.g. pictures of buildings). I wrote what is effectively one of the two fastest TV denoising algorithms and implementations out there: https://github.com/tansey/gfl

The way to use it in image processing is to basically think of each pixel as a node in a graph, with edges to its adjacent pixels. Then you apply a least squares penalty with an additional regularization penalty on the absolute value of the difference between neighboring pixels. The result is regions of constant pixel color.
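
Not the fused-lasso solver linked above, but if you just want to see the effect, the same idea is available as a one-liner in scikit-image (toy example):

    import numpy as np
    from skimage.restoration import denoise_tv_chambolle

    # A flat square plus noise; the TV penalty on differences between
    # neighbouring pixels pulls the result back toward constant regions.
    rng = np.random.default_rng(0)
    clean = np.zeros((64, 64))
    clean[16:48, 16:48] = 1.0
    noisy = clean + 0.2 * rng.standard_normal(clean.shape)

    denoised = denoise_tv_chambolle(noisy, weight=0.15)   # larger weight = smoother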


It wasn't too clear but I wonder if the author was referring to deconvolution under a Total Variation prior -- this is a little different to deconvolving and then applying TV denoising or just applying TV denoising.

Either way, the results of overdoing it with TV are the same: cartoony images with large regions of constant colour. The difference is that incorporating TV within iterative deconvolution reduces some compression artefacts and removes some of the ripples around large discontinuities shown in the author's pictures.


I agree that the staircasing effect is definitely the biggest drawback of Total Variation. In the "Smoothed" picture the noise is removed but the results are blocky.

The first way to deal with it is to take into account higher powers of the differences, e.g. using a linear combination of p-norms or a Huber function.

The second way is to take into account second order differences. This promotes piecewise affine instead of piecewise constant functions. You can go further and look at third order differences, but the improvement is minimal.
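
As a rough illustration (a sketch of my own; delta below is exactly the kind of new parameter mentioned next), here is what those penalties look like on a 1-D signal:

    import numpy as np

    def tv_penalty(x):
        return np.sum(np.abs(np.diff(x)))            # first-order: piecewise constant

    def huber_tv_penalty(x, delta=0.1):
        d = np.abs(np.diff(x))                       # quadratic near 0, linear in the tails
        return np.sum(np.where(d <= delta, 0.5 * d ** 2, delta * (d - 0.5 * delta)))

    def tv2_penalty(x):
        return np.sum(np.abs(np.diff(x, n=2)))       # second-order: piecewise affine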

Other than being more complex, the biggest downside is that all of these methods have some new parameter(s) to tune.


> Other than being more complex, the biggest downside is that all of these methods have some new parameter(s) to tune.

They're so fast to run, though, that just doing warm starts and a huge solution path (or grid, in the case of additional penalties) with a BIC selection criterion is a pretty decent way to auto-tune the parameters.


This is a very old link. The same feature has since landed in Photoshop.


And Photoshop is expensive and is now rent-ware, which means Photoshop is a non-starter, so the feature might as well not exist for me.

I've been looking for this article ever since I saw the link the first time it was posted, so I am glad to see it! Plus, it's interesting, and you get source code (unlike Photoshop)


This reminds me of MaxEnt, which could be mentioned, too. http://www.maxent.co.uk/example_1.htm


Deconvolution is a very cool subject! A sizable portion of my dissertation was dedicated to applications of linear algebra tricks to deconvolution and denoising procedures for image restoration.

https://etd.ohiolink.edu/ap/10?0::NO:10:P10_ACCESSION_NUM:ke...


So all those TV shows where someone just clicks "enhance" were accurate after all?


Well, yes and no. I had exactly the same thought as I was reading the article, but the bit about noise is important: in the presence of even a tiny bit of noise, the de-blur algorithm collapses pretty rapidly. And the "enhance!" scenes in TV and movies are of images that are likely to be noisy.
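
As a tiny illustration of that collapse (my own toy, not from the article): blur a sharp 1-D signal, add 0.1% noise, and compare plain inverse filtering with a Wiener-style regularised inverse.

    import numpy as np

    # Plain inverse filtering divides by near-zero frequency responses and
    # explodes; the regularised inverse stays sane.
    rng = np.random.default_rng(1)
    N = 256
    signal = np.zeros(N)
    signal[100:140] = 1.0                               # a sharp "object"

    d = np.minimum(np.arange(N), N - np.arange(N))      # circular distance from 0
    kernel = np.exp(-0.5 * (d / 3.0) ** 2)              # Gaussian blur, sigma = 3
    kernel /= kernel.sum()

    H = np.fft.fft(kernel)
    blurred = np.real(np.fft.ifft(np.fft.fft(signal) * H))
    noisy = blurred + 1e-3 * rng.standard_normal(N)     # 0.1% noise

    naive = np.real(np.fft.ifft(np.fft.fft(noisy) / H))
    k = 1e-3                                            # rough noise-to-signal ratio
    wiener = np.real(np.fft.ifft(np.fft.fft(noisy) * np.conj(H) / (np.abs(H) ** 2 + k)))

    print("naive max error: ", float(np.max(np.abs(naive - signal))))   # enormous
    print("wiener max error:", float(np.max(np.abs(wiener - signal))))  # modest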

They're also often un-detailed due to pixelation/low resolution as much as from blur, which violates this algorithm's assumptions, so that would be another reason that "click enhance" wouldn't work.


A friend has a software product that does image deblurring. His focuses mostly on motion blur. Depending on the type of motion, it can work quite well.

https://www.blurity.com/


Can you measure the blur that I see when I do not wear glasses, and apply the inverse function to an image on, say, my computer monitor, so that I can see it clearly without glasses? What sacrifices would be made -- dynamic range?


Are there security implications? If the last example was a blurred out license key or address for instance, this technique might be able to restore it.


Blurring or mosaicing is not very secure at all [1]. To be safer, block out sensitive information with solid black. (Just remember not to leak any information in the metadata, like an embedded thumbnail.)

[1]: https://dheera.net/projects/blur


Woah, lesson learned. Thanks for the link.


Who takes pictures that are that crappy anymore?


Automated processes, quick snapshots that were taken under insufficient control, pictures taken from moving platforms or of moving objects... Maybe the picture was intentionally taken with other things in focus than the ones you are interested in now.

Honestly, how your comment came to be escapes me. Can you seriously not see the utility in those approaches? Do you really think blurry pictures are "not a thing anymore"?


99% of the public using their crappy mobile phone cameras.


Surveillance cameras.


telescopes?



