For those who think it's just another lame DL based instagram filter...
The method proposed in the paper (https://arxiv.org/abs/1707.03491) mimics a photographer's workflow: from taking the picture (image composition) to post-processing (traditional filters like HDR and saturation, but also GAN-powered local brightness editing). In the end it also picks the best photos (aesthetic ranking).
Selected comments from professional photographers at the end of the paper are very informative. There's also a showcase of model-created photos at http://google.github.io/creatism
My ex-wife is a professional photographer. She does weddings, boudoir, and portraits mostly.
Here's the thing. If she had say, 2 shoots in a week, maybe a total of say 8 hours of shooting time, the selection, editing, and styling of those photos could take 3x or 4x as long as the time spent shooting. She would have to be editing constantly.
I've always thought a perfect application for this sort of technology would be a model trained on that particular photographer's style, which goes through a batch of photos and selects the best candidates and then presents the user with a few choices for each photo it selected and styled.
I think the ultimate would be a NAS with this capability embedded. The photographers I have met through her over the years seem savvy about photo tech, but don't generally use online storage solutions or know much more than how to admin a WP site at best. SOHO solution would be ideal imo.
An on-prem storage solution specifically targeting these features at professional photographers would sell.
Storage is always a problem with every photographer I have met. They keep stacks of external USB drives or burn archives to DVD. They don't trust or want to use online storage.
A package that provided these models, trainable on that photographer's particular style, with autocropping, watermarking, etc. would sell very well among this group.
I bet you could make millions. Or even tens of millions. Great for a start-up. But Google doesn't bother with such small fry---especially if it comes attached to a complicated hardware business model.
At these small numbers, they are more likely to just give away access to the software for free.
Oh totally, this isn't something I would see Google doing, and I agree that Photos is definitely a more sensible place for this to start showing up for their consumers en masse. This would definitely be a targeted startup-type product.
If you had a system to:
- batch ingest new photos
- stylize them to your own style
- categorize them
- autoselect candidate photos for final tweaking
- back up final album photos to local storage and S3
- archive to Glacier the rest of the photos from the shoot that will likely never be touched again
You would have a product that I think would be very attractive to professional and semi-professional photographers.
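A rough sketch of the "categorize, autoselect, archive the rest" steps from the list above, just to make the idea concrete. Everything here is hypothetical: the `Photo` record, the `score` field (standing in for whatever aesthetic model you trained on the photographer's style), and the `triage` helper are made up for illustration, and the actual backup to local storage, S3, or Glacier is left out.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Photo:
    name: str
    category: str  # assumed to come from some classifier, e.g. "portrait"
    score: float   # hypothetical aesthetic score in [0, 1], higher is better


def triage(photos, keep_per_category=2):
    """Split a shoot into (candidates for final tweaking, photos to archive)."""
    by_category = {}
    for photo in photos:
        by_category.setdefault(photo.category, []).append(photo)

    candidates, archive = [], []
    for group in by_category.values():
        # Best-scoring photos first within each category.
        ranked = sorted(group, key=lambda p: p.score, reverse=True)
        candidates.extend(ranked[:keep_per_category])
        archive.extend(ranked[keep_per_category:])
    return candidates, archive


shoot = [
    Photo("IMG_001", "portrait", 0.91),
    Photo("IMG_002", "portrait", 0.22),
    Photo("IMG_003", "portrait", 0.55),
    Photo("IMG_004", "detail", 0.70),
]
candidates, archive = triage(shoot)
print("tweak:  ", [p.name for p in candidates])
print("archive:", [p.name for p in archive])
```

The candidates would go to the photographer for final tweaking and on to the album backup; the archive list is what would be shipped off to cold storage.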
I thought of a cool application. You could use this to generate desktop photos for desktop environments that are either localized to a place you are near, or the user could type in a location that they want to see.
Or an actual photography class app where a user could tell Google exactly where they are standing on Street View and have the app tell them the best direction/angle/settings to take their pic and walk them through WHY it is as such.
It can't do the latter, except in a very basic way of pointing to proximity to the training set. ML is great at imitating, but good luck getting it (or the people programming it) to be able to articulate any of the underlying principles. This is a perennial problem in the arts, which might be why the examples on this page are pretty ho-hum from an aesthetic point of view.
Yeah, it's more akin to Auto Tone/Auto Contrast/Auto Color in Photoshop: often a quick and easy way to tweak levels/curves on a photo that's already decent, but much more varied results on anything with mixed or more atypical colors and levels of light.
Why would they say that? You can't have HDR with a single exposure.
I think you're trying to say professional photographers don't apply tone mapping to single exposures, but that is false. There are times when it's not 'technically proper' to apply tone mapping to a single exposure, but it makes for a better picture.
Really it sounds like you're trying to make a snarky comment referencing how some professional photographers complain about novices over using HDR on flickr. But who cares. You can have a professional quality photograph that uses what some pros think is incorrect tone mapping. Just like how you can have a novice hack together a working professional-level product that many professional engineers think is trash. If it works and meets the criteria, who cares.
HDR worthy of the name seems to require multiple levels of light gathering, to me. Either exposure time or aperture needs to vary. If you don't, you either blow out on bright light, or you have too much noise in dim areas. You might not notice if the scene is well and reasonably evenly lit though.
You never vary aperture when taking bracketed exposures to merge to HDR. The depth of field would vary from exposure to exposure, making things inconsistent when you merge them. It's shutter speed you vary.
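To put numbers on that, here is a small sketch of how a bracket works out when only the shutter speed changes. The function name and defaults are made up for illustration; the only fact it relies on is that one stop (1 EV) is a doubling or halving of exposure time at a fixed aperture and ISO.

```python
def bracket_shutter_times(base_seconds, ev_step=2.0, frames=3):
    """Shutter times (in seconds) for a symmetric bracket around base_seconds.

    Aperture and ISO are assumed fixed; each EV of difference doubles or
    halves the exposure time.
    """
    if frames < 1 or frames % 2 == 0:
        raise ValueError("frames must be a positive odd number")
    half = frames // 2
    return [base_seconds * 2.0 ** (ev_step * k) for k in range(-half, half + 1)]


# A -2 / 0 / +2 EV bracket around 1/125 s.
for t in bracket_shutter_times(1 / 125, ev_step=2.0, frames=3):
    print(f"{t:.4f} s")
```

For a 1/125 s base this gives roughly 1/500 s, 1/125 s, and 1/31 s (a camera would round the last one to 1/30 s), with the depth of field identical in every frame.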
My point is that with 2005-era DSLRs, I had to use HDR techniques (or strobes) to, say, correctly expose a room and the environment out a window. Shooting into the sun, I'd have to use HDR, polarizers, and/or ND grads to expose both the sky and the foreground. With my D750? I've got 4 stops more dynamic range and the situations where I need HDR have almost all gone away. I can get the same thing from a single RAW exposure.
These days, HDR only seems necessary if you want to have ridiculous tone mapping where you pull in your global highlights to be within a stop or two of your global shadows and still keep the noise levels down.
Was there any control for, e.g., the dynamic lighting filter being the factor that carried most of the water, rather than the other manipulations?
FWIW, I did think your tool did a good job finding good compositions within mundane scenes. Looking at the originals, I have to admit that in many or even most of the cases, I wouldn't have spotted the opportunity - which points to an area I can improve my craft.
Still lame, just cropping and increasing saturation/contrast. Some of the pictures in the blog post look worse than the originals, they are artificial.
Thanks! Interesting work, indeed. I found the showcase photos appealing and nice. I briefly skimmed through the paper with hope to find some interpretation of the “aesthetics”, but couldn't find it. What is meant under “all aesthetic aspects” and “aesthetic quality”? Did you perform a critical analysis of the network? How do you know that the network learned “aesthetics”?
When a topic like self-driving vehicles comes up, the Hacker News crowd is mainly in favor: “Creative destruction! Disruption! Go go gadget robots!” Not surprising. How many Hacker News readers drive trucks or taxis for a living? How many regard commuting as an enjoyable hobby?
Photography, on the other hand, is a very common hobby in the tech community. And the comments here seem to reflect that this effort strikes a little close to home: “Those pictures are lousy, if you find them appealing you have no taste! Just because they're 'professional' doesn't mean they're good! Machines can’t replace human judgment, they have no soul! I bet that machine had a lot of human help!”
Tech people may tell you great stories about meritocracy and reason, but in the end we are just emotional monkeys. Like the rest of humanity.
Those of us who can accept this may at least aspire to be wise monkeys.
Ha, I'm one of those techies you're talking about. Spent an embarrassing amount on camera gear, and I'm always "that guy" with the ridiculous rig at someone's casual party.
But you know what? I love this shit. I like photography for the good photos, not because I'm building my self esteem on top of it. I want to capture the scene or the moment in a way that, IMO, does it justice. If this technology makes that easier, and gives it to more people - great. Progress.
And what I really love is how it subtracts a huge amount from the price of entry. It's like when software synths came out and made it possible to make music without $10k+ worth of MIDI hardware. What followed? An explosion of creativity which I've been luxuriating in ever since. Get the ability to create beautiful images in as many hands as possible, says I. For me, at least, "more beautiful images in the world" was the whole idea.
Honestly, this "creative destruction" has already to some extent entered the photography field. You can take very capable, often professional looking photographs for the vast majority of circumstances with just your phone.
There are still many situations where more professional expertise is required (portrait photography is probably one of the best examples, knowledge of proper lighting is still required for best results). But I've gotten the impression that overall demand for photographers -- and pay -- is slower these days, largely due to technological advances. (See articles like this: http://www.nytimes.com/2010/03/30/business/media/30photogs.h...)
In a way, this is fine though -- as you say, it gives power to more people, which is great.
Per the parent post's point, to be honest, I perceived much of the criticism to be more that they really couldn't think of a good way to use it in their own hobby level work. I personally can't think of a good use case. On the other hand, the most useful application I can think of for this is more in reverse: social media companies with huge image libraries their users have graciously "donated" to them, could use this AI algorithm or something similar to pick, or construct, the most "professional" looking image of anything they can identify from their vast library, and sell it or license it to media / content publishers for a relatively low cost. Such could be but one more competitor to any professional landscape / scenery photographers out there.
It's true that a phone today can produce a photo that's technically pretty good. I've got a friend who's had shows of iPhone photos, and won numerous awards with them.
This doesn't mean that someone armed with their phone will take good photos. There's still the matter of composition, as well as capturing "the decisive moment"[1].
The tool described in the OP provides an automated means of finding some worthwhile composition within a library of images, thus providing an aid to someone less skilled at composition - although it's not going to produce something from nothing. If the photographer failed to point the camera in the direction of the best photo, then they've failed.
But I can't see any way that automated tools can, after the fact, help us to capture that decisive moment.
Frames extracted from a 120 fps video might help with this. If it's possible to learn composition, I think that selection of the decisive moment is reachable.
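Mechanically, the selection step is the easy part. A minimal sketch, assuming you already have some `score` function that rates a frame (that function is entirely hypothetical here, and whether a learned scorer can actually recognize a decisive moment is the open question):

```python
def pick_decisive_frame(frames, score, fps=120.0):
    """Return (index, timestamp in seconds) of the highest-scoring frame.

    `frames` is any sequence of frames; `score` is a hypothetical callable
    that maps a frame to a number, higher meaning "better moment".
    """
    if not frames:
        raise ValueError("no frames to choose from")
    best_index = max(range(len(frames)), key=lambda i: score(frames[i]))
    return best_index, best_index / fps


# Toy usage: the "frames" are just precomputed scores, so score() is identity.
burst = [0.10, 0.72, 0.31, 0.64]
index, seconds = pick_decisive_frame(burst, score=lambda frame: frame)
print(index, round(seconds, 4))
```

All the difficulty lives inside `score`; the burst only guarantees that the moment is somewhere in the pile.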
Talking as a semi-pro (I've put in some money into cameras and lenses and spent a good bit of time on photo editing), this is a bit underwhelming. For landscapes (which this seemed to focus on), I've found that opening up the Windows photo editing programs and clicking 'enhance' or GIMP and clicking some equivalent already gets you most of the way there in terms of editing for aesthetic effect. The trickiest bit is deciding on the artistic merit of a particular crop or shot, and as indicated by the difference between the model's and photographer's opinion at the end of the paper, the model is not that great at it. Still, pretty cool that they did that analysis.
The other huge thing that is lacking is content. Image processing is only 10% of good photography. The rest is about conveying an idea to your audience. It can be humor, documenting an event, making people aware about societal problems, taking people to somewhere they would normally not be able to access, or seeing the world in a way that most people don't see.
Truly good photographers don't just produce beautiful photos; they produce meaningful photos.
The tech is great but I'm not a fan of the title of the article ;)
Exactly this. I don't want to say getting good mountain shots and applying some filters doesn't require experience and esthetic touch. Yet, for me as a consumer, the showcase looks just boring. Photography is about capturing the moment, not about applying some filters.
Photojournalism is almost always about the story and what the picture evokes. It could be a simple pic of a boy covered in dust, but made evocative because the dust is from a bomb explosion that killed his father and not from playing in the park.
But in case of a lot of landscape photography, the content is intrinsic to the picture itself—the photo does not get its value from anything external to it. Likewise for portraits.
Not sure I agree with that. There's good landscape/portrait photography, and there's really great landscape/portrait photography.
Great landscape photography takes you to an interesting location or interesting angle. A post-processing algorithm cannot take you somewhere interesting.
Good landscape photography makes you say, "Wow, that's beautiful!"
Great landscape photography makes you say, "Holy crap, what is that? Where is that?"
I significantly prefer the article's photos to those.
Buildings are boring, as is snow; a photo of a person, an animal, or shadows is not a landscape; and they are over-edited. Now, I suspect that because you look at a lot of photos, those pinged some novelty feeling in you, but that does not mean they are objectively good.
By comparison the article's photos (and those linked: http://google.github.io/creatism) generally evoked that "I wish I was there" feeling.
I see what you mean, though I do like buildings. (But I always had a soft spot for Hong Kong.)
I predict a similar divide will crop up in computer generated music: very soon normal people will prefer their pleasant sounds, even while more discerning (semi-) pros will deride their lack of artistic merit for a lot longer.
Computer generated art will shine as Gebrauchskunst first---eg get a 'professional level' soundtrack and video editing for your youtube video shot on a phone.
> Great landscape photography takes you to an interesting location or interesting angle. A post-processing algorithm cannot take you somewhere interesting.
This isn't just a post-processing algorithm. It also picks the place from Google Street View.
At least 4 of those photos are heavily post-processed. While they may look nice without the post processing, they would not look anywhere near as nice.
I struggled for years to make good photographs until I learned to post process well enough. For an average viewer, the difference is one of "side glance and ignore" and "wow, great photo".
It's instructive to look at photographers who post their originals. Even ones that appear to show an "interesting angle" can be quite distorted to show just that angle (as in if you travel there, you will never see the nice angles you saw in the photo).
> But in case of a lot of landscape photography, the content is intrinsic to the picture itself—the photo does not get its value from anything external to it. Likewise for portraits.
Any photo is a tiny square cut out of an infinite number of 360 degree spheres taken at a single point in time. For every photograph there will always be much more outside of the frame than in it.
That doesn't even begin to touch on things like lighting for portraits - but to think that even a landscape photo is anything but subjective expression is naive at best, dangerously so at worst.
I'm not arguing that machine learning can't take great photographs, just that there are great photographs (and so, also many not-so-great photographs).
I suppose one could get into deep philosophical discussions about if a machine intelligence can make art, if it does so, does it have to be regarded as a higher level of intelligence, or does the art have to be regarded as inferior art, etc. Interesting stuff, but not my point at this time.
Agreed completely. I'm an on-again/off-again hobbyist photographer, but my personality tends to focus on the technical and processing stuff most easily. As such, I completely recognize the limitations of this aspect of photography.
I personally enjoy learning how to use tools so I like messing with processing techniques. Since I'm very much an amateur, it's an effective way to improve the objective quality of an image. Still, it has made me very aware of the limitations when it comes to improving the subjective quality of my photos.
Essentially, I can save an underexposed RAW photo or use levels, curves, and masks to improve the quality and balance of lighting and color...but it will never fix a poorly composed shot or help me to better understand how to get a point across.
> Talking as a semi-pro (I've put in some money into cameras and lenses and spent a good bit of time on photo editing), this is a bit underwhelming. For landscapes (which this seemed to focus on), I've found that opening up the Windows photo editing programs and clicking 'enhance' or GIMP and clicking some equivalent already gets you most of the way there in terms of editing for aesthetic effect.
First, semi-pro means you've made money through photographs - not spent money.
Second, I heavily dispute the utility of those "Enhance" filters. They often do make a photo nicer, but they never make great photos.
If you're right, that's still a fantastic world which might open up. In my simple view, to make a good picture before the invention of photography, you had to be able to:
* Choose good scenes AND make them pretty AND paint well.
With photography, to make a good picture, you have to be able to:
* Choose good scenes AND make them pretty.
With this technology, to make a good picture, you might have to be able to:
* Choose good scenes.
Reducing the number of skills required massively increases the number of people who can do them all.
> I've found that opening up the Windows photo editing programs and clicking 'enhance' or GIMP and clicking some equivalent already gets you most of the way there in terms of editing for aesthetic effect.
As someone who's also spent a fair amount on gear, the post-processing part is the thing I've had most issues (read: spent least time with) to get right, and while I haven't tried all software, I haven't found a one-click solution of sufficient quality.
I've been hoping for a while that ML might lead to something like this. I would like if I could spend all that time taking photos, and no time fiddling with them afterwards.
Could you recommend any "enhance" feature that you've encountered that you think is okay?
Automatically selecting what portion to crop is impressive, but just slamming the saturation level to maximum and applying an HDR filter is the sign of "professional" photography rather than good photography.
As someone who lives in a relatively rural area with similar geography to much of the mountains and forests in these pictures I have noticed previously how professional pictures of these areas have a similar feeling of over saturating the emotion.
It's interesting to see algorithms catching up to being able to replicate this. However when you mention these kind of abilities to photographers, they get defensive, almost like you are threatening their identity by saying a computer can do it.
> they get defensive, almost like you are threatening their identity by saying a computer can do it.
It's a fairly common reaction from most people. Hayao Miyazaki, director of Spirited Away, got very upset after being shown a demo of AI doing animation:
I feel like that misses a lot of the context of the video. It was animating mutilated bodies in a prototype for a horror game. It's no wonder it was upsetting.
I admit I didn't read the linked article, but I did watch the video and speak Japanese pretty fluently.
His reaction of disgust isn't to the general idea of computer-generated animation, but to the specific animation he's being shown, which features a corpse thrashing around on the ground, and specifically in relation to having a severely disabled friend.
It's entirely possible he's not a fan of computer-generated animation in general, but that clip doesn't indicate much one way or another.
He doesn't seem like he feels threatened in his identity, if anything that blank, off-guard stare of the others seems that way. He basically asks them wtf is wrong with them, and they just don't know. They "didn't mean anything by it".
As a hobby photographer, it's always a challenge to strike a balance between getting the real feel of a place and overdoing it. A faithful read from the camera sensor may accurately measure the light levels, but it does not accurately share the feeling of being there.
Subjectively speaking, the world always seems more saturated through my eyes. Objectively speaking, the eye has something like 20 stops of dynamic range while paper has 9 at best...
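For a sense of scale: each stop is a doubling of light, so N stops correspond to a contrast ratio of 2^N : 1. A quick sketch using the rough figures quoted above (the 20 and 9 are the commenter's numbers, not measurements of mine):

```python
def contrast_ratio(stops):
    """Each stop doubles the light, so N stops span a 2**N : 1 ratio."""
    return 2 ** stops


eye = contrast_ratio(20)   # rough figure quoted for the eye
paper = contrast_ratio(9)  # figure quoted for paper
print(f"eye   ~ {eye:,}:1")           # eye   ~ 1,048,576:1
print(f"paper ~ {paper:,}:1")         # paper ~ 512:1
print(f"gap   = {eye // paper:,}x")   # gap   = 2,048x
```

So an 11-stop gap means the print has to squeeze the scene's brightness range by a factor of about two thousand, which is why some tone mapping is unavoidable.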
The over-saturated emotion comes, perhaps, because photographers are generally making art, not a scientific record. The approaches are extremely different. Art, in the commercial/social sense, needs to compete with all of the other art in the world; a lot of that art is impactful.
On the other hand, imaged thoughtfully and without enhancement, the mundane can be truly beautiful.
It's in any field where algorithms are catching up to humans. I feel like it's more of a testament to how humans are able to accomplish complex tasks, rather than an insult.
It is an interesting project and shows significant accomplishment. I'm not sold on the idea of "professional level" except in so far as people getting paid to make images. I am not sold because the little details of the images don't really hold up to close scrutiny (and I don't mean pixel peeping).
1. The diagonal lines in the clouds and the bright tree trunk at the extreme right of the first image are distractions that don't support the general aesthetic.
2. The bright linear object impinging on the right edge of the cow image and the bright patch of the partial face of the mountain on the extreme left. Probably the gravel at the left too since it does not really support the central theme.
3. The big black lump that obscures the 'corner' where the midground mountain meets the ground plane in the house image.
4. The minimal snow on the peaks in the snow capped mountain image is more documenting a crime scene than creating interest. I mean technically, yes there is snow and the claim that there was snow would probably stand up in a court of law, but it's not very interesting snow.
For me, it's the attention to detail that separates better than average snapshots from professional art. Or to put it another way, these are not the grade of images that a professional photographer would put in their portfolio. Even if they would get lots of likes on Facebook.
Again, it's an interesting project and a significant accomplishment. I just don't think the criteria by which images are being judged professional are adequate.
But the emphasis was on the processing, not the content of the images themselves, other than cropping. I don't think the intention at this point was to get into adding / removing actual physical elements from the scene.
Edit: I guess it depends if there were other images in their corpus that didn't have any flaws, but who's to know?
With the exception of the weakly snow-capped mountains (part of photography is showing up at the right time), different crops would have eliminated the conditions I described. But to unpack my critique a bit further, an unwillingness to make compromises like the weakly snow-capped mountains and paying the price of an extra week in the field waiting on foul weather is part of how professional photographers get professional shots...and part of how they maintain their professional reputation is not publishing shots that don't live up to a high standard.
Or to put it another way, compare the images in the blog post to Carter Gowl's http://www.gowlphoto.com. Gowl produces a couple of images a year and those are good enough to charge a couple of hundred dollars per print. Even if the images in the blog post were appropriately high resolution, I don't see them commanding a similar price. YMMV.
The blog post itself actually confirms what you are saying. They say they asked professional photographers to rate their creations. 40% of their creations were rated at semi-pro or pro level. Looking at their plot only very few of them got a solid 4.0 (professional rating). So, the headline is kind of misleading.
Out of curiosity, I looked at the linked photos of Jasper National Park (one of the locations) on Google maps. They make an interesting point of comparison, https://www.google.com/search?q=jasper+national+park
In our modern era of digital photography, 99% of what makes a great photograph is the content, not the processing. Being able to make a technically flawless image is interesting, but to compare it to serious professional photography misses the point almost completely.
To be fair, there's plenty of technically good but uninteresting professional photography, too.
I don't know why but the "professional" label on this really irritates me. I'm curious to know how the images that got graded on their "professional" scale were selected for inclusion in the sample. Surely by a human who judged them to be the best of many? I'd love to see the duds.
I hope that one day our driverless cars will alert us when there is a pretty view (or a rainbow) so we take a moment to look up from our phones. Every route can be a scenic route if you have an artistic eye.
Other algorithms could detect whether a picture your friend posted on their trip is the sort of thing you would like and automatically like it for you.
Interesting how hi-res the photos of a small section of Google Street Car photo can be compared to what users see online; here's an example from the linked article:
Maybe I wasn't clear enough; my point was that the hi-res photos used in the research appear to be of higher resolution than what the public version of Google Street Car zoom offers.
When a photographer takes or edits a picture, she doesn't need to predict or simulate her own reaction. There is no model or training necessary, because the real outcome is so easily accessible. However, she is only one person, and perhaps can't proxy well for a larger group.
The model has the reverse situation, of course: it cannot perfectly guess the emotional response for any one person, but it has access to a larger assortment of data.
In addition, in different contexts it may be easier/cheaper to place a machine vs. a human in a certain locale to get a picture.
If my theorizing makes any sense, it suggests that this technology would be useful in contexts where: the locale is hard to reach and the topic is likely to evoke a wide variety of emotional responses.
A photographer does have to predict their reaction. I'm not a pro but I take dozens of photos before one I really like. I've heard pros take hundreds or thousands for each one that they let other people see. And it makes sense to automate cases like this where you have 40,000 photos that you want to edit.
Retouching is another field to play with - I am experimenting with CNN/GANs to clone styles of retouchers I like. If you are a photographer, you know that most studio photos look very bland and retouching is what makes them pop; for that everyone has a different bag of tricks. If you use plugins like Portraiture or do basic manual frequency separation followed by curves and dodge/burn adjustments, you leave some imprint of your taste. This can be cloned using CNN/GANs pretty well; the main issue is to prevent spills of retouched area to areas you want to stay unaffected.
"Someday this technique might even help you to take better photos in the real world."
So what? Maybe I missed it, but what are some potentially meaningful applications of this technology? What motivated this to begin with? Or are these questions that we even bother asking anymore?
I remember the first time someone showed me the Snapchat app -- it would make them look like a cartoon dog, or all these other real-time overlays. I thought, 'jesus, so glad we're all getting advanced computer science degrees so we can work on utterly useless shit like this...'
Well I don't really get Instagram or Snapchat either, but they created something that has made tens of millions of people happy, and turned a thousand other people into millionaires (or billionaires!) basically overnight. By that measure, what have you or I ever done...?
I think this research is spot on, and can't wait to have it on my phone. And I love taking photos the old fashioned way, too.
This is amazing, but 'professional photographers' aren't really the best arbiters of what a 'good' photograph is. Also, training on national parks binds the results to a naturally bland subject, no pun intended. While an amazing achievement, nothing shown here demonstrates ability beyond a photographer's assistant/digital tech adjusting settings to a client's tastes in Capture One Pro. Jon Rafman's 9 Eyes project comes to mind as something that produced interesting photographs, as does the idea to find a more rigorous panel of 'experts' (e.g. MoMA), or training the model on streets/different locations than national parks.
This is cool but I really don't get why one could call this actually creating "Professional-Level" photographs. It's more like a very good auto-retouch. There's still the matter of someone actually being there, realizing it is a beautiful place, and dragging a large camera with them and waiting for the right light.
I think some of these results are really lovely, the one at Interlaken is a perfect travel photo. Would be interesting to see more types of work this could apply to.
Saw a few people talking about retouching and studio work - I do a lot of studio shoots and retouching on my own, and would be happy to help or participate in projects. Feel free to reach out.
The first thought after going through all these photos was: incredibly stilted. It's amazingly impressive, but the human photographer will always be able to capture the subtleties that AI will miss. But very cool nonetheless
Instead of augmented reality I would call this "distorted reality". People will prefer to visit places with Street View than being there. Real reality is uglier
Would be interesting to see how well you could train this kind of thing off of a large catalog of Lightroom edit data, to then mimic a specific editor's style.
You're implying that the source photo was taken by a professional photographer, but these are all clearly sourced from Google Maps. There was no pre-existing framing because it is a 360 photo, and the exposure is probably automatically set by the camera. I doubt the people hiking trails with a camera backpack really spend much time configuring their gear- because it would take too much time.
Lately, there has been lots of talk of deep learning being applied to create tools which can generate requirements, designs, and software code, create builds, test builds, and help with deploying builds to various environments. I'm excited for the future developments possible with ML.
If they're doing dodging/burning, then they could really use the processing on raw files instead of jpegs. The dynamic range is obviously limited when dodging/burning jpegs, as you can see from the flat clouds and blown highlights on the cows.
Great, now all we need is specialized machine learning inference accelerators in our mobile phones. I wonder if Google has even considered making a mobile TPU for its future Pixel phones.
Qualcomm recently added some improvements for running deep learning on mobile, and Google is also working on mobile nets that use fewer operations. I don't think you will see TPUs or something similar on mobile soon, but there are other small improvements being made now.
Unless someone can put a huge battery in a small mobile, forget about running big (and good) networks in mobile.
From the article the caption of the first picture was interesting: "A professional(?) photograph of Jasper National Park, Canada." Is that the open scene from The Shining? If so I wonder why the question mark, is Stanley Kubrick not a professional photographer?
[Disclaimer: I'm the second author of the paper]