‘2001: A Space Odyssey’ rendered in the style of Picasso (bhautikj.tumblr.com)
473 points by cjdulberger on June 10, 2016 | 94 comments



This is cool, but the frame-to-frame variance is distracting. I really want to see this reimplemented with temporal constraints à la this paper:

https://www.youtube.com/watch?v=Khuj4ASldmU


I hadn't seen that before. This looks great.


I wish I had time to set up a computer and compile and run this:

https://github.com/manuelruder/artistic-videos


I want to see them all now.


Oh wow, that looks amazing! I wonder how long it takes to transform those clips, though... Basic neural style transfer takes a few minutes on a standard gaming rig, so it probably takes a big machine or a few hours to do this.


Wow


I remember watching an interview with the creators of South Park in which they described the transition from animating using cardboard cutouts to a system with CorelDraw and other pieces of software which helped speed up the process. The bulk of the efficiency improvement came from carefully defining all the frequently used objects (characters, houses) once with movable components, and reusing those objects in the per-episode animation pipeline.

I can easily imagine an animation system like the one presented here enabling another massive improvement in animation efficiency. In the same way animation software allowed South Park to reuse pre-drawn objects, a deep learning system could enable South Park to carefully define the entire drawing style just once, then generate complete episodes based on simple storyboards and animation directives. Fortunately, South Park already has a significant amount of training data available, specifically every South Park episode yet produced.


An app that would let someone quickly animate some dialogue in the style of South Park/The Simpsons/Family Guy would be an instant best seller. It would become the "Supernormal" of the late 20-teens. You could even do it by only outputting the equivalent of Wally Wood's 22 panels that always work.

http://www.comicbookscriptarchive.com/archive/panel-1/panel-...


https://xkcd.com/1205/

This would be a great proof of concept; in fact, someone should work on this as a pet project. But I doubt it would be worthwhile as an investment for the rest of the show's run. I bet they don't spend a lot of time on the basic animation anymore, and instead focus mostly on one-off set pieces and visual effects.


Another interesting problem would be the generation of filler pictures (I don't know the correct term). Normally one person draws keyframes at a much lower framerate, and other animators then fill in the frames in between to increase the framerate.


That's the problem with animation using bitmaps (or physical artwork): the in-betweens have to be manually drawn. Hence much animation is outsourced to studios - typically in Korea, and occasionally Japan - consisting of armies of animators and artists.

With vector graphics - where the lines and fills are mathematical objects - automatic 'tweening' becomes possible. Anime Studio (http://my.smithmicro.com/anime-studio-2D-animation-software....) is the zenith of this tech; there's also Synfig (http://www.synfig.org/cms/) and CACANi (https://cacani.sg/).

In Anime Studio it's possible to add all kinds of effects (including filter effects and motion blur) to animations, and to mix pure vector animation with cutout, or even frame-by-frame, animation.
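For the curious, the mechanical core of that automatic 'tweening' is just interpolating a shape's control points between two keyframes. A minimal sketch (the shapes here are made up, and real tools layer easing curves, bone rigs and so on on top of this):

    # two keyframes of the same (hypothetical) mouth shape, as control points
    mouth_closed = [(0.0, 0.0), (10.0, 0.0), (10.0, 2.0), (0.0, 2.0)]
    mouth_open   = [(0.0, 0.0), (10.0, 0.0), (10.0, 8.0), (0.0, 8.0)]

    def tween(key_a, key_b, t):
        # linearly interpolate matching control points; t in [0, 1]
        return [(ax + (bx - ax) * t, ay + (by - ay) * t)
                for (ax, ay), (bx, by) in zip(key_a, key_b)]

    # nine in-between frames between the two keyframes
    inbetweens = [tween(mouth_closed, mouth_open, i / 10.0) for i in range(1, 10)]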


Human in-betweeners do a lot of work to take advantage of perceptual quirks of human vision, and these algorithms don't account for that (at least, last I knew).

Sometimes a perfect interpolation, or even something based on a physical model, doesn't feel right and isn't what's expected.


I used to play around with just such a tool back in the '80s and '90s, called Fantavision:

https://en.wikipedia.org/wiki/Fantavision


They're called inbetweens or tweens. According to a recent HN article, they are outsourced to South Korea. Generating them with an algorithm would be interesting, but often incorrect and against the artist's wishes. For example, objects in motion sometimes need to be blurred; some characters need to have ghost duplicates; shapes get distorted and exaggerated.

I think at this moment it is not possible to instruct an algorithm to take additional suggestions (artist ideas) into consideration when creating the output image.



Disney and American TV shows have pretty mechanical approaches to this, and you can usually tell which are the keyframes when the characters seem to settle into a pose before starting a new one. But not everyone draws that way - try and find them in End of Evangelion!

https://www.youtube.com/watch?v=iounOj1VRUU


It's called in-betweening. It's certainly an art form of its own; simple mechanical interpolation won't produce visually pleasing animation. Disney has some research on parameterizing in-betweening. These days 2D animation is being replaced with 3D, and the art of 2D in-betweening might be lost in the future. It'll be interesting if a deep network can learn animation styles from existing footage (e.g. Kanada-style).


"It means nothing to me. I have no opinion about it, and I don't care."

On the first moon landing, quoted in The New York Times (1969-07-21).

https://en.wikiquote.org/wiki/Pablo_Picasso

Curious about his feelings regarding this work. (I find it beautiful.)


Funny! That's exactly how I think of Picasso's work.


Have you seen Guernica in Madrid?


Or for that matter, his "training period" works. I'm not a fan of abstract art (Guernica, of course, is a symbol), and he was a very fine painter even setting aside the weird eyes-out-of-face stuff he did (which is not to say it's bad, but it's not my style).


I have. And I still echo joe's comment.


Jesus what a curmudgeon. I love how solidly he bought into the whole pretentious modern 'artiste' personality that is still being emulated today. I guess having a world weary 'fuck you' attitude is probably the social signaling you need to do to get your stuff into high-end galleries.


Did he buy in, or was he the source? A bit like how you can hear Chuck Yeager in every US airline pilot's calm drawl.


European bohemianism predates Picasso. I've always assumed that attitude was just another flavor of bohemianism.


Have you read other quotes about the moon landing? In terms of pretentiousness this is rather mild; he didn't even try to speak for "mankind" or "all the people on this earth".


Actually he was a communist and painted portraits of Stalin. I suspect this was just a case of sour grapes.


Haha the guy was a really top notch troll.


I have two opinions: 1) I don't think cubism transfers well into a motion picture format, 2) I think these experiments, as they are currently, attempt to merge two styles and end up with neither, and nothing novel in its place; there is little Kubrick or Picasso in the final piece.

I think it's superficial and doesn't do either source justice.


This is a fair comment and does not deserve to be down-voted. In fact I would go further: this is the most honest critique here, and it's not derisory to the CPUs that put together this video.

It is also okay to pay homage to a great work of art, to sample it, to parody it, to outright copy it, to even forge it or pass it off as one's own so long as that is artistically done. However I feel this video is more like passing something through an electronic mangle than art. It is craft rather than art, even if it is hi-tech craft.

Had this technique been applied to an original short film that had its own footage and own way of telling an actual story then we could have had a winner.


Specifically, what the algorithm seems to have done is leave the shapes of objects (furniture, hallways, etc) alone and decorate any blank space with random, Picasso-ish noise. It's chaotic, but isn't really Cubist. To be cubist it'd have to figure out some way to fuck with the rules of perspective much more than it does at the moment.


I think the motion aspect, at least what we see in the clip, forces a perspective that can't be avoided, breaking the style. Even if somebody did make a true cubist film it would be difficult to balance between intelligible and un-cubist or chaotic and true to style. Maybe. Hasn't been done yet so I don't know.


"It looks cool" is sufficient excuse for experimentation.


Yes, but it is no excuse to call it "in the style of Picasso" when it clearly isn't.


This is a technical gripe. If you showed it to people with no explanation and asked them whose style it was, a high percentage would say "Picasso."


And we can't see any faces in the demo video; faces are maybe the very signature of Picasso's style if you're not an expert.


this is such a quintessentially HN comment


Personally I'll just say the video's pretty fucking cool.

Referring to stepvhen's comment. I find it comical in all seriousness.

Even the most leisurely, casual or mundane topic, intended to be a refreshing change in colour (pun not intended considering the article, but I'll take it) and/or conversation on Hacker News, is subject to ridicule, assessment and biopsy.

It's why I like it here. Some of us are completely incapable of /not/ peer reviewing the shit out of everything. :)

Edit: words fail me, also spelt Stepvhen's handle wrong (sincerest apologies).


Really I was just trying to be polite in what I expected to be dissent. The video looks cool, but I don't believe the network used has any understanding (however you define it) of Picasso's style, as another commenter has stated. It's a little misleading, especially since there's a lot more going on in Picasso than geometric shapes and crazy colors.

Then again I might take things too seriously. There was only one other comment when I posted, I didn't know what the tone would end up being.

Incidentally, I chose this handle because others always spelled my name "Steven", when it's "Stephen".


> It's a little misleading, especially since there's a lot more going on in Picasso than geometric shapes and crazy colors.

> Then again I might take things too seriously.

if anything, too many people in this thread don't seem to be taking cubism and Picasso's style seriously either (signature faces? what ..?)

of course a deeplearning net can't actually do cubism.

if someone wrote a program to generate blocks of primary colours in pleasing ratios, really cleverly, it may look like Mondriaan to someone that has seen a couple of Mondriaan paintings. but everybody who knows what Mondriaan was trying to do, will instantly know that there's really no way today's computers could really perform the same process.


Picasso didn't invent one style - he invented lots of styles. So it's nonsense to say this video is "in the style of" Picasso.

It's more like an insta-Picasso plug-in for one particular form of abstraction.

It's interesting and unusual, and yes, it would be better with constraints.

I'm not sure I'd want to look at it on a big screen though.

>of course a deeplearning net can't actually do cubism.

One of the interesting things to fall out of this research is the realisation that a lot of art - even figurative art - is based on abstraction of visual invariants.

There's no reason that creative abstraction can't be automated to create new styles.

The difference when humans do it is the level of psychological insight and feel for what's visually important and interesting in a scene.

That can probably be automated too, but it's a very much harder problem.

The challenge for most developers in this space is that they have a much more superficial understanding of art (and music, and writing) than they believe they do, so a lot of content and detail that's important to experienced viewers gets ignored. The result is superficial lookalike output - pastiche.

Technically, the superficial output is an achievement in itself, but it's still a way short of being artistically innovative in its own right.


Hacker News as an audience is quite critical of critics and their criticisms, when the criticism is seen to be facultative.

I never took your comment to be impolite. As mentioned, as a community we're all quite guilty of being overly logical and serious. It's part of the charm.

I wasn't sure if I wanted to start collecting downvotes for mentioning the Phteven meme (we don't like memes here, do we?). But I had suspected that might be the basis for the spelling of your handle!

I'll down vote my own comments and see myself out...


"I am well informed and unimpressed."

Also, for fairness, do the same with my comment:

"Unlike this status oriented nerd, I can relate to you and the author on a human level."


I agree that Picasso's style is cubism and this process isn't able to replicate it. But I assume it's an early experiment to show what's possible now.


I thought that was awesome, then I came here to see about half of the comments full of people criticizing and dismissing it. This is a really neat experiment, just appreciate it. The author didn't claim that this is some groundbreaking, perfect piece of art or technology.


1. I think cubism can benefit from an explicit time dimension, though exactly how it should be done, I don't know.

2. I agree, as they are in this video, but I think cubism could be used to capture the experience of entering the monolith even better than Kubrick captured it.


Cubism is entirely about working with the time dimension, not against it, though. In early Cubist works the group took still life as their choice of subject. You could set it up, hack out some geometric forms, move the easel/rearrange the subject, add another layer/try reconciling forms, move the easel/rearrange again, and so on until the painting was finished. The Futurists/Vorticists were working toward the same goal, representing motion using what had been a medium for subjects that moved fairly slowly or not at all. If you look at Picasso's "The Weeping Woman", the perspective mismatch comes from moving.

I think the folly of the Cubists was to embrace the still life after the methods had been fleshed out. Most Cubist works are created from a handful of discrete images, and this destroys the aspect of motion in my opinion. I've been working to capture proper motion using GoPro POV footage from various sources; rather than pausing the video, I let the colours wing by and try to place them on the canvas. What I am choosing to perceive at the moment ends up represented (I avoid faces, thus I have floating forms without heads!), loose marks capture a door frame which has half of a tree-roadway. It's abstracted, but still based on a concrete sequence. The most effective are when a part of the frame is static throughout (a mounting bracket, bike, wiper blade, etc.) because that imagery comes through modified only by changing light, which provides a contrast against the jumble of forms and colours elsewhere.

The first step towards a Cubist film will involve multiple cameras shooting the same scene from multiple angles and recording the absolute time of the shot. An algorithm that takes n images as input and applies a deterministic transform to them (weighting certain images), trying to reconcile the overlaps, looking for prominent forms, picking/merging a colour to use, etc., eventually mapping down to a single image. Assuming this could be developed, the director would compose the scene from these multiple sources, perhaps giving an effect similar to [1]. The face of a lover is woven with images of a child and the image of meal prep on a sunny day. The conversation uses the word cheating in an innocent way and the focus snaps sharply back to the lover, where scenes of cheating with the other/secretive behaviours/worry and paranoia are gently woven overtop, eventually leaving the lover's voice as the only signifier of this conversation taking place.
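To make the algorithmic skeleton of that concrete: a naive sketch of the n-images-to-one mapping as a director-weighted blend of simultaneous camera angles (the views and weights are made up, and this is nowhere near actually reconciling forms):

    import numpy as np

    def composite(views, weights):
        # views: same-sized HxWx3 frames of one scene, shot from different
        # angles at the same absolute time; weights are the director's emphasis
        w = np.asarray(weights, dtype=float)
        w /= w.sum()
        return np.tensordot(w, np.stack(views), axes=1)  # -> one HxWx3 frame

    # e.g. the lover's face dominant, child and meal-prep footage woven underneath
    h, wd = 270, 480
    lover, child, kitchen = (np.random.rand(h, wd, 3) for _ in range(3))
    frame = composite([lover, child, kitchen], [0.6, 0.25, 0.15])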

But proper cubist imagery cannot come from a simple mapping of styles.

[1] https://www.youtube.com/watch?v=m4LLVIUh6ZM


I agree that neural style transfer doesn't mean much in terms of art, but it sure makes some cool looking stuff!

Also, as I've mentioned in my top-level comment, it's a great way to explore GPU programming and deep learning.


I remember when A Scanner Darkly came out there was a lot of talk about how they achieved the style of the film. Some of it was automated, but a lot still had to be done by hand. I wonder if, using deep learning systems, we could achieve the same effect that film had with nearly zero human intervention.

For those that haven't seen the movie, here's the trailer: https://www.youtube.com/watch?v=TY5PpGQ2OWY


I think you should know about 'Waking Life' (2001), the prototype for that stylistic approach:

https://www.youtube.com/watch?v=uk2DeTet98o


Possibly the finest painting software currently available is Synthetik's Studio Artist (http://synthetik.com/). Compared to Adobe's powerhouse software, it's relatively unknown, but that doesn't make it any less innovative.

It uses an algorithmic 'paint synthesizer' to generate brushes (with hundreds of presets) and auto-paint canvases, and is designed for animation (rotoscoping) as well as static artwork. The output can be reminiscent of the style of the movie 'A Scanner Darkly', but the software is hugely flexible. Here are a couple of rather amazing examples: http://studioartist.ning.com/video/auto-rotoscoped-dancers and http://studioartist.ning.com/video/dance-styles-animation-re...

Also, unlike most other 'painterly' software, the graphics are resolution independent - meaning that they can be scaled up to any size without loss of detail.


There is something that escapes me regarding this very cool neural style transfer technique. One would expect it to need at least three starting images: the one to transform, the one used as a source for the style, and a non-styled version of the style source. This last one should give the network hints on how to transform the unstyled version into the styled one. For example, what does a straight line end up being in the style? Or how is a colour gradient represented? Missing this, it seems that the neural network would have to recognize objects in the styled picture and derive the transformation applied based on prior knowledge of how they would normally look. But of course the NN is not advanced enough to do that. Can someone explain to me roughly how this works?


Disclaimer: I'm probably wrong about this; this is just how I believe "neural style transfer" works. I never tried this out and there are probably a lot of problems with my explanation.

I believe that this is done using Restricted Boltzmann Machines[1] trained on the stylised image.

Think of it as a network that receives an image on the input layer, sends it to one or more hidden layers with fewer nodes (like an auto-encoder), and then tries to reconstruct the image on the output nodes. This is like a lossy compressor-decompressor overfitted to the stylized image.

Now, just pass the real image as an input to your network and the output should be a stylized version of the input.

[1]http://deeplearning4j.org/restrictedboltzmannmachine


You're right that it needs additional information to distinguish style from content, but they get that from selected layers of established, pre-trained neural nets for image recognition. I don't entirely understand why it works myself, but it seems to.


From the way you described it, you could consider the pre-trained network to be the "missing image". It already has an idea of what images should look like, so when it detects an object, the "style" is what makes that object different from the stereotypical one it has already modeled.


Right. But it's more complicated than that: the choice of layer(s) to use matters a lot, and I have no idea why they do as they do. It seems to be a bit of dark magic to get it to work well - it takes a lot of aesthetic judgement too.

I think Alex J. Champandard's implementation is probably the best one out there right now. It has a ton of knobs to twist and is very fast.


I'm learning about ANNs at the moment, so I'm not really getting this but I'd like to.

Aside from the 'missing image' part, what is the fitness function here for training? How does the training process determine what a good image is, given there aren't many (any?) examples of a source image -> Picasso mapping?


You need a neural network that can understand the content and the style separately. It needs to interpret the content out of the frame and then make a new frame with that content but in a new style.
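Concretely, the Gatys et al. approach is to push the content image, the style image and the image being generated through a pre-trained recognition net (VGG), match deep-layer activations for content and per-layer Gram matrices (feature correlations) for style, and then optimise the generated pixels against the combined loss. A sketch of just the loss, with activations assumed to be precomputed; the layer names are the paper's, the weights are illustrative:

    import numpy as np

    def gram_matrix(feats):
        # feats: (channels, height, width) activations from one VGG layer
        c, h, w = feats.shape
        f = feats.reshape(c, h * w)
        return f @ f.T / (c * h * w)  # channel-to-channel correlations = "style"

    def transfer_loss(gen, content, style, alpha=1.0, beta=1e3):
        # gen / content / style: dicts of layer name -> activation array
        content_loss = np.mean((gen["conv4_2"] - content["conv4_2"]) ** 2)
        style_loss = sum(
            np.mean((gram_matrix(gen[l]) - gram_matrix(style[l])) ** 2)
            for l in ["conv1_1", "conv2_1", "conv3_1", "conv4_1", "conv5_1"])
        return alpha * content_loss + beta * style_loss

    # the generated image's pixels are then optimised (L-BFGS or gradient
    # descent) to minimise this loss, which is what takes minutes per frame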


It certainly has a wow factor, but once you get past the initial impression, it's interesting that the brain starts recognizing the content (motion of characters and objects) separately from the visual style, and even starts applying a negative cubism filter so that we don't actually see the visual style anymore. (In other words, the brain treats the applied style as noise.)

It could be a way to exploit the mismatch of content and style as a certain form of expression; but it may be more interesting if we can modify the temporal structure as well.


Like someone said about this on /r/programming:

>Pretty tight that computers can drop acid now.

Anyway, here's a direct link to the video for mobile users: https://vimeo.com/169187915


The big changes frame-to-frame certainly add to the "trippiness", but I'd love to see this where the value function (or whatever it's called for ML) prioritizes reducing the frame-to-frame diff, so that I could actually watch the full-length movie like this.
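That's roughly what the Ruder et al. work linked upthread does: warp the previous stylised frame along the original footage's optical flow and add a penalty for deviating from it wherever the flow is reliable. A sketch of just that extra term (the warping and the occlusion mask would come from an optical-flow estimator; the weighting is illustrative):

    import numpy as np

    def temporal_loss(stylised, warped_prev_stylised, mask):
        # stylised: current stylised frame being optimised, HxWx3
        # warped_prev_stylised: previous stylised frame warped forward along
        #   the optical flow of the original footage
        # mask: HxW, 1.0 where the flow is reliable (not occluded), 0.0 elsewhere
        diff = (stylised - warped_prev_stylised) ** 2
        return np.sum(mask[..., None] * diff) / max(np.sum(mask), 1.0)

    # the per-frame objective then becomes roughly:
    #   content_loss + style_loss + lambda_t * temporal_loss(...)
    # where lambda_t trades flicker reduction against stylistic freedom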


I am much more of an artist than a technology person and the rendering inconsistency the author refers to is one of the coolest aspects of the video. This is the kind of happy accident that gives work originality and makes it more than a slavish copy. Reminds me of Link Wray putting a pencil through the speaker cone of his amplifier.


I kind of want someone to do the same thing with a "NON-neural network" Picasso filter, like the ones in Photoshop and similar image editing programs. I want to compare how much the neural network's understanding of Picasso's style adds to the work (I imagine it's a lot, because this looks incredible).


I'd like to see different styles personally. This generation of deep style will keep too much of the shape to be a Picasso style.

There is no decomposing and superimposing of points of view because, well, the data to do it just isn't there.


A whole new debate about copyright is around the corner.


Surely this comes under fair use because of modification?


Yeah, it's hard to think of how this could not be an example of transformative use but I'm sure I'm not legally creative enough to assert that:

http://fairuse.stanford.edu/overview/fair-use/four-factors/#...

In a 1994 case, the Supreme Court emphasized this first factor as being a primary indicator of fair use. At issue is whether the material has been used to help create something new or merely copied verbatim into another work. When taking portions of copyrighted work, ask yourself the following questions:

Has the material you have taken from the original work been transformed by adding new expression or meaning? Was value added to the original by creating new information, new aesthetics, new insights, and understandings?


I see this as similar to colorizing a black and white movie, so maybe not.


Serious question: how is this different than one of the many photoshop filters that could be applied iteratively to each frame?


There likely isn't a Picasso Photoshop filter that can match the fidelity of the style transfer used in the video. (I checked; the best you can get is a macro to automate faux-cubism)


OK, if there were, would there be a difference?


"Poetry is what gets lost in translation", "Art is what gets lost in machine learning".

I think it's interesting that it's possible to create what are basically filters from existing images, but applying those filters to large numbers of images (like in this movie) quickly loses the novelty effect and becomes just as boring as any Photoshop or GIMP filter became in the '90s after seeing it three times.

When I look at Picasso's actual pictures, I am astonished and amazed with every new one I get to see. With these pictures, I get more and more bored with every additional image.


One of the most striking things missing is Picasso's play on perspective. Often his paintings would look at the subject though multiple points of view at the same time. Or he would break apart shapes, reorient them, and put them back together to get at some underlying idea.

Watching Dave jog along the perimeter of Discovery One in perfect perspective sort of undermines the whole effect. Even though the images are painted over with Picasso-like textures, colors, and shapes, it doesn't really _look_ like the real thing. That said, even if I think it falls short of capturing exactly what makes a Picasso a Picasso, I still think it's pretty cool.


Cool.

It needs some kind of averaging with nearby frames (or whatever), to avoid the constant flicker in areas of more or less solid color.


Given the movie being represented, the constant flicker is thematically more appropriate.


It isn't appropriate for cubism though, which is an attempt to render multiple perspectives to a single point of view. In this case time is the only extra dimension available, so it is a missed opportunity. But if the goal isn't a cubist rendering, just an interesting filter - mission accomplished.


Neural style transfer is extremely fun to play with.

If you have a system with a recent-ish graphics card (I'm doing fine with my GTX 970), put a linux on it and check out the many GitHub projects that implement this stuff (some of the tools will only work on linux).

It's a great way to start learning about deep learning and GPU-based computation, which are starting to look like very good things to have on your resume.

Plus, you get to make cool shit like this that you can actually show to your friends. I'm getting more interested in the text generation stuff as well - I'd love to make a Trump speech generator :-)


Can someone knowledgeable estimate how far we are from rendering this at 60 frames per second? Can't wait to try it as a post-processing layer in game development.


A paper called Precomputed Real-Time Texture Synthesis with Markovian Generative Adversarial Networks (http://arxiv.org/abs/1604.04382) uses a different approach with similar results and runs about 500 times faster, producing 0.25-megapixel images at 25 Hz. It also seems to result in less randomness from frame to frame: https://www.youtube.com/watch?v=PRD8LpPvdHI

Code at https://github.com/chuanli11/MGANs


Thanks a lot! That's exactly the kind of comment I love HN for.


Not happening in this lifetime at a meaningful image resolution. The style-transfer implementations require very large amounts of GPU processing.

If you pay attention to the video, the images are heavily upscaled.


Awesome. Just the black monolith should stay black :-)


You just pointed out the biggest limitation with this. Even if it can apply the style, that's all it is doing. It doesn't understand the context or content of the film. Everyone knows within the first fifteen minutes of the film how important and imposing the monolith is, but the algorithm doesn't. Even image recognition couldn't help you there. In visual art where filters and different media are used, it is rare to apply them non-selectively. Computers with cognition and creativity are a long, long time away. That is why I can't stand people who are scared of some Skynet scenario; even if AlphaGo mastered Go or Deep Blue mastered chess, real-life problems are not so strictly defined and require much more intelligence and insight.


"Oh my God, it's full of layers."


Would be interesting to see if they could reduce the temporal noise without compromising the overall effect.


I wonder if you could extract a 3D scene from the video and somehow style all the textures rather than styling the frame. Then reproject back and use that to style the frame.


Not trying to overstate my qualifications to make the following claim, but I'm pretty sure Kubrick would have hated this. And, as such, would have had it destroyed.


Is it just me, or have all forms of art simply melded with self-promotion? (Melded in the sense found in the movie "The Fly.")


I wonder how long it takes to render each frame.

Eventually with fast enough GPUs you could render a video game in this style, now that I would like to see.


This is amazing. That said it doesn't have the distorted perspective I think is a hallmark of Picasso's work.


His http://bhautikj.tumblr.com/tagged/drumpf is much better. Donald Drumpf as sausage.


So basically you take someone else's work, run it with some content (also someone else's work), post it, and wow, innovation.

Actually, in the last year myriad similar things have been created, and this is simply boring.

This is as interesting as a random tumblr reblog. May be curious, but lacks any sense of achievement or originality.


If you read up on research into innovation, you'll find that much of it does hinge on appropriating ideas from elsewhere. It's more like bacterial gene transfer than eukaryotic sexual reproduction, and very little like random point mutations.

John Holland, W. Brian Arthur, and Doyne Farmer (all associated with the Santa Fe Institute) have done much interesting work. Farmer talks of Theodore Wright and his studies of aircraft manufacturing improvements during WWII, though much of Farmer's discussion seems to borrow heavily from Wright's brother Sewall, a geneticist looking at genetic drift and fitness landscapes. Quincy Wright's study of war also seems apropos. Interesting family.


Wow, I can write a for loop over the frames of a film and call the Google open-sourced deep learning style transfer program.

INNOVATION!
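For what it's worth, that loop really is most of the plumbing. A minimal sketch, assuming some per-image style-transfer script (the script name and its flags are illustrative, not any particular repo's CLI):

    import glob
    import os
    import subprocess

    os.makedirs("frames", exist_ok=True)

    # split the film into frames
    subprocess.run(["ffmpeg", "-i", "2001.mp4", "frames/in_%05d.png"], check=True)

    # stylise each frame independently; this is where the GPU-days go
    for frame in sorted(glob.glob("frames/in_*.png")):
        subprocess.run(["python", "stylize.py",
                        "--content", frame,
                        "--style", "picasso.jpg",
                        "--output", frame.replace("in_", "out_")], check=True)

    # reassemble the stylised frames into a video
    subprocess.run(["ffmpeg", "-framerate", "24", "-i", "frames/out_%05d.png",
                    "stylized.mp4"], check=True)

The per-frame independence is exactly why the result flickers, hence the temporal-constraint work mentioned at the top of the thread.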



