Terminology note: there is a difference between Style Transfer and Fast Style Transfer.
Fast Style Transfer takes a very long time to train a style model (and the output models can be somewhat large; the pre-trained models in the Dropbox are 20MB), but once trained, the styles can be applied quickly. Fast Style Transfer is the technique used by Prisma/Facebook. This repo is the first I've seen that uses TensorFlow instead of lua/Torch dependency shenanigans, and as a result it should be much easier to set up. (This code release also beats Google's TF code release for their implementation: https://research.googleblog.com/2016/10/supercharging-style-... )
EDIT: Playing around with this repository, I can stylize a 1200x1200px image on a dual-core CPU on a 2013 rMBP in about 30 seconds.
Normal Style Transfer runs an ad-hoc optimization between the style image and the initial image; it produces only one stylized image at a time, but a single run is still faster overall than training a model for Fast Style Transfer (that said, this per-image optimization is infeasible on mobile devices).
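For anyone curious, here's a rough sketch of that difference in TF 1.x-style code. The transform_net and loss_fn arguments are placeholders for the trained transformation network and the VGG-based content/style losses, not this repo's actual API:

    import numpy as np
    import tensorflow as tf

    # Fast Style Transfer: one feed-forward pass through an already-trained network.
    def stylize_fast(image, checkpoint_path, transform_net):
        # image: float32 array of shape (1, H, W, 3)
        with tf.Graph().as_default(), tf.Session() as sess:
            img_ph = tf.placeholder(tf.float32, shape=image.shape)
            output = transform_net(img_ph)             # trained transformation network
            tf.train.Saver().restore(sess, checkpoint_path)
            return sess.run(output, feed_dict={img_ph: image})

    # Normal (Gatys-style) transfer: optimize the pixels of a single image directly.
    def stylize_slow(content, style, loss_fn, steps=500):
        with tf.Graph().as_default(), tf.Session() as sess:
            img_var = tf.Variable(content.astype(np.float32))   # the image itself is the variable
            loss = loss_fn(img_var, content, style)             # content + style losses on VGG features
            train_op = tf.train.AdamOptimizer(10.0).minimize(loss, var_list=[img_var])
            sess.run(tf.global_variables_initializer())
            for _ in range(steps):
                sess.run(train_op)
            return sess.run(img_var)

The first function is cheap enough to run per photo once the checkpoint exists; the second has to re-run the whole optimization for every photo, which is why it doesn't fit on a phone.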
Playing around with the tool a bit, supersampling seems to work pretty well to avoid those artifacts. (Thanks to the speed of Fast Style Transfer, this is feasible.)
EDIT: There's a catch. The checkpointed models are calibrated for smaller images, so you'll get better styles with smaller images. Larger images tend to just have the style's patterns repeated: https://twitter.com/minimaxir/status/793161755992002560
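If you want to try the supersampling trick, it's just a resize-stylize-resize sandwich. Something like the below, where stylize() is a stand-in for however you call the trained network (not this repo's actual interface):

    from PIL import Image

    def stylize_supersampled(in_path, out_path, stylize, factor=2):
        img = Image.open(in_path)
        w, h = img.size
        big = img.resize((w * factor, h * factor), Image.LANCZOS)
        styled = stylize(big)                                # run the network at high resolution
        styled.resize((w, h), Image.LANCZOS).save(out_path)  # downsampling averages away the speckle

Given the size calibration issue above, a small factor on a modest source image is probably the sweet spot before the repeated-pattern problem kicks in.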
I hadn't seen that, thanks for the link - it looks useful! Right now the kernel size is not divisible by the stride - I'll try changing the transformation network.
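For reference, the divisibility rule is easy to see in code. A rough illustration (layer shapes made up, not the actual transformation network):

    import tensorflow as tf

    x = tf.placeholder(tf.float32, [1, 64, 64, 128])

    # Prone to checkerboard artifacts: kernel 3, stride 2 (3 % 2 != 0),
    # so the transposed conv's overlaps are uneven.
    bad = tf.layers.conv2d_transpose(x, filters=64, kernel_size=3, strides=2, padding='same')

    # Less prone: kernel 4, stride 2 (4 % 2 == 0), overlaps are uniform.
    good = tf.layers.conv2d_transpose(x, filters=64, kernel_size=4, strides=2, padding='same')

    # Alternative: nearest-neighbor resize followed by a plain convolution.
    up = tf.image.resize_nearest_neighbor(x, [128, 128])
    alt = tf.layers.conv2d(up, filters=64, kernel_size=3, strides=1, padding='same')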
Utterly amazing what people are achieving with neural nets. The idea that 'style transfer' can be fit into an algorithm is slightly blowing my mind right now.
The jumping fox video does look a bit 'off' though, I think because the animation is kept the same and so it ends up looking too realistic for that style. Still, these are early days!
With the "Great Wave" painting as the network's style input, the limitations of the technique become more apparent. It's clear that a human painter would never render the Chicago skyline in this way: there are incongruent little waves on buildings' edges and all over the sky.
The antennas on top of the tallest tower are particularly revealing. The neural network just sees an area of higher local contrast, and has continued the same pattern it applied in the sky at the top-right of the antennas, only with more contrast. This doesn't make any sense for what's supposed to be a painting.
There's no intelligence here, "just" pattern matching that can do a brilliant illusion of creative variance on the right kind of content. ("Just" in quotes because it's still a great achievement.)
Now we need some kind of object recognition-enhanced version of style transfer that learns constraints on what "makes sense" given "sensible" labeled/captioned training examples!
It has been done with manual segmentation [1], and the results are mind-blowing. There is also a lot of work being done on segmentation with neural nets [2, 3], so I wouldn't be surprised to see someone implement this idea in the near future.
There has been a lot of cool image manipulation/transformation/synthesis work coming out over the past couple of years using NNs and such. I'm curious if any of these techniques have started worming their way into products? Will new effects in Photoshop (or whatever) get better over point releases as companies train up better and better NNs?
It would be nice if this algorithm could be created in a more localised form, so that you could take a photo and apply the effect with different intensities as if you were using a brush.
Merely because sometimes it gets things wrong, and I think a normal human could correct things if they had more control over the process.
Can surely be done with the original neural style transfer algorithm. You'd just have to use a soft mask over the picture when calculating the content loss function.
Don't know if there's an analogous way to do it for fast style transfer.
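A minimal sketch of the soft-mask idea for the original algorithm, assuming generated_features and content_features are feature maps from the same VGG layer and mask is a float32 tensor of shape (h, w) with values in [0, 1], resized to that layer's spatial size (names are made up, nothing from this repo):

    import tensorflow as tf

    def masked_content_loss(generated_features, content_features, mask):
        # Where mask ~ 1 the output is pulled toward the original photo;
        # where mask ~ 0 the content loss vanishes and the style dominates.
        mask = tf.expand_dims(mask, axis=-1)          # broadcast the mask over channels
        diff = tf.square(generated_features - content_features)
        return tf.reduce_sum(mask * diff) / tf.cast(tf.size(diff), tf.float32)

For fast style transfer, a cruder but probably workable option is to just alpha-blend the stylized output with the original photo under the same soft mask after the fact.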