
Is it possible to use the "deep dream" methods with a network trained for audio such as this? I wonder what that would sound like, e.g., beginning with a speech signal and enhancing with a network trained for music or vice versa.



We tried this, but with less success than WaveNet. https://wp.nyu.edu/ismir2016/wp-content/uploads/sites/2294/2...


There is a link to examples at the end


Interesting! So if I understand correctly, much of the noise in the generated audio is due to the noise in the learned filters?

I assume some regularization is added to the weights during training, say L1 or L2? If so, this is essentially equivalent to assuming the weight values are distributed i.i.d. Laplacian (for L1) or Gaussian (for L2). It seems you could learn less noisy filters by using a prior that assumes dependency between neighboring values within each filter, thereby enforcing smoothness or piecewise smoothness of each filter during training.
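To make the distinction concrete, here is a rough NumPy sketch, not anything taken from the linked paper. The function names and the toy 64-tap filter are my own assumptions for illustration. The i.i.d. penalties (L1, L2) look at each weight in isolation, so they cannot tell a smooth filter from a scrambled one, while a penalty on differences between adjacent taps can.

```python
import numpy as np

def l2_penalty(w):
    # Sum of squared weights: matches an i.i.d. Gaussian prior (up to scale).
    return np.sum(w ** 2)

def l1_penalty(w):
    # Sum of absolute weights: matches an i.i.d. Laplacian prior (up to scale).
    return np.sum(np.abs(w))

def smoothness_penalty(w):
    # Squared differences between neighboring taps along the last axis.
    # Hypothetical choice: a Gaussian prior on first differences, favoring smooth filters.
    return np.sum(np.diff(w, axis=-1) ** 2)

def tv_penalty(w):
    # Absolute differences between neighboring taps (total variation).
    # Hypothetical choice: favors piecewise-smooth filters.
    return np.sum(np.abs(np.diff(w, axis=-1)))

# Toy 1-D "filter": a smooth sinusoid, a noisy copy, and a shuffled copy.
rng = np.random.default_rng(0)
t = np.linspace(0.0, 1.0, 64)
smooth = np.sin(2 * np.pi * 3 * t)
noisy = smooth + 0.3 * rng.standard_normal(64)
shuffled = rng.permutation(smooth)

for name, w in [("smooth", smooth), ("noisy", noisy), ("shuffled", shuffled)]:
    print(f"{name:9s} L2={l2_penalty(w):8.2f}  smooth={smoothness_penalty(w):8.2f}")
```

The L2 value is identical for the smooth and shuffled filters, since shuffling only reorders the weights, whereas the difference-based penalty grows sharply. In training, one would add a term like `lam * smoothness_penalty(w)` to the loss, with `lam` a tuning parameter I am assuming here.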


Yes. Working on some different regularization techniques.


The piano stuff already seemed like 'dream music', as did the 'babble' examples. I found myself terribly frustrated by how short all those examples were. I wanted lots more :)



