Interesting! So if I understand correctly, much of the noise in the generated audio is due to the noise in the learned filters?

I assume some regularization is added to the weights during training, say L1 or L2? If this is the case, it is essentially equivalent to assuming the weight values are distributed i.i.d. Laplacian (for L1) or Gaussian (for L2). It seems you could learn less noisy filters by using a prior that assumes dependency between neighboring values within each filter, thereby encouraging smoothness or piecewise smoothness of each filter during training.
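To make the idea concrete, here is a minimal sketch of the kind of penalty I have in mind. Everything in it is an assumption on my part: the function names are hypothetical, the filters are toy 1-D lists, and I do not know what regularizer the actual model uses. It compares a plain L2 penalty with penalties on the differences between neighboring filter values (squared differences for smoothness, absolute differences for a piecewise-constant flavor of piecewise smoothness).

```python
# Hypothetical penalty functions for a 1-D filter given as a list of floats.
# None of these names come from the original model; they are illustrative only.

def l2_penalty(w):
    """Sum of squared weights (corresponds to an i.i.d. Gaussian prior)."""
    return sum(x * x for x in w)

def smoothness_penalty(w):
    """Sum of squared differences between neighbors (Gaussian prior on differences)."""
    return sum((b - a) ** 2 for a, b in zip(w, w[1:]))

def total_variation_penalty(w):
    """Sum of absolute differences between neighbors (Laplacian prior on differences)."""
    return sum(abs(b - a) for a, b in zip(w, w[1:]))

# Two toy filters with the same magnitudes, one smooth and one oscillating.
smooth = [0.0, 0.5, 1.0, 0.5, 0.0]
noisy = [0.0, -0.5, 1.0, -0.5, 0.0]

for name, w in [("smooth", smooth), ("noisy", noisy)]:
    print(name, l2_penalty(w), smoothness_penalty(w), total_variation_penalty(w))
```

The plain L2 penalty cannot tell the two filters apart, since it only looks at magnitudes, while both difference-based penalties charge the oscillating filter more. In training, one of these terms would be added to the loss with some weight, which would have to be tuned.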