There are techniques for compressing deep networks by pruning weak connections. I don't believe the author is using them, so the computational cost could likely be reduced by a factor of 10. Simple tweaks to the NN architecture might also help (was the author aiming for as small a network as possible to begin with?).
Actually, what's in the demo already includes pruning (through sparse matrices), and indeed it keeps just 1/10 of the weights non-zero. In practice it's not quite a 10x speedup because the network has to be a bit bigger to reach the same quality, but it's still a pretty significant improvement. Of course, the weights are pruned in 16x1 blocks to avoid hurting vectorization (see the first LPCNet paper and the WaveRNN paper for details).
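For the curious, the point of 16x1 block pruning is that each retained block still gives you a dense 16-wide inner loop, so SIMD isn't wasted on scattered single non-zeros. Here's a minimal sketch of a block-sparse matrix-vector product in that spirit (not the actual LPCNet code; the names and storage layout are made up for illustration):

    #define BLOCK 16

    /* y += W*x, where W is stored as a list of retained 16x1 blocks:
     *   blocks[i*BLOCK..] holds 16 consecutive weights (one column fragment)
     *   row_of[i] is the starting output row of block i (a multiple of 16)
     *   col_of[i] is the input column that block multiplies
     * Layout is hypothetical, for illustration only. */
    static void sparse_block_gemv(float *y, const float *blocks,
                                  const int *row_of, const int *col_of,
                                  int nb_blocks, const float *x)
    {
       for (int i = 0; i < nb_blocks; i++) {
          const float *w = &blocks[i * BLOCK];
          float xj = x[col_of[i]];
          float *yi = &y[row_of[i]];
          /* dense 16-wide inner loop: trivially vectorizable */
          for (int k = 0; k < BLOCK; k++)
             yi[k] += w[k] * xj;
       }
    }

With unstructured (element-wise) pruning you'd instead be doing one multiply-add per surviving weight through an index list, which mostly defeats the SIMD units even at the same sparsity level.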