My impression is that these kinds of models need a lot of data to train properly. I made a comment elsewhere in this thread, musing that tree ensemble models could do the same job, as a kind of low resolution quantized approximation. If you have experience in this area of research, I'd love to hear your thoughts on that.