
Nobody forced us to open-source Lasagne, so I think that remark was a bit unfair. If we really didn't care about anything but graduating, why would we have gone to the trouble of sharing the code in the first place?

But I do see your point. Google obviously has a lot more manpower to spend on this, so it might be a better bet in the long run.

It's also worth comparing this to a few similar projects that have been announced recently: MXNet (http://mxnet.readthedocs.org/en/latest/), Chainer (http://chainer.org/) and CGT (http://rll.berkeley.edu/cgt/). And Theano of course, which has been the flag-bearer of the computational graph approach in deep learning for many years.




Actually, Lasagne is pretty good; I wasn't targeting you (but then you're pretty good at winning Kaggle competitions, so perhaps there's a connection there, no?)...

I'm thinking mostly of Theano, which, from a performance standpoint, appears to have died the death of a thousand inexperienced cooks in the kitchen. The ~1000x performance regressions it incurs when a junior data scientist goes off the rails and ends up with a Python inner loop amidst GPU kernels are just depressing and seemingly unfixable. Hopefully, TensorFlow will be better, if only because it was written in a world now very aware of Pixel's Law.
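To make that concrete, here's a made-up toy sketch of the failure mode (names and sizes are mine, not from any real codebase): element-wise work driven from a Python loop, versus letting the graph compiler see the whole computation.

    import numpy as np
    import theano
    import theano.tensor as T

    # Anti-pattern: a scalar-sized graph driven from a Python inner loop.
    # Every iteration pays Python + kernel-launch overhead for one tiny op.
    a = T.scalar('a')
    b = T.scalar('b')
    mul = theano.function([a, b], a * b)

    def slow_dot(xv, wv):
        return sum(mul(xi, wi) for xi, wi in zip(xv, wv))

    # What it should be: keep the whole reduction inside one compiled graph.
    x = T.vector('x')
    w = T.vector('w')
    fast_dot = theano.function([x, w], T.dot(x, w))

    xv = np.random.rand(1 << 20).astype(theano.config.floatX)
    wv = np.random.rand(1 << 20).astype(theano.config.floatX)
    # slow_dot(xv, wv) crawls; fast_dot(xv, wv) is a single BLAS/GPU call.

That gap between the two versions is where the regressions I'm grumbling about come from.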

MXNet is awesome, but perhaps a little too parameter-servery for my personal tastes, and I'm now wondering what the point of CGT is, other than to be Coke to Google's Pepsi. I also think the whole deep learning framework business model just took a torpedo amidships (and not long after the layoffs at Ersatz).

Finally, I had never heard of Chainer until today, thanks! That said, without autograd functionality, the people I work with would probably stick with Caffe + cuDNN.


Pixel's Law? What's that? Can't find it on Google...


Interesting, it does appear to be gone...

Basically, old NVIDIA marketing material used to state that GPUs double in performance every year or so, whilst Moore's Law is slowing down with respect to CPU performance because clock speeds have been flat for a long time.

This isn't strictly true, because core counts, SIMD width and vector unit counts in CPUs have all been increasing. However, from the perspective of a single-threaded C application, it is indeed so. CUDA/OpenCL, OTOH, automagically subsume multi-core, multi-threading, and SIMD into the language itself, so the hypothetical "single-threaded" CUDA app just keeps getting better(tm).
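To illustrate what I mean by "subsumed into the language" (a rough sketch in Python via Numba's CUDA target rather than CUDA C proper, assuming Numba and a CUDA-capable GPU are available): the kernel below reads like one sequential loop, and the same code keeps scaling as GPUs gain more SMs; you only ever touch the launch configuration.

    from numba import cuda
    import numpy as np

    @cuda.jit
    def saxpy(a, x, y, out):
        # Written as if it were one sequential loop; the grid-stride pattern
        # lets the runtime spread iterations across however many SMs exist.
        start = cuda.grid(1)
        stride = cuda.gridsize(1)
        for i in range(start, x.shape[0], stride):
            out[i] = a * x[i] + y[i]

    n = 1 << 20
    x = np.ones(n, dtype=np.float32)
    y = np.ones(n, dtype=np.float32)
    out = np.empty(n, dtype=np.float32)
    saxpy[128, 256](2.0, x, y, out)  # 128 blocks x 256 threads; tune per GPU

The grid-stride loop is why the "single-threaded"-looking kernel keeps benefiting from new hardware without a rewrite.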

The reality, though (IMO of course), is that Intel promises and delivers backwards compatibility at the expense of free performance beer. In contrast, NVIDIA delivers performance at the expense of 100% backwards-compatibility beer for optimized code (but read each GPU's whitepaper and spend a week refactoring your code per GPU generation, and you get both; also IMO and, of course, experience). Of course, to be fair, if you refactor your code every time they improve AVX/SSE, CPUs are a lot mightier than what Python/JavaScript/R usually imply.



