Keras: Theano-Based Deep Learning Library (github.com/fchollet)
73 points by tekacs on March 28, 2015 | 7 comments



I know of 4 projects for deep learning based on Theano.

Keras, Blocks and Lasagne all seem to share the same goal of being libraries rather than frameworks: you can use just one part (e.g. a Layer implementation or a training algorithm) without having to pull in everything:

https://github.com/bartvm/blocks

https://github.com/benanne/Lasagne

Then there is pylearn2, which looks more like a framework and seems to be a good candidate for becoming the GPU-accelerated scikit-learn:

https://github.com/lisa-lab/pylearn2

I have started using Blocks and did some tests with pylearn2. Anybody with more experience want to share the strengths/weaknesses of each of these projects?


FWIW, we're using pylearn2 and Blocks at Ersatz Labs. Both use Theano. I'd recommend them, particularly if you are into Python.

It's hard to build a good NN framework: subtle math bugs can creep in, the field is changing quickly, and there are varied opinions on implementation details (some more valid than others). Even just learning a single framework requires a good deal of effort.
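
One common safeguard, regardless of framework, is a finite-difference gradient check, which catches exactly those subtle math bugs. A minimal sketch (f and grad_f are hypothetical stand-ins for your loss and its analytic gradient):

  import numpy as np

  def gradient_check(f, grad_f, x, eps=1e-5):
      # Compare the analytic gradient against centered finite
      # differences; a large relative error points at a math bug.
      numeric = np.zeros_like(x)
      for i in range(x.size):
          e = np.zeros_like(x)
          e.flat[i] = eps
          numeric.flat[i] = (f(x + e) - f(x - e)) / (2 * eps)
      analytic = grad_f(x)
      denom = np.abs(numeric) + np.abs(analytic) + 1e-12
      return np.max(np.abs(numeric - analytic) / denom)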


I know some more. Some of them are built as libraries; some are just code examples, though you could extract the relevant code from them.

* Pylearn2, LISA lab, http://deeplearning.net/software/pylearn2/

* LSTM, http://deeplearning.net/tutorial/lstm.html#lstm

* LSTM, https://github.com/skaae/nntools

* LSTM, https://github.com/JonathanRaiman/theano_lstm

* LSTM, https://github.com/mohammadpz/Recurrent-Neural-Networks

* LSTM, http://christianherta.de/lehre/dataScience/machineLearning/n...

* LSTM, https://gist.github.com/jpuigcerver/9358036

* LSTM + CTC, https://github.com/kastnerkyle/net

* Speech modeling, LSTM, https://github.com/kastnerkyle/speech_density

* FF + RNN, https://github.com/lmjohns3/theanets (formerly theano-nets)

* Speech, https://github.com/lmjohns3/arrnn-experiment/blob/master/tas...

* FF, https://github.com/benanne/Lasagne

* RNN, https://github.com/pascanur/trainingRNNs

* RNN, https://github.com/pascanur/GroundHog (Razvan Pascanu, KyungHyun Cho, Caglar Gulcehre)

* RNN + CTC, https://github.com/shawntan/rnn-experiment (Shawn Tan)

* RNN + CTC, https://github.com/shawntan/theano-ctc (Shawn Tan)

* RNN + CTC, https://github.com/rakeshvar/rnn_ctc

* RNN + CTC, OCR, https://github.com/rakeshvar/chamanti_ocr, https://github.com/rakeshvar/chamanti3_ocr

* RNN, https://github.com/gwtaylor/theano-rnn

* LSTM, RBM, DBN, https://github.com/kratarth1203/NeuralNet

* RBM, https://github.com/benanne/morb

* Q-learning, https://github.com/spragunr/deep_q_rl

* Deep Generative Models, https://github.com/dpkingma/nips14-ssl

* RNN, agents, “bricks”: https://github.com/bartvm/blocks

* NTM, https://github.com/shawntan/neural-turing-machines/

* RL + CNN, https://github.com/brian473/neural_rl

* DRAW RNN, https://github.com/jbornschein/draw

And this list is far from complete; there are countless more examples. Just search on GitHub. I just picked out the ones which interest me (those which at least have RNNs/LSTMs or some other interesting things).


Nice work! Since you mentioned you're looking for RNNs/LSTMs specifically: the implementation at https://github.com/skaae/nntools is an extension of Lasagne (which used to be called nntools) and will be merged into the library at some point. Hopefully in time for the first release, but we don't know yet if that will be feasible.


I would also like to mention my project, pydeeplearn. You can find it here: https://github.com/mihaelacr/pydeeplearn. I think its main advantage is that it uses Theano under the hood, but the user does not need to know Theano at all. The most complete implementations are those of RBMs and DBNs, but I also have CNNs. The library has support for adversarial training, as presented in the paper "Explaining and Harnessing Adversarial Examples" by Ian J. Goodfellow, Jonathon Shlens and Christian Szegedy. I recently also integrated Spearmint into the library, so hyperparameter optimization comes for free.
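
For a flavour of the adversarial training part, here is a minimal Theano sketch of the fast gradient sign method from that paper, applied to a toy softmax model (the model and variable names here are hypothetical, not pydeeplearn's actual API):

  import numpy as np
  import theano
  import theano.tensor as T

  # Toy softmax classifier and its training loss.
  x = T.matrix('x')
  y = T.ivector('y')
  W = theano.shared(np.random.randn(784, 10).astype('float32'))
  b = theano.shared(np.zeros(10, dtype='float32'))
  p = T.nnet.softmax(T.dot(x, W) + b)
  loss = T.nnet.categorical_crossentropy(p, y).mean()

  # Fast gradient sign method: nudge each input in the direction
  # that increases the loss.
  epsilon = 0.1
  x_adv = x + epsilon * T.sgn(T.grad(loss, x))
  make_adversarial = theano.function([x, y], x_adv)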


How does Keras compare to Lasagne [0], which is also Python/Theano based, and which was used with some impressive results [1]?

  [0] https://github.com/benanne/Lasagne
  [1] http://benanne.github.io/2015/03/17/plankton.html


One of the authors of Lasagne here! Lasagne is being built by a team of deep learning and music information retrieval researchers. Keras seems to share a lot of design goals with our project, but there are also some significant differences.

We both want to build something that's minimalistic, with a simple API, and that allows for fast prototyping of new models. Keras seems to be built 'on top of' Theano in the sense that it hides all the Theano code behind an API (which looks almost exactly like the Torch7 API).

Lasagne is built to work 'with' Theano instead. It does not try to hide the symbolic computation graph, because we believe that is where Theano's power comes from. The library provides a bunch of primitives (such as Layer classes) that make building and training neural networks a lot easier. We are also specifically aiming at extensibility: the code is readable and it's really easy to implement your own Layer classes.
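
To make that concrete: the Theano expressions Lasagne helps you build are just symbolic graphs that Theano can differentiate and compile for you. In plain Theano (no Lasagne involved), a hidden layer looks like this:

  import numpy as np
  import theano
  import theano.tensor as T

  # Build a symbolic graph; nothing is computed yet.
  x = T.matrix('x')
  W = theano.shared(np.random.randn(784, 256).astype('float32'), name='W')
  b = theano.shared(np.zeros(256, dtype='float32'), name='b')
  h = T.tanh(T.dot(x, W) + b)

  # Theano differentiates through the graph symbolically...
  g = T.grad(h.sum(), W)
  # ...and compiles it into a callable for CPU or GPU.
  f = theano.function([x], h)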

Another difference seems to be the way we interpret the concept of a 'layer': a Layer in Lasagne adheres as closely as possible to its definition in the literature. Keras (and Torch7) treat each 'operation' as a separate stage instead, so a typical fully connected layer has to be constructed as a cascade of a dot product and an elementwise nonlinearity.
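
Roughly, and going by the two APIs as they look right now (so take the exact signatures with a grain of salt), the same fully connected layer would be:

  # Keras: two stages, a dot product then a nonlinearity.
  from keras.models import Sequential
  from keras.layers.core import Dense, Activation

  model = Sequential()
  model.add(Dense(784, 256))
  model.add(Activation('tanh'))

  # Lasagne: a single Layer object, matching the textbook definition.
  import lasagne

  l_in = lasagne.layers.InputLayer(shape=(None, 784))
  l_hidden = lasagne.layers.DenseLayer(
      l_in, num_units=256, nonlinearity=lasagne.nonlinearities.tanh)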

Layers are also first-class citizens in Lasagne, and a model is usually referred to simply by its output layer or layers. There is no separate "Model" class because we want to keep the interface as small as possible and so far we've done fine without it. In Keras (and Torch7) the layers cannot function by themselves and need to be added to a model instance first.

For now, all Lasagne really does in the end is make it easier to construct Theano expressions - we don't have any tools yet for iterating through datasets, for example, but we do have plans in this direction. We plan to rely heavily on Python generators for this. The scikit-learn-like "model.fit(X, y)" paradigm, which Keras also seems to use, only really works for small datasets that fit in memory. For larger datasets, we believe generators are the way to go. Incidentally, nolearn ( https://github.com/dnouri/nolearn ) provides a wrapper for Lasagne models with a scikit-learn-like interface. We may also add this to the main library at some point.
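
As a sketch of what we have in mind for dataset iteration (plain Python, not an actual Lasagne API yet):

  import numpy as np

  def iterate_minibatches(X, y, batch_size):
      # Only one shuffled minibatch is materialized at a time, so X can
      # be a memory-mapped array or any indexable on-disk store.
      indices = np.arange(len(X))
      np.random.shuffle(indices)
      for start in range(0, len(indices) - batch_size + 1, batch_size):
          batch = indices[start:start + batch_size]
          yield X[batch], y[batch]

  # for X_batch, y_batch in iterate_minibatches(X_train, y_train, 128):
  #     train_fn(X_batch, y_batch)  # a compiled Theano training function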

Lasagne has not been released yet - the interface is not 100% stable, and documentation and tests are a work in progress (although both are progressing nicely). But a lot of people have started using it already; we've built up a nice user base, and a lot of people have started contributing code as well! We're currently aiming to put out the first release by the end of April.

A non-exhaustive list of our design goals for the library is in the README on our GitHub page: https://github.com/benanne/Lasagne



