Reading through documents like this really pains me because it seems like such interesting work and I immediately want to understand it better, but then I realize the time required to acquire the knowledge and experience necessary to understand and apply this technology is so great that it almost seems like a waste of time. After all, think of all the things one could build in the 2 full-time years it would take to fully comprehend all of this to the point where it's useful in any practical application.



Here's another way to see it: what have you been doing for the past two years? Now imagine you had started learning this about 2 years ago. You would have been done by now and ready to tackle some of the most interesting problems, instead of continuing to do the same boring stuff you had been doing for the past 2 years. In 2016, come back here and look at this comment again :).

PS: For people who are saying you can "apply" DNNs in a day or learn them from a Coursera course in 6 weeks - they are only very superficially right. Yes, anyone can build an ML model for sample training data using a tool, in the same sense that anyone can compile sample code and have a working app. The problem is that most models don't work as expected the first time. The challenge lies in debugging the model and fixing one of N possible causes to make it work. This is what working in ML is all about. It's like regular programming in that it takes years of experience to debug the code and make it work for your purpose. The added twist in ML is that the debugging is almost entirely statistical. When your model doesn't work, it doesn't work only in a statistical sense: your problem is essentially that the model doesn't give the expected answer, say, 12% of the time. For that 12% of the time, it doesn't fail because of some wrong "if" condition or misplaced subroutine call. The debugging is almost always statistical debugging - there are no breakpoints to put, no watches to set, not even exceptions to catch. So it takes a pretty solid background in statistics and probability to work effectively in ML. And yes, it would most likely take much more than 2 years.
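
To make the "statistical debugging" point concrete: instead of stepping through code, you typically slice the error rate across classes or data subsets to see where that 12% of failures concentrates. Here's a rough sketch of that kind of analysis in plain Python/numpy (the labels and predictions are just placeholders, not any particular model's output):

    import numpy as np

    # Hypothetical held-out labels and model predictions over 10 classes.
    y_true = np.random.randint(0, 10, size=1000)
    y_pred = np.random.randint(0, 10, size=1000)

    # Overall error rate: e.g. "the model is wrong 12% of the time".
    overall_error = np.mean(y_pred != y_true)
    print("overall error rate: %.3f" % overall_error)

    # Slice the errors by true class to see where the failures concentrate.
    for c in range(10):
        mask = (y_true == c)
        if mask.any():
            class_error = np.mean(y_pred[mask] != y_true[mask])
            print("class %d error rate: %.3f (n=%d)" % (c, class_error, mask.sum()))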


Truly understanding this to the point where you are "caught up" with the field may take 2 years, but one of the big blessings of deep learning is abstraction. You can go from very high-level "black box" approaches, simply using and following example code from Torch, Theano, or Caffe, all the way to nitty-gritty study of the details of various architectures, how to optimize them, and how to apply them.
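
For a flavor of what sits inside the black box once you do start digging, here is a toy sketch of a two-layer network's forward pass in plain numpy (my own illustration, not code from Torch, Theano, or Caffe):

    import numpy as np

    def forward(x, W1, b1, W2, b2):
        # Hidden layer: affine transform followed by a tanh nonlinearity.
        h = np.tanh(x.dot(W1) + b1)
        # Output layer: affine transform followed by a softmax over classes.
        scores = h.dot(W2) + b2
        exp = np.exp(scores - scores.max(axis=1, keepdims=True))
        return exp / exp.sum(axis=1, keepdims=True)

    rng = np.random.RandomState(0)
    x = rng.randn(5, 784)                  # 5 fake "images", 784 pixels each
    W1, b1 = 0.01 * rng.randn(784, 100), np.zeros(100)
    W2, b2 = 0.01 * rng.randn(100, 10), np.zeros(10)

    probs = forward(x, W1, b1, W2, b2)
    print(probs.shape)                     # (5, 10): class probabilities per example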

Watching videos of presentations and reading slides is often much easier than comprehending papers, though ultimately the paper should have much richer detail.

Personal anecdote: two years ago I was just starting to learn about these things, coming from an undergraduate degree in electrical engineering. Now I am in graduate school for deep learning and AI, working to push things forward one small step at a time. It is totally possible to learn this stuff with a reasonable amount of study, and there are more free resources than ever. Note that I had a full-time engineering job until 6 months ago... doing something totally different!


You should be complaining about the computing resources needed to train 24 hidden layers.


GPUs make a lot of this tractable. Nvidia is actually offering some free compute time for researchers:

http://www.nvidia.com/object/gpu-test-drive.html


It's not as complicated as it seems. Two full-time years might be required if you want to start a PhD program in it, but not for its applications.


That's pretty comforting to hear. Do you know of any resources off the top of your head to learn this type of thing?


Geoffrey Hinton (a notable deep learning researcher) did a Coursera course on neural networks a while ago. It's over, but you can still see the lectures, which are very good: https://www.coursera.org/course/neuralnets

Metacademy is also a very useful resource for anything machine learning: http://www.metacademy.org/


Torch7 is used by Facebook and Google, and is fast becoming the standard industrial neural nets library: http://torch.ch. Read the tutorials and you can get something working in a day.


"Used by Facebook and Google" ? Citation needed. AFAIK, at least Google has an internal homebrew solution, that automatically scales to large clusters.


Google DeepMind chiefly uses Torch7. I presume parts of Twitter also use it now since they acquired Clement Farabet's startup MadBits.

Facebook's AI research lab has contributed to the Torch7 project (which is unsurprising since it is led by Yann LeCun, and Torch7 was originally developed in his group at NYU).

I wouldn't go as far as to say it's "becoming the industry standard" though. Caffe and Theano are also very popular.


Here's an AMA with Yann LeCun that confirms they are using it: http://www.reddit.com/r/MachineLearning/comments/25lnbt/ama_...


I stand corrected. Thanks for the link. I had somehow missed that AMA; it's fascinating reading. (Hinton's too).


Facebook AI Research uses Torch heavily, and contributes improvements back to the open-source version.


There's not a lot of theoretical background needed beyond the basics of pattern recognition to get started understanding this stuff. Of course, pattern recognition requires some knowledge of probability, statistics, linear algebra, and vector calculus. There are books on pattern recognition that are rather friendly about this prerequisite knowledge, though.


I disagree. I believe that the foundation you will build in statistics, linear algebra, and other mathematics would be worth the effort for any computer scientist. You would enjoy the introductory Machine Learning courses on Udacity or Coursera.



