Hacker News

> There must be a lot of genuinely cutting-edge research into machine learning, classification, NLP etc. going on in there.

Does the research get published or otherwise make it outside of the Googleplex? Serious question.




To be honest... kind of. They release papers, but most of the layers at the bottom are Google proprietary. Certainly the source code is. Hell, even most of the information about GFS is proprietary, despite GFS already having a successor. From an academic perspective, their disclosure is kind of crap.


Plenty of academic research papers don't have code either.

"Hadoop was derived from Google's MapReduce and Google File System (GFS) papers."

So apparently the disclosure was good enough to give you that :)
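For anyone who hasn't read the MapReduce paper, the programming model it describes is simple enough to sketch in a few lines. A toy word count in plain Python (the function names here are mine, not Hadoop's or Google's actual API):

```python
from collections import defaultdict
from itertools import chain

# Toy illustration of the MapReduce model: map emits (key, value)
# pairs, the framework groups them by key, and reduce folds each
# group into a result. Names are made up for illustration only.

def map_fn(doc):
    for word in doc.split():
        yield (word, 1)

def reduce_fn(word, counts):
    return (word, sum(counts))

def mapreduce(docs, map_fn, reduce_fn):
    groups = defaultdict(list)
    for key, value in chain.from_iterable(map_fn(d) for d in docs):
        groups[key].append(value)
    return dict(reduce_fn(k, vs) for k, vs in groups.items())

print(mapreduce(["a b a", "b c"], map_fn, reduce_fn))  # {'a': 2, 'b': 2, 'c': 1}
```

The whole point of the paper, of course, is doing that grouping and folding across thousands of machines with fault tolerance -- the sketch above only shows the programmer-facing contract.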

http://www.slideshare.net/jbellis/cassandra-open-source-bigt...

Shows that Cassandra is based on the Bigtable paper, among others (HBase, I believe, was as well).
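The data model the Bigtable paper describes is also easy to convey: a sparse, sorted map from (row key, column, timestamp) to a value, with multiple versions kept per cell. A toy Python sketch of that idea (mine, not Bigtable's or HBase's real API):

```python
import time

# Toy sketch of the Bigtable data model from the paper:
# (row, column, timestamp) -> value, newest version first.
# Purely illustrative; class and method names are made up.

class ToyTable:
    def __init__(self):
        # (row, column) -> list of (timestamp, value), newest first
        self.cells = {}

    def put(self, row, column, value, ts=None):
        ts = ts if ts is not None else time.time()
        versions = self.cells.setdefault((row, column), [])
        versions.append((ts, value))
        versions.sort(key=lambda tv: tv[0], reverse=True)

    def get(self, row, column):
        versions = self.cells.get((row, column))
        return versions[0][1] if versions else None  # newest version wins

t = ToyTable()
t.put("com.example/page", "contents", "<html>v1</html>", ts=1)
t.put("com.example/page", "contents", "<html>v2</html>", ts=2)
print(t.get("com.example/page", "contents"))  # <html>v2</html>
```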

Their disclosure is apparently good enough to get folks going ...

FWIW: in reading 20 years of compiler papers, I've had more luck getting the code behind papers from commercial companies than from academics. It's become somewhat better over time, but plenty of universities still seem completely unwilling to release the code behind their papers.


You're looking at this from an engineer's perspective. Yeah, great, we can build Cassandra -- who cares? Google papers have tons of performance benchmarks -- how is it scientific if none of those are reproducible?

Am I defending academics who can't or won't turn over all their data? No, of course not, but most of them don't try to pretend you can do science and still be proprietary.

P.S. I'm not talking about all companies. Intel and MSR turn over a bunch of stuff. I'm talking specifically about Google.


It's always been interesting to me that someone publishing a paper in computer science (especially someone working at a university) would not want to open-source the code that produced those results. It's the best kind of repeatable experiment.

I recently went looking for any code published by Hinton, since I've been doing some neural network research as a hobby, but was unable to find any at all.
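Tangentially, for hobby experiments you don't need anyone's published code to get started -- a single sigmoid neuron trained by gradient descent fits in a few lines. This is just a generic illustration, nothing to do with Hinton's actual work:

```python
import math
import random

# Minimal single sigmoid neuron trained by gradient descent to
# learn logical OR. Illustrative only; learning rate and epoch
# count are arbitrary choices, not from any paper.

random.seed(0)
w = [random.uniform(-1, 1) for _ in range(2)]
b = 0.0
data = [([0, 0], 0), ([0, 1], 1), ([1, 0], 1), ([1, 1], 1)]

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

for _ in range(5000):
    for x, target in data:
        out = sigmoid(w[0] * x[0] + w[1] * x[1] + b)
        grad = out - target  # d(cross-entropy loss)/d(pre-activation)
        w[0] -= 0.5 * grad * x[0]
        w[1] -= 0.5 * grad * x[1]
        b -= 0.5 * grad

print([round(sigmoid(w[0] * x[0] + w[1] * x[1] + b)) for x, _ in data])  # [0, 1, 1, 1]
```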


Speaking from personal experience, a lot of academics see programming as a hurdle -- an unpleasantness that must be overcome as quickly as possible before getting back to the really interesting bits.

This tends to result in very poor code, since the people who write it do not practice their craft. And very poor code isn't usually released because people are embarrassed.

I'm certainly generalizing here, but this seems to be the trend as I see it. It is a sorry state of affairs, and I try to do my best to encourage publishing code with the people I work with. (Code reviews are great.)


Actually, Alex Krizhevsky published some of the code they have been using @ http://code.google.com/p/cuda-convnet/

In my experience, Hinton and his team seem very open to sharing quite a bit of their code, data and results.


You'll probably have to contact him (or the graduate student who did the work, realistically), but if you do you'll probably get it.




http://research.google.com

For reference, I googled "research at google".




