Analysis of Over 2,000 Computer Science Professors at Top Universities (jeffhuang.com)
115 points by WestCoastJustin on May 26, 2014 | 31 comments



It strikes me that you could use faculty publications and the PageRank algorithm to assign a score to each faculty member. Then you could actually find out which university has the strongest CS department. You could maybe extend this to other fields to rank universities for each major, and replace things like Peterson's guide and other university ranking systems, which are rather arbitrary.



The h-index seems like a much weaker metric than PageRank. For instance, consider 10 professors who decide to game it. They create 100 fake papers and cite each other in all of them. Now suddenly they have an h-index of 100!

PageRank is a recursive metric on a graph. In the case above, if those 10 professors don't have any other links pointing to them, they don't get ranked higher. PageRank was designed to eliminate scenarios like this.

It's amusing that Google Scholar doesn't use PageRank prominently instead of these weaker metrics.
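A rough sketch of the citation-ring scenario above, assuming a toy author-level citation graph. The node names (`c0`..`c9` for the colluders, `h0`..`h29` for everyone else) and the graph shape are made up for illustration, and this is the textbook power-iteration form of PageRank, not whatever any real service runs.

```python
def pagerank(out_links, damping=0.85, iterations=100):
    """Textbook power-iteration PageRank over a dict: node -> list of cited nodes."""
    nodes = list(out_links)
    n = len(nodes)
    rank = {v: 1.0 / n for v in nodes}
    for _ in range(iterations):
        # Rank held by nodes with no out-links is spread uniformly over all nodes.
        dangling = sum(rank[v] for v in nodes if not out_links[v])
        new_rank = {v: (1.0 - damping) / n + damping * dangling / n for v in nodes}
        for v in nodes:
            for w in out_links[v]:
                new_rank[w] += damping * rank[v] / len(out_links[v])
        rank = new_rank
    return rank


# Hypothetical graph: 10 colluders who only cite each other, plus 30 other
# authors, 29 of whom cite h0 (h0 cites h1 so that no node is dangling).
colluders = [f"c{i}" for i in range(10)]
others = [f"h{i}" for i in range(30)]

graph = {c: [d for d in colluders if d != c] for c in colluders}
graph["h0"] = ["h1"]
for h in others[1:]:
    graph[h] = ["h0"]

ranks = pagerank(graph)
print("colluder c0:", round(ranks["c0"], 4))
print("well-cited h0:", round(ranks["h0"], 4))
print("uniform share 1/N:", round(1 / len(graph), 4))
```

In this toy graph each colluder ends up with exactly the uniform share 1/N, because a closed ring with no inbound links only recirculates the rank it started with, while h0 ends up far above it. Note the limit of the argument: the ring still lifts its members above authors nobody cites at all, so it is not punished, it just cannot inflate itself by adding more internal citations.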


Aren't there areas of scientific research where people doing genuinely valuable research don't get cited at all except within a closed community? I'd imagine e.g. that a handful of researchers all studying some obscure animal species would have a high h-index but a very low PageRank.


An experiment that did just that a few years ago:

"Manipulating Google Scholar Citations and Google Scholar Metrics: simple, easy and tempting" by Delgado Lopez-Cozar, et al.

http://arxiv.org/abs/1212.0638

[edit: grammar]


"Weaker" metrics can be useful because you can understand them better. If I tell you the PageRank of an author is 3.452, that means nothing to me. If I tell you somebody's h-index is 18, that means they published 18 papers that got at least 18 citations each. The h-index is useful for comparing people in the same field. There is also a metric for journals called the impact factor.

http://en.wikipedia.org/wiki/Impact_factor
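For anyone who wants the h-index definition above in executable form, here is a small sketch. The function name `h_index` and the citation counts are made up for illustration.

```python
def h_index(citations):
    """Largest h such that at least h papers have at least h citations each."""
    ranked = sorted(citations, reverse=True)
    h = 0
    for position, count in enumerate(ranked, start=1):
        if count >= position:
            h = position
        else:
            break
    return h


# Hypothetical citation counts, not real data.
print(h_index([18] * 18))       # 18 papers with 18 citations each -> 18
print(h_index([50, 10, 3, 1]))  # one big hit does not help much -> 3
```

The second call shows why the number is easy to read: a single paper with 50 citations still leaves the author at 3, because only three papers have at least three citations.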


You can even do something similar at the city level and see which city is the most influential in Science (or physics in this case): http://www.nature.com/srep/2013/130410/srep01640/full/srep01...


I wrote a quick script to generate a list similar to (but not as deep or well-annotated as) Prof Palsberg's.

https://gist.github.com/shazeline/d9881d06be31a59a93d3

It's a BFS for Google Scholar seeded with Herbert Simon.
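For readers who don't want to open the gist, the general shape of a seeded BFS looks roughly like this. This is not the gist's code: it walks a hypothetical in-memory adjacency dict instead of fetching anything from Google Scholar, and every name except the seed is a placeholder.

```python
from collections import deque


def bfs_order(neighbors, seed, max_nodes=None):
    """Return nodes in breadth-first order starting from seed."""
    seen = {seed}
    order = []
    queue = deque([seed])
    while queue:
        current = queue.popleft()
        order.append(current)
        if max_nodes is not None and len(order) >= max_nodes:
            break
        for nxt in neighbors.get(current, []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return order


# Hypothetical link graph; "author_*" entries are placeholders, not real people.
links = {
    "Herbert Simon": ["author_a", "author_b"],
    "author_a": ["author_c", "Herbert Simon"],
    "author_b": ["author_c", "author_d"],
    "author_c": [],
    "author_d": ["author_a"],
}

print(bfs_order(links, "Herbert Simon"))
print(bfs_order(links, "Herbert Simon", max_nodes=3))
```

The `max_nodes` cap is an assumption on my part; any real crawl would need some limit like it, since the frontier grows quickly.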


PageRank was originally designed for this purpose.


Care to elaborate?


The original funding Larry got was from the Digital Library Initiative. Then the web blew up. Maybe I shouldn't say PageRank was designed for citations, but that is where it draws its influence from. Things don't magically pop out of nowhere. But if you look at the PageRank algorithm, it looks like it is designed for paper citations. As a graduate student, you start at some place and you end up at a different place from where you intended to end up.

My point is that one shouldn't be surprised to see that PageRank would work well for paper citations.

http://www.nsf.gov/discoveries/disc_summ.jsp?cntn_id=100660

https://en.wikipedia.org/wiki/Page_rank


PageRank evolved from Kleinberg's Hubs and Authorities[1], which partly evolved from work in ranking research papers.

[1] http://en.wikipedia.org/wiki/HITS_algorithm


I think warrenmar is referring to the fact that a paper's impact can be measured by counting the number of times other authors cite the original paper.

IIRC, Larry and Sergey used that idea as the conceptual framework behind PageRank. Indeed, they were right: the relevance of a search result can be increased tremendously by indexing as many pages as possible and then calculating how many people are linking to said page.

They don't do it like that anymore but that's the basic idea.


I wonder what the equivalent of link farms and spam in such a system would be?


Self citation and publishing very similar articles multiple times. Both are already problems, since they work well to game many currently used citation metrics.


The Information Systems literature.


It is remarkable that the top 4 IITs combined make up 5.2% of the undergrad degrees of CS professors, right after MIT at 5.9% and ahead of Harvard at 3.1%. What is even more impressive is the fact that these four IITs admit around 140-150 students into the Computer Science program every year. It would be interesting to know how many undergrads graduate with CS/EE degrees from MIT and Harvard. The vision of Nehru (India's first prime minister) in creating the IITs was to produce the best technical minds for the development of a newly independent India. What happened instead was that the IITs attracted the best and brightest in India and became the number one exporter of top technical minds to the USA. This brain drain seems to have slowed down a bit in the last decade.


I suspect that many non-CS folks go on to CS PhD programs. I have heard that undergrad major at top Indian schools reflects signaling (certain majors in certain years get the better students, independent of student interest), which is why major isn't a reliable indicator. (Not so much in the US either.)

Having worked with many folks from IIT, I can say it is a fantastic source of raw intelligence. I view it as more similar to Caltech. Coming from a rich family, being a star athlete, or being a student body leader doesn't help. It's all about the academics.


I suspect the opposite is true. Mostly IIT CS/EE majors get into the top CS PhD programs. Many non-CS folks, sure, change professions to become programmers, but they rarely graduate with a stellar PhD from a top 5 (10?) computer science program. Most of these IIT grads have to apply to graduate school with 6 semesters of work. Without a CS/EE degree and a recommendation from a top IIT CS prof (one who has a good reputation for regularly sending top students to graduate programs, and whose students get pattern matched by the selection committees at top 10 CS schools), it is next to impossible for a non-CS IIT grad to compete with a typical top 10 graduating CS grad (9.4+ GPA, probably some Math/Phy/Chem olympiad medal or ICPC finalist/winner, almost perfect GRE scores, maybe 1-2 ACM conference papers). Please note that I emphasize the top 10 CS schools part quite a bit, because the game of finding an asst prof job is heavily rigged and top 10 CS school grads have a major advantage over others here.


Thank you for clarifying. In hindsight I think you're right. There is a higher burden on the foreign student. An MIT EE or applied math undergrad with a lot of CS classes can get into MIT's PhD program without the CS degree, while it's tougher for someone unknown to the faculty.


"It will be interesting to know how many undergrads graduate with CS/EE degrees from MIT...."

Before the dot-com crash, it was ~40% of the undergraduates for more than a couple of decades, ~400 students. Afterwards, enrollment dropped by a bit more than half, and as of late it has climbed back up a great deal.

Is that 140-150 students total for those 4 IITs, or for each?


Until the late 90s it was about 35 per IIT for CS, so about 140 for the top 4 IITs. After that it was about 50 per IIT, so 200 in all. Double these numbers if you want to include EE as well.


Analysis of Over 2,000 Computer Science Professors at Top US Universities

I was expecting universities from all around the world but this is just the US.


This is an interesting analysis and I can think of two opposing scenarios that might both be important (note that this is purely speculation based on observing people in industry):

1) Having more full professors might indicate a strength in the core math and theories behind CS. This field is much slower moving than the "technology du jour", so continuity and continued thought is important.

2) Associate and assistant professors (in theory younger and more attuned to the latest technology fads) might be what ties the almost real-time advances to the backing theories.

If my hypothesis holds true (and I have no data that proves it does), then balanced CS departments would be most effective. Many times having a broader demographic is advantageous, so this shouldn't be a surprise.


The number of full professors appears to have some correlation with the reputation of a CS department, but for state universities the political climate probably also plays a role: few if any departments at Florida have a high ratio of tenured professors. This is true throughout the system and probably reflects policy more than anything else (not that Florida is necessarily a top CS destination).


Over 100 professors just from the IITs. Incredible.


Yeah, I was thinking that if you added up all the IIT placements, they would be second only to MIT, but perhaps it's not fair to add them up like that, because each one is a separate institution.


And the table doesn't include Kharagpur (or Roorkee and Guwahati) because their percentages are smaller. It is conceivable that if they were all added up, they might even surpass MIT. But as you said, it's not fair.


The upward trends in the hiring graph do not show that "there is growth in computer science". If I am hired by a university in 2002, another in 2007 and another in 2012, I will only show up in the graph for 2012 because they are getting hiring dates from current professor data. So that graph does not say much about growth.
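To make the point concrete, here is a small sketch with made-up career histories (the `prof_*` names and years are hypothetical, not taken from the dataset). Actual hiring is flat at 4 hires per year, but counting only each professor's current position produces an upward trend.

```python
from collections import Counter

# Hypothetical careers: professor -> years they were hired into a faculty job.
careers = {
    "prof_a": [2002, 2007, 2012],
    "prof_b": [2002, 2012],
    "prof_c": [2007, 2012],
    "prof_d": [2002, 2007],
    "prof_e": [2002],
    "prof_f": [2007],
    "prof_g": [2012],
}


def all_hires_by_year(careers):
    """Count every hiring event, including jobs the professor later left."""
    return Counter(year for years in careers.values() for year in years)


def snapshot_hires_by_year(careers):
    """Count only the most recent hire, as a current-faculty snapshot would."""
    return Counter(max(years) for years in careers.values())


actual = all_hires_by_year(careers)
snapshot = snapshot_hires_by_year(careers)
for year in sorted(actual):
    print(year, "actual hires:", actual[year], "visible in snapshot:", snapshot[year])
```

Here the snapshot shows 1, 2 and 4 hires for 2002, 2007 and 2012 even though 4 people were hired in each of those years.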


Sad to see the University of Waterloo (Canada) not on the list. It would be nice to see its comparison to US schools.


And here I thought they were going to rank them according to the length and bushiness of their beards...



