Does Diversity Trump Ability? An Example of the Misuse of Mathematics [pdf] (ams.org)
111 points by todd8 on Sept 12, 2014 | 53 comments



I was shocked to read the original article. I can't believe that such overarching conclusions about human nature were drawn from a little toy box computational experiment. You might as well publish a psychology paper after a weekend playing The Sims.

http://vserver1.cscs.lsa.umich.edu/~spage/pnas.pdf


It also seems to defy common sense. How many top ACM programming teams randomly pick programmers from the college? How many great basketball teams are randomly picked from the pool of available players? There are few activities where excellence is easily measured in which randomly picking participants produces superior results.


It's important to realize that common sense is never a replacement for a proof or empirical result. That is, common sense is not sufficient to refute the claims made by the original paper.

The power of science is that it often defies reasoning and in so doing provides new insights into how things work.


But a clearly absurd or nonsensical result should be enough reason to double-check that the proof and results are actually valid.

Common sense may have a somewhat low weight for evaluating how probable some scientific result is, but it doesn't have zero weight. And neither does the possibility of experimental error.


You have it exactly backwards: all results should be double-checked and independently verified, especially the ones that conform to common sense, due to the pressure of confirmation bias.


From the original paper:

Lu Hong†‡§ and Scott E. Page¶ †Michigan Business School and ¶Complex Systems, University of Michigan, Ann Arbor, MI 48109-1234; and ‡Department of Finance, Loyola University, Chicago, IL 60611

Now that "Business school" part, that's a good reason to recheck the conclusions.

Heh, I'd like to pretend that a rebuttal to business school ideas will actually change them, but we all know how this will play out.

The idea of diversity is imposed from above, and no-one will care about this. Politics is the origin of the diversity idea, nothing else.


Are you being intentionally facetious? How could science defy "reasoning"? Is science some mystery cult that is shrouded in obscurantism and lack of intelligibility? Or poor choice of word perhaps? Science is most definitely about reasoning and reasoning properly.

You must realize that proof and empirical result are never a replacement for common or good sense. The very attitude that proof is self-sustaining is as silly as the idea of being able to pull yourself up by your own bootstraps. I encourage you to read Duhem's "German Science" in which he describes exactly the silliness of "science" when it is divorced from common sense. It's not magic, my friend. A man of little good sense will lack a sound place to begin reasoning from. Indeed, a man completely devoid of common sense can only begin by reasoning from arbitrary postulates. However, no amount of rigor can ever make up for the initially poorly chosen postulates. Theorems are only as true as their axioms. As Pascal once said, principles are intuited, propositions inferred.

Common sense is not infallible, of course, and certainly science can help us refine it. But common sense is the necessary starting point of all science (the ancient Greeks talk about doxa and endoxa). Without it, there would be no starting material for your empirical science or your proofs. There would be no interpreting of the results of your experiment or any understanding of what the experiment should consist of in the first place.

Of course, you must also be careful to understand that not everyone possesses a good deal of common sense even though they may appeal to it. Common sense is not the same as popular opinion.

Science isn't a magical thing that is somehow divorced from "everyday reality" or some gnostic religion that tells you how it really is without being somehow derived from experience.


To be fair, science does gain its (narrow, but useful) power from defying common sense.

(For instance, people like Aristotle IIRC explained why things fall to the ground: they go back to their natural places. Ok, that's common sense; it makes sense to enough people until they let themselves get puzzled by it. Galileo's one person who explored the weirdnesses it entails; and Newton couldn't really accept gravity's occult non-mechanical properties that the evidence pointed at.)

But I agree that's one thing, and the geek-culture's uncritical fascination with the scientific enterprise is yet another. (Which probably has a bad effect on who gets to become scientists.) Thanks for pointing out Duhem's book; will try to look at it.


> Are you being intentionally facetious? How could science defy "reasoning"?

Science often defies reasoning. Think about the history of our understanding of gravity. Newton was ridiculed when he proposed his model of gravity. And, Einstein was considered a quack for years until people understood what he was talking about.


There's a crucial difference between defying common belief, which great scientific breakthroughs do, and defying reasoning, which they don't do. Reasoning is what shows that the old theory did not have the explanatory power of the new theory.


Moneyball shows us that all that knowledge and experience used to pick a baseball team was wrong and that people should ignore their "common sense" (which is usually just another word for "cognitive bias") and should use hard data.

In the absence of hard data, just picking a team randomly from a pool of qualifying players seems to me like it would be as good as picking players based on the team-picker's experience of the game and "cultural fit" and other bullshit.


>Moneyball shows us that all that knowledge and experience used to pick a baseball team was wrong and that people should ignore their "common sense" (which is usually just another word for "cognitive bias") and should use hard data.

Nothing of the sort. Moneyball showed that an actuarial method was better than other methods in picking a baseball team. This does not mean that the previous methods were no better than random chance. In the case of baseball there was a large corpus of data which had good correlations to players' future performance. That's a rather unique situation.

In this case, the assumption that the previous methods are simply wrong rests on a much weaker basis: just the dataset from one particular toy challenge. A comparison with Moneyball is misleading.


Moneyball is a totally different matter. Moneyball is simply saying the metrics used in the past were wrong.

Let me ask you this: who do you think would win a chess match, the top 5 rated players in the world on one team, or five random adults chosen from the population of people who know the game of chess?

Furthermore, I'd say their theory need not be tested with a poor computational model. You can easily test it with real people and real problems to solve.

Lastly you add the caveat of "with no hard data". The paper makes the EXACT opposite claim, which is, if you HAVE hard data you should NOT pick the top performing individuals but rather sample randomly to create a team.


But you said "qualifying players", which implies that there is some qualifier. What is that qualifier?

And Moneyball didn't show us that all knowledge and experience was wrong. We didn't all of a sudden start recruiting marine biologists to become baseball players. The old "common sense" still stood. It's just that resources were reallocated in a very technical, wonkish way.

It's not like someone would look at you like you were crazy if you said "hey, we should pay more attention to on-base percentage than we do now, and maybe we could win with lower salaries". It was, at best, a minor revolution in baseball asset allocation.


As someone who knows it intimately, I would like to say that this kind of thing is endemic across the social science literature. What is frustrating is that only very few people seem to have the ability to understand why it is not sound.


Because I didn't have any money after high school, I decided to get a degree where I could work part time. I had been programming for quite a few years but had an interest in media and politics, so I started to study media sciences at the largest university here in Switzerland. After a month I realized that I had made a huge mistake. There was a class called 'statistics in social sciences' and the assistants had trouble with basic stuff like calculating the mean, median, average, etc. They even celebrated their mathematical illiteracy to a certain degree. On top of that, much of the writing in the standard textbooks could be reduced to meaningless tautologies once you deconstructed the complex phrases a bit. I dropped out very quickly after that. The whole experience left a very sour taste in my mouth.


Probably because the mathematicians don't seem to have the ability to explain why it is not sound without devolving into obscure symbols and jargon.


If this were HTML & written to the point instead of stuffed in a PDF & written up in this roundabout manner, it would be on top of HN with 100+ votes :) I read the whole PDF, but here's the TL;DR anyway -

Randomization improves algorithms. This is so obvious to CS folks that it's taught in basic CS101. The ones I'm familiar with: you pick a random pivot element for quicksort; or you draw a circle inside a square & pick n random points inside the square, and four times the number of points inside the circle will equal pi as n becomes large; or you resort to the Miller-Rabin test for primality when you are doing those RSA-calculator-type problems where you pick keys for encryption, which will randomly sample some number of possible witnesses and call the number prime if none turn out to be witnesses; or you use Monte Carlo methods to compute integrals for functions for which no closed-form formula exists... there are tons of examples where you introduce a little bit of randomness & it'll speed up your algorithm.
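
(To make the first of those concrete, here's a minimal sketch of quicksort with a random pivot, in Scala since that's what the snippet further down the thread uses; it's illustrative only, not code from the paper or the critique.)

    import scala.util.Random

    // Quicksort with a uniformly random pivot: randomising the pivot
    // avoids the quadratic worst case that a fixed choice (say, always
    // the first element) hits on already-sorted input.
    def quicksort(xs: Vector[Int]): Vector[Int] =
      if (xs.length <= 1) xs
      else {
        val pivot = xs(Random.nextInt(xs.length))
        quicksort(xs.filter(_ < pivot)) ++
          xs.filter(_ == pivot) ++
          quicksort(xs.filter(_ > pivot))
      }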

So this Page+Hong wrote a paper where they want you to randomly hire employees (who they call problem-solvers or agents), because diversity trumps ability?! So if you have 1000 applicants, instead of hiring based on some metric like ivy-league/test-scores/IQ/github/whatever, you just hire randomly, because the randomness introduces diversity which trumps ability?!! To prove this nonsensical point, they introduce an artificial math problem where agents proceeding randomly obtain the right answer, & not doing so gets you stuck. Ergo, diversity > ability. Sheesh.


This is in Notices of the AMS, which is about as mainstream (together with SIAM Review) as mathematical publications get.

> Randomization improves algorithms.

This is a contentious claim. Random choices are actually very rarely the best - they are often good enough and versatile, but they are rarely the best.

For example, consider Monte Carlo integration. You get O(N^{-1/2}) convergence. If you use a deterministic set of points explicitly designed to have low discrepancy (aka "Quasi-Monte Carlo"), you can get O(N^{-1 + logarithmic stuff}) convergence.
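
(As a rough sketch of that difference, my own toy comparison and not from the linked post: estimate the integral of x^2 on [0,1], true value 1/3, with pseudorandom points versus the base-2 van der Corput sequence, a simple low-discrepancy point set.)

    import scala.util.Random

    def f(x: Double): Double = x * x

    // i-th term of the base-2 van der Corput sequence: the bits of i
    // reflected about the radix point (1 -> 0.5, 2 -> 0.25, 3 -> 0.75, ...).
    def vanDerCorput(i: Int): Double = {
      var n = i; var denom = 1.0; var acc = 0.0
      while (n > 0) { denom *= 2; acc += (n % 2) / denom; n /= 2 }
      acc
    }

    val n = 100000
    val mc  = (1 to n).map(_ => f(Random.nextDouble())).sum / n   // plain Monte Carlo
    val qmc = (1 to n).map(i => f(vanDerCorput(i))).sum / n       // quasi-Monte Carlo
    // Both approach 1/3; the quasi-Monte Carlo error typically shrinks
    // more like 1/N than 1/sqrt(N) for a smooth integrand like this.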

http://www.chrisstucchio.com/blog/2014/adversarial_bandit_is...

Eliezer Yudkowsky also wrote a great critique of this issue, though I can't find it right now.


I believe the article you're referring to is: http://lesswrong.com/lw/vp/worse_than_random/

Choice quote: > As a general principle, on any problem for which you know that a particular unrandomized algorithm is unusually stupid - so that a randomized algorithm seems wiser - you should be able to use the same knowledge to produce a superior derandomized algorithm.

An interesting counterexample to this, an example which to me illustrates a very powerful aspect of randomness, is the monte-carlo revolution in computer Go (AI for the ancient Asian board game). For years, computer progress stagnated, as the techniques were mostly focused on encoding human knowledge into code. While most of this human knowledge is generally correct, it introduces significant bias in the way positions are evaluated. The way these rules-of-thumb interact is very hard to predict, and tree search algorithms are quite good at finding positions that are incorrectly evaluated. Because of this, the worst-case behavior of an evaluation function is much more important than its best-case or average behavior.

Computers started making lots of progress when a new technique was used: rather than try to evaluate positions by a large set of heuristics, they were evaluated by playing random games. Go positions are quite hard to evaluate from simple rules of thumb, and this random game approach gives a much more balanced, long-term view of positions. And most importantly, there is much less bias.

Obviously, an entirely random game, with a uniform distribution over possible moves, is easy to improve upon. But computer go programmers noticed an interesting phenomenon: while certain types of knowledge incorporated into the random move distribution (to make it "more intelligent", as judged by a human) were helpful, others were not (even after taking into account the computational cost of adding the knowledge), and it wasn't always clear why. The same observation about heuristic evaluation noted above applied: having a balanced distribution of move choices, with a reasonable probabilistic lower bound of effectiveness, is more important than making an intelligent choice that is usually correct, but has unpredictable, extreme worst-case performance.

So we see that randomness does have an important property: it avoids the downside of "knowledge" that generally seems correct but can go horribly wrong in unexpected ways.
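
(For readers unfamiliar with the technique, a bare-bones sketch of the idea in Scala; the GameState trait and its members are hypothetical placeholders, not code from any real Go engine.)

    import scala.util.Random

    // Hypothetical game interface; a real engine would put concrete
    // board, move and scoring logic behind these methods.
    trait GameState {
      def isTerminal: Boolean
      def legalMoves: Seq[GameState]   // successor positions
      def weWin: Boolean               // only meaningful when isTerminal
    }

    // Play one game to the end, choosing uniformly random legal moves.
    @scala.annotation.tailrec
    def randomPlayout(s: GameState): Boolean =
      if (s.isTerminal) s.weWin
      else randomPlayout(s.legalMoves(Random.nextInt(s.legalMoves.size)))

    // Evaluate a position as its win rate over many random playouts,
    // instead of scoring it with hand-tuned heuristics.
    def monteCarloValue(s: GameState, playouts: Int = 1000): Double =
      (1 to playouts).count(_ => randomPlayout(s)).toDouble / playouts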

I don't know of a good writeup of this phenomenon. My understanding of it is mostly assembled from following informal discussions on the computer-go mailing list for several years. In a quick search through my gmail archives I can't find much on the subject, but here's an interesting post about related topics in the computer chess world (that incidentally doesn't talk about randomness, but illustrates well the benefits of avoiding bias): http://www.talkchess.com/forum/viewtopic.php?topic_view=thre...


I don't know a lot about this Go example, but using a uniform distribution for the evaluation function sounds surprisingly similar to another phenomenon I observed.

Suppose you have a linear evaluation function - h(x) is the value of something. Suppose also the coefficients of h are all positive. Then you'll be right 75% of the time (averaged over all possible h, drawn uniformly from the unit simplex) if you just approximate h=[h1,h2,...] by u=[1,1,...,1].

http://www.chrisstucchio.com/blog/2014/equal_weights.html
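
(A quick simulation sketch of that figure, my own construction rather than anything from the linked post; I'm reading "right" as ranking a random pair of alternatives the same way the true weights do, and the exact number depends on the dimension and the distribution of the alternatives.)

    import scala.util.Random

    val d = 3              // number of features; the figure varies a bit with d
    val trials = 100000

    // Uniform sample from the unit simplex: normalised exponentials.
    def simplexPoint(): Array[Double] = {
      val e = Array.fill(d)(-math.log(Random.nextDouble()))
      val s = e.sum
      e.map(_ / s)
    }

    def dot(a: Array[Double], b: Array[Double]): Double =
      a.zip(b).map { case (p, q) => p * q }.sum

    val agree = (1 to trials).count { _ =>
      val h = simplexPoint()                        // "true" positive weights
      val x = Array.fill(d)(Random.nextDouble())    // two random alternatives
      val y = Array.fill(d)(Random.nextDouble())
      (dot(h, x) > dot(h, y)) == (x.sum > y.sum)    // equal weights u = [1,...,1]
    }
    println(agree.toDouble / trials)   // agreement rate, in the ballpark of the 75% above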

So I agree with this claim - uniform distributions are fairly robust to errors. But I don't think that's particularly related to randomness - Monte Carlo is only needed to integrate the distribution.

It's also worth noting that adversarial situations (like Go or Chess) are considerably different than most other cases. In a true adversarial problem, there is no probability distribution - the opponent is omnipotent. The purpose of randomness is simply to reduce the power of the adversary's intelligence - in a completely random world, intelligence is useless.


That's not too different from a pretty old observation in the chess world: the presence/absence of an evaluation term is more important than the weighting given to it.

> So I agree with this claim - uniform distributions are fairly robust to errors. But I don't think that's particularly related to randomness - Monte Carlo is only needed to integrate the distribution.

Ah, that's an interesting distinction, thanks. I'll have to think about this some more. But given a situation where exact integration is intractable (like chess or Go), I'm not too sure what the difference really is, because it is those cases (on first thought) where the uniform distribution is useful--if you can see to the end, you don't need to care about bias, right? I mean, "randomness" in the strictest sense is not really necessary; all these programs I speak of used deterministic pseudorandom generators of course. It's really just about ensuring lack of bias given finite sampling. I'm happy to hear your take on it though--you definitely seem to have a lot more knowledge of math/statistics/etc. than I do.

(That does remind me of another fascinating tidbit from the Go world: programmers noticed that using a low-quality PRNG, like libc's LCG rand(), produced significantly weaker players than more evenly-distributed PRNGs, even though it would seem that playing lots of random games of indeterminate length (with the PRNG called at least once per move) would not correlate at all with the PRNG's distribution.)

The adversarial-or-not issue is also good food for thought. I'm not convinced that it explains much in this case, though, since I believe most of these observations were made by playing computer-computer games with each program using very similar algorithms, or with old hand-tuned programs against the newer Monte-Carlo based programs.


> But given a situation where exact integration is intractable (like chess or Go), I'm not too sure what the difference really is, because it is those cases (on first thought) where the uniform distribution is useful--if you can see to the end, you don't need to care about bias, right?

Put it this way - suppose I can cook up a deterministic quadrature rule, e.g. quasi monte carlo or an asymptotic expansion. I assert that the quasi monte carlo will work just as well as monte carlo, probably better if convergence is faster.

If I'm right, this is a situation of "yay for uniform distributions". If I'm wrong, it's a "yay randomness" situation. It's nice to know which situation you are in - if I'm wrong, there is no point cooking up better deterministic quadrature rules.

Incidentally, LCG is known to be useless for monte carlo due to significant autocorrelation. So it's quite possible that people using LCG are incorrectly estimating their evaluation term.

Also for me, it's nice to know these things just for theoretical purposes and to enhance my understanding.


Yeah, but having some random noise still often helps.


> Randomization improves algorithms.

The better intuition is "randomisation fixes algorithms that get stuck for superficial reasons" - if you wrote that instead of the diversity thing as the intuitive conclusion of the Hong and Page theorem, you could get away with it in a theoretical CS survey paper or introductory textbook. The set of assumptions they make about the different \Phi would be fairly reasonable, for instance, if we were talking about a number of different flawed heuristics for a search problem.
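
(A toy illustration of that reading, not the Hong-Page setup itself: a greedy hill-climber started from one fixed point gets stuck on the nearest local peak of a bumpy landscape, while restarting the same climber from a few random points usually finds something better.)

    import scala.util.Random

    // A bumpy 1-D "solution landscape" with several local maxima.
    val landscape: Vector[Double] =
      Vector.tabulate(200)(x => math.sin(x / 7.0) + 0.3 * math.sin(x / 2.0))

    // Greedy hill-climbing: move to a strictly better neighbour until stuck.
    def hillClimb(start: Int): Int = {
      var i = start
      def better(j: Int) = j >= 0 && j < landscape.length && landscape(j) > landscape(i)
      while (better(i + 1) || better(i - 1))
        i = if (better(i + 1)) i + 1 else i - 1
      i
    }

    val fixedStart = hillClimb(0)                                              // one deterministic run
    val randomised = (1 to 20).map(_ => hillClimb(Random.nextInt(200))).maxBy(landscape)
    // landscape(randomised) is usually at least as good as landscape(fixedStart):
    // the random restarts fix the superficial reason (a bad starting point)
    // that the deterministic run got stuck.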


I have not read the circle-square theorem, but surely you are leaving something out. With a large n, many points will fall inside the circle, and four times that quantity can not logically get closer and closer to 3.14


You have to divide by n. Here:

    // 1 if a uniformly random point in the unit square falls inside the
    // inscribed circle of radius 0.5 centred at (0.5, 0.5), else 0.
    // The circle covers pi/4 of the square, so 4 * (fraction inside) -> pi.
    def pointInsideCircle: Int = {
      val (cx, cy) = (0.5, 0.5)
      val (x, y) = (math.random, math.random)
      if ((x - cx) * (x - cx) + (y - cy) * (y - cy) < 0.25) 1 else 0
    }

    (1 to 1000).map(_ => pointInsideCircle).sum * 4 / 1000.0
    // scala> 3.132
    (1 to 1000000).map(_ => pointInsideCircle).sum * 4 / 1000000.0
    // scala> 3.141612

So as n goes from 1000 to a million, your pi estimate improves from 3.13 to 3.1416. As Chris Stucchio pointed out elsewhere on this page, this converges O(N^{-1/2}), i.e. very slowly.

You can try Buffon's needle if you want something much faster.


It's 4 times the proportion of points that fall in the circle.


I also read the paper. I didn't have any trouble understanding it. The point is that it can be explained 10x better in 10x less space, just like you did.


Have you read the linked paper and the one it's discussing?

I find the critique vastly more readable.


I think that's a very important point.

Indeed, when you try to explain something simply to people, they often won't believe that it's really that simple, since there's all those symbols!


In finance it has long been known that diversity in a portfolio can often trump the performance/ability in a particular stock.

Turns out, you can see portfolio theory popping up outside of financial markets, not only in assembling a team, but also advertising campaigns, ant colonies and bacterial colonies.

And yes, I know that the linked article debunks a paper that abuses math... It's just that portfolio theory is at least an analogy to understand why diversity can trump ability. Within reason.


In my opinion, the reason "portfolio theory" works in the cases that you mentioned is that it must perform well across a time series within a highly dynamic environment. But if one is allowed to pick an optimal team for each discrete problem, then it will surely outperform.

In my mind, diversity will help when you have little to no a priori knowledge and therefore cannot predict what attributes you will need.


Have you never seen a software development project which ran into unanticipated problems?

What about "knowledge work" is not a highly dynamic environment?


Portfolio theory does not work here. Totally different concepts. MPT is about minimizing risk - not trumping the performance of a stock.


MPT is not about "minimizing the risk". It's about minimizing risk for a desired reward. In an abstract way of speaking, selecting team members is very similar.


There are also the empirical studies from MIT about group problem solving, which observed that diversity contributes more than the abilities of the individuals. It's not that the idea that diversity trumps ability in an unpredictable situation is wrong: it's that the original mathematical argument doesn't demonstrate it.


The response paper seems to harp a lot on how the circumstance that typically, N_1 and N >> k, precludes any realistic applicability of the original theorem -- but it admits a fairly self-consistent and altogether much more sinister interpretation if you call the different \Phi "groups" (or "races" for maximum creepiness). The assumptions then say something to the effect of "every group's approach to problem solving produces fundamentally predictable results, and no group can be the best at solving each type of problem", which is a meme that in some form has been floating around in diversity activism circles for a long time.

The "mathematical theorem" then simply captures the triviality that under this sort of worldview, you want your team to contain Zorblaxians (sorry, SMBC) because there are some Zorblaxian problems that Zorblaxians have a natural affinity towards, and nobody without such an affinity could possibly make progress on. Since this is not stated explicitly, the statement becomes invokable even in settings in which otherwise a large number of people would raise eyebrows at the smell of exoticist quackery that the idea of "different ways of knowing" exudes.


Your post captures the problem much better than the linked article.

I would summarize the problem as being that the result is tautological, but the gloss of mathematics gives it the appearance of not being tautological.

If people said "diverse teams are better because people from different backgrounds are good at solving different kinds of problems" then that would be accurate, but leave room for debate about whether the assumption is true. If they say "diverse teams are mathematically proven to be better", this would be inaccurate, but give great ammunition to argue that science supports progressive views.


Richard Feynman has some relevant thoughts on this subject: https://www.youtube.com/watch?v=IaO69CF5mbY


One of the authors, Scott Page, majored in math at the University of Michigan. He also teaches the course "Model Thinking" on Coursera: https://www.coursera.org/course/modelthinking. Perhaps one to avoid...


Hey, that's a great course. Please don't avoid it; I highly recommend signing up. The reason he wrote this famous paper & the book that evolved from it, in his own words - http://vserver1.cscs.lsa.umich.edu/~spage/thedifference_inte...

He says - "progress and innovation may depend less on lone thinkers with enormous IQs than on diverse people working together"

Not too sure about that. At all. Especially not in math, the subject he majored in.


> He says - "progress and innovation may depend less on lone thinkers with enormous IQs than on diverse people working together"

> Not too sure about that.

Why would that not be a thing?

I mean, I am not the smartest person on my team. The list of things I know very little about is long. But I occasionally ask questions that are really obvious to me that reorient the discussion because people didn't think of them--problems that were Too Obvious To See up close, you know? And, similarly, when working in stuff that I do know a lot about, I find myself ignoring things that are to me so obvious and basic that my brain just goes right by them.

Innovation isn't like Civilization, you aren't just generating lightbulbs. Diversity of thought process leads to some inefficiencies, but it helps you reach new global maxima.


I'm not so sure diverse (divergent?) approaches to a problem are always better than one leader's effort.

For one thing, stand-out genius or talent shows up randomly in populations; it's not really predictable when or where it will occur. But on the rare occasion when a natural leader does emerge, suppressing diversity (of problem-solving approaches) is likely the better strategy.

The true maxima of human effort have generally followed this pattern, born of singularity and not diversity. Once an innovation is widely enough known, it attracts a diverse range of followers who improve the idea and nurture its maturation. Diversity is useful for the "aftercare" of innovation, but not its creation.

The other major point is that value of diversity (however defined) vs. uniformity depends on context. In the context of one's home, there may be many ways to arrange furniture in a room, and with few exceptions, one way is as valid as another. By contrast, in a shared household over time diverse opinions are likely to converge to an arrangement using the space optimally. Diversity probably leads to a better workflow solution vs. one person's choices.

OTOH there is not a great diversity of "valid" ways to remove an inflamed appendix, that is, regardless of other considerations, a qualified surgeon is required and procedural diversity is constrained by the anatomical realities. This situation demands uniformity not diversity.

Finally, "diversity" has an indeterminate number of definitions, and applicability of the term is entirely dependent on context. On second thought, that would seem to drain it of specificity or meaning, on that ground, I should have a policy of using "diversity" carefully and sparingly.


If anyone is interested in better research on diversity and team outcomes, I recommend looking into research on "faultlines" in teams. It is an interesting way of operationalizing diversity and predicting its effects.

Link to abstract: http://amr.aom.org/content/23/2/325.short

I can't find a direct link to the PDF at the moment. That's the original article. There's been a lot of research since testing the theory, typically supporting it (though I'm not 100% up to date on this topic).


Am I missing something, or is the author of this pdf's proof equally flawed? They assume each function is idempotent and injective ... meaning that it is the identity function. And the proof doesn't follow. A much more natural way to have "fixed" the original proof would be to require that V: X -> R be injective.
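
(Spelling out the step being leaned on there: if \Phi is idempotent then \Phi(\Phi(x)) = \Phi(x) for every x, and injectivity lets you cancel the outer \Phi, leaving \Phi(x) = x, i.e. the identity.)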


Assuming by function you refer to the "agents'" \Phi, where do they make the assumption that it is injective? I see nothing to the effect in the text, and the functions in the counterexample on page 6 are not injective.


Oh, oops. I was reading their statement, "This interpretation is incorrect without the additional hypothesis that V(x) is a one-to-one function," as saying that all of the phi functions must be one-to-one. Looks like they made the same assumption that I would have.

This was bothering me though so thanks for your response.


There is one positive aspect to all this that I would like to point out.

Most social science does not use math, so it makes it hard to argue against ill-defined concepts and ideas. So the fact that these mistakes were found is actually a positive thing and shows one reason why math is useful. That's not to say math cannot be used to obscure arguments and assumptions, it can, but words can do that too.


That was surprisingly entertaining. Any recommendations for other academic papers in that vein?


> Page’s work on diversity has been cited by NASA, the US Geological Survey, and Lawrence Berkeley Labs, among many others

Oops.


Has there been any attempt at quantifying the "wishful thinking bias" in social science?


Here's something you might find interesting: http://vserver1.cscs.lsa.umich.edu/~crshalizi/weblog/698.htm...

There are also loads of replication projects, some currently ongoing, covering both experimental and statistical papers, with generally dismal results.



