-Hacking data to reveal the answers to important difficult questions.
-Exploring contentious issues in a neutral & intelligent way.
-Great example of how to attract potential customers by providing valuable information that also gently demonstrates competence related to your service & differentiates your offering from competitors.
I enjoyed reading this, I learnt something, and I thought favourably of the website based on this. No idea how good they are, but that post was a good marketing lesson.
I don't think it is really as neutral as it seems. Match percentage is a very preliminary survey of how compatible two people are, and the article itself is fairly judgmental.
One possible explanation for the low Hindu-Hindu match is that there are many websites dedicated to specifically Hindu dating which means that the Hindus using OkCupid are those who are specifically not looking for a Hindu date.
For other religions, such sites aren't nearly as prevalent.
The matches are based on answers users give to questions (both site designed and user designed).
When the blogger says they don't match, it means:
- Person A: I like X
- Person B: I don't like people who like X
Where X could mean ice cream, god or tooth-brushing. It has nothing to do with people selecting the race or religion of the people they think they want to date.
Yeah, but if Hindu men like ice cream, and I don't want Hindu men, I might not like men who like ice cream either because a) it's what Hindu men do, or b) I don't like people who like ice cream to begin with, and I don't like Hindu men for that reason.
But is there anything you can replace "ice cream" with to make it a workable example? What _real_ questions are you likely to see this kind of Hindu-Hindu antimatching behaviour on?
Sorry, are you serious? Why would a Hindu who is pro-arranged marriage be on a dating website? Arranged marriages were a severe form of keeping the caste system intact, but that's not applicable now.
But then again your type of logic can probably explain why Islamic people aren't liked since everyone is anti-terrorism.
If I had to guess, you got down-modded because you were ignoring the premise of the above statement: that being a Hindu man might be correlated with liking ice cream. You are free to disagree with that, but you need to give a reason.
Of course, ice cream wasn't the point and other people gave better examples.
I was going to post a comment directly disagreeing with this, and citing the large number of Jewish dating sites, but these are really only used by people on the more religious end of the spectrum, so it is possible that the effect is more prevalent in the Hindu-Hindu match calculations than in Jew-Jew match calculations. I would posit, however, that this effect, along with the self-selection bias of using a multi-cultural dating website, is almost certainly the cause of the tendency for higher matches towards non-religious people.
Given that strongly religious people (of whatever religion) had a harder time matching, removing the most religious Jews would improve the compatibility of the remaining Jews.
JDate and similar sites aren't exactly religious. There are lots of secular Jews who would prefer to date or marry another Jew, for ethnic, cultural, or national reasons.
True, I think part of the confusion stems from the fact that you can have "secular Jews," since the Jewish way of life seems to be a bit more than just about religion (not saying the religion isn't an important part of it). If someone were to say "secular Christian" I'd look at them funny.
> ... which means that the Hindus using OkCupid are those who are specifically not looking for a Hindu date.
Or maybe they are specifically looking for a non-Hindu date. I mean that they might be looking for an uncompromising relationship, maybe easier with a person with different roots.
As a single male Asian I am praying that the first row isn't me. But odds are it is anyway :( It certainly conforms with stereotypes about Asian girls...
However, I think one critical component is missing (from what OKCupid posted, not from your analysis). Who sends the first message? For example, we are seeing a lot of green in the black female column. Does this mean that:
a) Black females reply back a lot?
b) Lots of men reply to black females?
These two results would mean entirely different things. The first means that black females aren't picky, the second that men like black females.
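To spell out why (a) and (b) are different measurements, here is a minimal Python sketch. The message log and the group labels "A" and "B" are invented for illustration; nothing here comes from OkCupid's data. The point is only that the same log gives two different numbers for one group, depending on whether that group is the recipient or the sender.

```python
# Hypothetical message log: (sender_group, recipient_group, got_reply)
messages = [
    ("A", "B", True),
    ("A", "B", True),
    ("A", "B", False),
    ("B", "A", False),
    ("B", "A", False),
    ("B", "A", True),
    ("B", "A", False),
]

def rate_group_replies(group):
    """Share of messages sent TO this group that the group answered (case a)."""
    outcomes = [replied for _, recipient, replied in messages if recipient == group]
    return sum(outcomes) / len(outcomes) if outcomes else 0.0

def rate_group_gets_replies(group):
    """Share of messages sent BY this group that were answered (case b)."""
    outcomes = [replied for sender, _, replied in messages if sender == group]
    return sum(outcomes) / len(outcomes) if outcomes else 0.0

print(rate_group_replies("B"), rate_group_gets_replies("B"))
```

In this toy log group B answers two of the three messages it receives, but only one of its four outgoing messages gets a reply, so an unlabeled chart could be showing either figure.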
As an aside, I can't help but wonder if OKCupid intentionally left the reply rate chart unlabeled and unexplained, knowing that some people would post an analysis on their own blogs and generate some viral goodness.
> ". . . I point this out now so that, below, when we claim that Jewish women are easier to get along with than Christians, you don’t blame us, you blame Jesus."
And that's the first really solid evidence that I've seen shared that zodiac signs are utter BS, which makes me doubly grateful to them. Something to point to when someone says it can't be proved otherwise...
The best evidence against astrology I've heard came from my college astronomy professor. Basically, due to the precession of the earth on its axis, the constellations are visible at completely different times than they were when astrology was invented. So even if it wasn't BS back then, it sure is now.
I think this is probably too biased, as messages from users with a close match percentage are highlighted in your "inbox". The data might not be good enough to discern between "replied to message because it was highlighted" and "replied to message because match % was high".
(Personally, I have a feeling that "pretty profile picture" is more influential than "high match %". But of course, it's very hard to measure what one considers physically attractive.)
Maybe they just tend to be more likely to tell people to f off. A reply isn't always a good reply. I would think you would need to look at conversation rates.
This is hilariously interesting. Data mining can always bring good insights if you have good data.
I wonder, though, (a) whether most people search for similar people or for different people, and (b) how well the matching percentage really reflects a "liking probability" (i.e., there is always some wishful thinking when designing date profiles, and some of that might be counter-productive, ruling out good matches).
While I love the number crunching ethos, and think these guys are giving it one hell of a go - I will go on the record and say that they will ultimately fail because they are operating under many false premises, a few being:
1. People do not change their minds.
By algorithmically matching based on answers to questions, you must assume that those answers are meaningful. And meaningful means CONSISTENT. And people are NOT consistent. Especially on complex issues; in fact there are whole professions (sales (insurance, real estate, car, etc) comes immediately to mind) where success is based on one's ability to get people to change their minds. (That example is offered as evidence, not proof - but proof probably isn't a complicated exercise)
2. Like Thinkers make Great Daters
Assuming you are even GETTING like thinkers, (which is tenuous at best) where is it written that like thinkers make great companions? You don't want to have absolutely nothing in common, (at least a common language is necessary) but it isn't linear and I would be surprised if there were any correlation let alone one that was non-asymptotic. Simply - more like thoughts does not entail more compatibility.
3. Looks aren't EVERYTHING
Just based on sheer observation - this is not the case. It is unfortunate, and it speaks badly about human nature, but most human beings make judgments based on appearances. And this "appearance crutch" is only exacerbated when it comes to looking for someone to lay with. How many single, male, bleeding heart liberals would take a swing at Sarah Palin if given an iota of an opportunity? Again - offered as evidence, not proof.
"where is it written that like thinkers make great companions?"
OKCupid does not judge match percentage based on similar answers. For every question on the system it will ask for your answer and your ideal mate's answer. The system knows that some things are important to some people - but maybe not in reverse.
"By algorithmically matching based on answers to questions, you must assume that those answers are meaningful. And meaningful means CONSISTENT. And people are NOT consistent."
In practice, the user base for OkCupid will change. And old users will reanswer questions sometimes too.
"Assuming you are even GETTING like thinkers, (which is tenuous at best) where is it written that like thinkers make great companions?"
That's not how it works. It's not as simple as "answer match questions and we hook you up with people who answered the same way". You can specify which answers your ideal match would give, even if those answers conflict with the answers you yourself give.
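A rough Python sketch of that idea, for anyone who wants it concrete. This is not OkCupid's actual algorithm: the `Answer` structure, the question names, the importance weights and the geometric-mean combination are all assumptions made up for illustration. Each person records their own answer, the answers they would accept from a match, and how much they care.

```python
from dataclasses import dataclass
from math import sqrt

@dataclass
class Answer:
    mine: str        # what I answered myself
    acceptable: set  # answers I would accept from a match
    importance: int  # hypothetical weight; 0 means "don't care"

def satisfaction(a, b):
    """Weighted share of a's stated preferences that b's own answers meet."""
    earned = possible = 0
    for question, pref in a.items():
        if question not in b:
            continue  # only score questions both people answered
        possible += pref.importance
        if b[question].mine in pref.acceptable:
            earned += pref.importance
    return earned / possible if possible else 0.0

def match(a, b):
    # Geometric mean (an assumption), so a one-sided dislike drags the score down
    return sqrt(satisfaction(a, b) * satisfaction(b, a))

# Both people give identical answers to both questions...
alice = {"ice_cream": Answer("yes", {"yes", "no"}, 1),
         "toothpaste": Answer("end", {"end"}, 10)}
# ...but Bob strongly prefers a match who does NOT like ice cream.
bob = {"ice_cream": Answer("yes", {"no"}, 10),
       "toothpaste": Answer("end", {"end", "middle"}, 1)}

print(satisfaction(alice, bob), satisfaction(bob, alice), match(alice, bob))
```

Alice is fully satisfied by Bob, Bob is barely satisfied by Alice, and the combined score comes out around 0.30 even though the two answered every question the same way. That is the sense in which the score is not a "like thinkers" measure.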
"most human beings make judgments based on appearances"
Which is why you upload photos to OkCupid instead of just using match percentages.
Uploading photos isn't what supposedly sets okcupid apart, right? They aren't using physical attractiveness in their matching algorithms and that is the point I was trying to make.
Your other point has been addressed above; people don't know what they want. Asking them what they want their "ideal match" to answer is as foolhardy as asking them to describe their "ideal mate" neuron by neuron.
Given a photo by itself, or a photo plus output from a matching algorithm, the photo plus output from a matching algorithm gives more information. My point is that judging physical attractiveness is (mostly!) a solved problem on dating sites, so your main competitive advantage is attacking the other problems.
People do not, in fact, know what they want. But they can make educated guesses.
Judging physical attractiveness on dating sites isn't even a partially solved problem - I invite you to name the sites that you have found that are close, as a counterexample.
How does one go about making an EDUCATED guess with NO knowledge?
To put it plainly - statistical information assimilated from meaningless data is also meaningless. I don't know what my perfect mate looks like - and no one does. If you accept that premise, you are then forced to accept that any data gathered from some person A about what their perfect mate P is like is MEANINGLESS. So statistical algorithms from any such information yield meaningless data as well; ergo okcupid cannot do what it hopes to do. QED
> I don't know what my perfect mate looks like - and no one does.
While I agree that one's guess at a perfect partner might not be 100% accurate, are you sure you have no information on this at all? Here are some OkCupid questions. Would you be equally willing to date someone regardless of how they answered these questions?
Interestingly, the universal answer set of these questions, from all of the women I've seriously dated, would be full! That is, every possible answer would be in the set.
And I am certain I am not unique. Indeed, half of the enjoyment in getting to know someone is in the mutual effort expended in trying to understand how the other person was formed: thoughts, feelings, desires, etc - and this enjoyment is independent of the actual thoughts, feelings, etc themselves.
As relationships grow, both parties discover the thoughts, feelings, desires, etc. that were perhaps, formed irrationally - and then adjust accordingly. This is called personal discovery - and is evidence enough that most people are too inconsistent and too nearsighted for an approach such as the one okcupid employs.
That's why OkCupid gives you aggregate percentages instead of ruling people out or anything. An aggregate percentage says you mostly agree with this person on the things you care about most. It doesn't mean you have perfect harmony, nothing means that, but it gives you an easier starting point.
Keep in mind that the 'Match %' is based on OKCupid.com's algorithms that number crunch against quiz answers that the users answer. This has nothing to do with 'success rate' of users on the site hooking up with other users.
Right, so compare that with the message reply rate chart at the bottom of the article. Assuming the axes haven't changed from other charts on the page (which IMHO is a pretty good assumption)...
It suggests that certain races have good matches with other races, but horrible reply rates. Reply rate may be the best metric okcupid has as to "success rate". It also suggests strong racial biases on the part of its users.
Many people claim they aren't racist when they refuse to date other races. They always say they haven't found the right <insert minority race here> guy/girl.
What I've found from personal experience, is that the bar is set much higher for those minority races.
While it's possible for people to not date anyone of a certain race because they are bigoted against that race, I think that it's more common for people to just not feel a strong physical attraction towards certain races. My theory on this is that people that haven't grown up (at least during their childhood and/or teenage years) with many people of a certain race won't end up finding most of the 'feature' characteristics of that race to be attractive (i.e. if you grow up in a mostly white neighborhood you might not find black women attractive).
I could be completely off-base here. I have no idea how the development of attraction in the brain works (and I'm sure there's probably much research into this area of psychology). But I tend to find that a lot of people (whether they admit to it or not) have a preference for certain races and not others. Sometimes people don't have a preference for their own race. I know Asian women that are not attracted at all to Asian men, but that doesn't stop them from having Asian men as friends and interacting with them. I would hardly think that they are bigoted against Asians (seeing as they are Asian themselves).
> What I've found from personal experience, is that the bar is set much higher for those minority races.
It's not necessarily minority races. There are plenty of people that are not attracted to others of their own race. 'Yellow-fever,' 'jungle-fever,' etc weren't created for 'one-off' instances (i.e. the person accused of having 'the fever' only dated a person of that race once, and doesn't have an affinity to dating persons of that race) of inter-racial dating/mating/marriage.
> What I've found from personal experience, is that the bar is set much higher for those minority races.
Depends what you mean by that. Certain physical traits are correlated to race. To the extent that these traits matter to someone, one would need to set the bar higher on other traits to offset this. I hesitate to give examples since that could anger some people...
Come to think of it, the differences on the race graph are amazingly small, given how strongly race is correlated with certain religions -- (e.g. middle-eastern people are much more likely to be Muslim, and almost all Hindus are Indian.)
I think you're confusing P(member of particular race | member of particular religion) with P(member of particular religion | member of particular race).
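A toy Python example of the difference, with head counts that are entirely made up (they are not from OkCupid or any census). The same table gives two different answers depending on which way you condition.

```python
# Invented head counts, keyed by (race, religion); for illustration only
counts = {
    ("Indian", "Hindu"): 90,
    ("Indian", "Other"): 60,
    ("Non-Indian", "Hindu"): 10,
    ("Non-Indian", "Other"): 840,
}

def p_race_given_religion(race, religion):
    """P(race | religion): restrict to one religion, then look at race."""
    total = sum(n for (_, g), n in counts.items() if g == religion)
    return counts[(race, religion)] / total

def p_religion_given_race(religion, race):
    """P(religion | race): restrict to one race, then look at religion."""
    total = sum(n for (r, _), n in counts.items() if r == race)
    return counts[(race, religion)] / total

print(p_race_given_religion("Indian", "Hindu"))   # 90 of 100 Hindus
print(p_religion_given_race("Hindu", "Indian"))   # 90 of 150 Indians
```

With these numbers 90% of Hindus are Indian but only 60% of Indians are Hindu, so a strong effect in the religion chart can be diluted when the same people are grouped by race.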
The reason is because at that point they already decided to interact with people of a different race.
If you want a more interesting result, take a look at my other comment for a diagram that shows response rates between different races. It illustrates a greater schism than one would expect.
Bullshit, us Pastafarians are not represented in the religious statistics! This is heresy!
On the other hand I am glad I'm not Aquarius because statistically speaking they are insignificantly less compatible with an Aquarius. His Noodliness has blessed me well!
Actually, because the response of Muslim women to Jews is more positive than to ath/agn, I think Jewish men overall come out on top.
Combine that with Jewish men who ARE atheists, which is a good number of em, and you've got a super-match. Which all goes to prove what my mother always told me; "I love you."
I'd argue that it's quite a hack -- Trying to discern an algorithmic approach to attraction via a large data set, and create a recommendation engine to a very fuzzy problem. =)
I'd argue that it's not at all a "hack" (I really dislike HN's tendency to equate everything to a hack), but nonetheless of significant interest to those of us interested in statistical trends in large web systems and social networks.
This kind of statistical kung-fu is well worth a look even if your own interests do not lie in dating sites.
Yes, but these are the results of the hack, not the hack itself. OkCupid has a nice thing going but this is not about their software, this is about trends they've picked out of their database of dating profiles.
It's not flamebait at all. The data is all about people's responses to generic questions about what they're looking for in a date. They don't say "I don't want to date Muslims" and have that reflected in match scores. The match score is based on things like "I hate people who squeeze the toothpaste from the middle."
To put data based on 'race & religion' out in the open to me is already a thing that is not excusable (besides the lousy definition of such things as race and religion, both of which are floating, not discrete).
There is a massive pre-selection problem here (people that frequent dating sites, rather than people in general) and the link to 'hacking' is a very tenuous one.
I've seen Michael Jackson referred to as a hacker here, so I probably shouldn't be too surprised.
This is not a hack, this is some statistical analysis with input data of dubious value, pretty pictures and all, drawing conclusions that are completely off the wall.
"Jews and Agnostics get along better with people"
"Muslims of both sexes and Hindu men get along worse"
"Catholics are more universally liked than Protestants"
Please, if that isn't flamebait I really don't know what is.
Where are the control groups? Where are the standard deviations, and so on?
Effectively this is a dating advert for programmers.
This is infotainment at its best, statistical noise at its worst.
I'll agree that it's not a hack - that doesn't, in my mind, preclude it from being hacking. They're two different things. For me, anyway, playing with giant data sets does not fall into the first group, but does fall into the second.
That said, "is it hacking or not" isn't really a terribly interesting meta-discussion to have, imho, so I'll try and go in a different direction.
> Please, if that isn't flamebait I really don't know what is.
The data appears to support the assertions you are calling flamebait. Provided their algorithm for calculating likeability actually works, then the data does support the assertions[1]. (This is all, of course, within the context of the pre-selection you noted in your post).
[1] Note that the assertions are of the form "Members of Set X are more likeable to members of Set Y than members of Set Z". If the assertions were "Being a member of Set X makes you more likeable to members of Set Y", they wouldn't necessarily be supported by the data.
If you're going to make this kind of generalizing statement about large groups of people I simply think that you have the responsibility to do it properly, so with control groups and so on, or not at all, otherwise the results are totally meaningless.
The way it is presented right now is as if simply the size of OkCupid and their data gathering methods give them the license to make this kind of claim.
It's 'interesting' but not 'rigorous'.
The three I listed above are particularly galling, I really don't think any computing algorithm can give someone license to make the statements listed above (and as an atheist I have no dog in that fight), but without proper methods it's even worse.
I think that you're assigning more meaning to the words "like" and "get along with" than the authors intend. Pretty much everywhere they use one of those phrases it should be qualified like "X _say they_ like Y". But that gets tedious, and I think the authors are also making the assumption that you were paying attention when they explained where these numbers come from.
Note that the end of the article is leading directly into the objections that people keep making along the lines of "this doesn't mean that these people really get along in real life", and the conflict between what people say they are looking for and the choices they actually make about who to contact and respond to.
Of course it's not rigorous. To some degree people interacting with OKCupid's site are a self-selecting bunch. You really need to take this at face value. This is just a blog post showing some number crunching on their site. There are no sensationalist headings like "Muslim Males the Most Hated Group of People." Keep in mind that this is also not a scientific journal, not peer-reviewed research... nor is it claiming to be.
It's way past my bedtime here, and judging by the moderation I'm not able to make my point, which is simply this:
If you are going to be making sweeping statements about people, even including race and religion, then you really should do your homework, or if you're doing it out of curiosity, keep the results to yourself. By presenting the data in a format that looks as though very hard work went into its creation and by hammering home the reliability of that data you are creating the illusion of something that is scientifically solid when in fact it isn't.
I agree, I'm glad to get the chance to read about this stuff, but I think the PP's point was that if you're going to present a bunch of data in that manner, it's best to say, up front, something like, "our sample, while extensive within our service, is not necessarily representative of particular ethnic groups," and point out something akin to what slashdot says about their polls: you're insane if you intend to do anything serious with this data.
You're right. It's infotainment. So? Are you incapable of taking anything at face value? They aren't making claims about how the world works; they're making claims about how people interact on their site. If that's not interesting to you, flag it and move on.
The flaw here is that how people interact on their site has very little bearing on how those same people will interact in real life.
The plural of data isn't evidence, even for large amounts of data.
Sure, you can do interesting statistical analysis, but the real action is after two people have found each other, and that's where the huge flaw in all this analysis lies: the statements are about interactions in real life, while the data is gathered online.
Any kind of statement about people being more or less likable would have to be taken out of the context of the website and into real life; without that, statements like those listed above are unsupportable.
Gotta agree with you on this one. It's amusing to look at the plots and note the bits that agree (or disagree) with your prejudices, but without some information about the variance this is nothing more than entertainment.
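For what it's worth, here is the kind of check that variance information would allow, as a small Python sketch. The counts are invented; the article publishes no sample sizes, so this only shows how a gap between two rates compares with its standard error under a plain two-proportion assumption.

```python
from math import sqrt

def diff_with_stderr(hits_a, n_a, hits_b, n_b):
    """Difference of two observed rates and the standard error of that difference."""
    p_a, p_b = hits_a / n_a, hits_b / n_b
    se = sqrt(p_a * (1 - p_a) / n_a + p_b * (1 - p_b) / n_b)
    return p_a - p_b, se

# Same 10-point gap, with hypothetical small and large samples
small = diff_with_stderr(60, 100, 50, 100)
large = diff_with_stderr(6000, 10000, 5000, 10000)
print(small, large)
```

With 100 people per group the 10-point gap is only about 1.4 standard errors wide; with 10,000 per group it is roughly 14. Without knowing the group sizes, a reader can't tell which situation a given cell of the chart is in.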