Hacker News
A/B Testing the Effect of Sender Race on Email Response Rates (ezliu.com)
24 points by ezl on Oct 15, 2012 | 25 comments



You spammed a million people for this? Enough that over a hundred thousand people spent the time and energy to write an email reply? Please do justice to the collective time you took up with the survey.

Could you break down the data by name and/or gender of both the sender and the respondent? Test multiple hypotheses simultaneously? For example, in page 1002 of the referenced research article (http://faculty.chicagobooth.edu/marianne.bertrand/research/p...), they break down a whole slew of other qualities that may have affected response rate.

It's entirely possible that the name choice for your study has more of an effect than the perceived race association of those names. The "99.9% confidence" cited is really not that, and is subject to various biases which are hard to discern.


The spamming part really bugged me too. It seems like he harvested these emails rather than using any form of opt-in. This seems like bad science in general.


I agree with you about the harmful spamming aspects being troublesome. If it was a medical trial that harmed the test participants, it would be deemed unethical. On the other hand, using opt-in would have resulted in a bias in the results. The experiment might be ethically dubious in regards to harming participants, but the design of the experiment is still logically sound. Calling it "bad science" is fair if your position is based purely on the ethical implications, but typically, the phrase "bad science" is used to describe improper experiment design and erroneous conclusions.

Eric left out a lot of needed details and data about his experiment. Given the ethics questions involved, it would be best if his data were released and his experiment evaluated, rather than having lots of other people repeat the same or similar experiments.

The harm is already done, so let's try to learn from it.

Maybe I'm just being too forgiving?


Assuming on average 10 minutes was spent by the receiving end, the OP has just killed 0.24 human lives to conduct this study. Or if we value a human life at $7 million [1], the damage is about $1.7 million.

[1] http://en.wikipedia.org/wiki/Value_of_life
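For anyone checking the arithmetic above, a quick sketch (the 79-year average lifespan is an assumption on my part; the $7M figure is from the cited link):

```python
# Back-of-the-envelope check of the "lives killed" figure.
emails = 1_000_000
minutes_each = 10          # assumed average time spent per recipient
lifespan_years = 79        # assumed average human lifespan
value_of_life = 7_000_000  # USD, per the cited Wikipedia figure

years_consumed = emails * minutes_each / 60 / 24 / 365
lives = years_consumed / lifespan_years
damage = lives * value_of_life

print(f"{years_consumed:.1f} years ≈ {lives:.2f} lives ≈ ${damage:,.0f}")
```

With those assumptions the numbers in the comment check out: roughly 19 person-years, 0.24 lives, $1.7M.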


I get annoyed when I read things like: "A performed 4.9% better than B", when the comparison is between two proportions.

Just say it like it is, without putting a slant on it - A's response rate was 0.6% higher than B's.

I don't care if the results are significantly different, if the difference between the two samples is so small.


i don't really understand why this is "putting a slant on it".

citing an absolute difference isn't very useful when you're talking about differences in conversion rates.

    * 0.6% -> 1.2% is a 0.6-point bump.
    * 49.4% -> 50.0% is a 0.6-point bump.
however, in the first case, you're talking about doubling your conversion rate.

it sounds like you're saying that if someone increases their landing page conversion rate by 1% (from 1%) you'd rather hear that the conversion rate increased by a point.

the business owner is probably more interested in the fact that their revenues doubled.


While I don't disagree with you, this is a false dichotomy. A proper analysis of the findings should have reported -- and explained -- both absolute and relative differences.


The article shouldn't have to spell out absolute differences in English sentences. It mentions the important part (the relative difference) in English sentences, and it includes the chart of response rate by race, where the reader can see everything:

    race       response rate
    black      11.01%
    white      12.32%
    hispanic   12.92%


This is an important point concerning Effect Sizes: "I don't care if the results are significantly different, if the difference between the two samples is so small". People do get too wrapped up in statistical significance and forget about practical significance.

However, the relative difference (4.9%) is the relevant metric to be looking at as noted by "ezl" in a comment.
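For what it's worth, both framings and the significance claim can be checked in a few lines. A rough sketch, taking the white and hispanic rates from the article's chart and assuming the 1M emails were split evenly three ways (the per-arm sample size is my assumption, not from the article):

```python
import math

def two_proportion_z(p1, n1, p2, n2):
    """Pooled two-proportion z-test statistic."""
    pooled = (p1 * n1 + p2 * n2) / (n1 + n2)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    return (p2 - p1) / se

# Rates from the article's chart; ~333k emails per arm is an assumption.
p_white, p_hispanic, n = 0.1232, 0.1292, 1_000_000 // 3

abs_diff = p_hispanic - p_white   # 0.6 percentage points
rel_diff = abs_diff / p_white     # the "4.9% better" framing
z = two_proportion_z(p_white, n, p_hispanic, n)

print(f"absolute: {abs_diff:.4f}, relative: {rel_diff:.1%}, z = {z:.1f}")
```

Under those assumptions the z-statistic comes out around 7, i.e. the gap is easily statistically significant at these sample sizes, which is exactly why the practical-significance question is the interesting one.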


Eric, the write-up is great, but it would be better if you provided the supporting data. The research is interesting, but sending out a million emails with tracking bugs is a bit, umm, questionable when one considers the time/effort wasted by the recipients. If everyone ran similar experiments, it would make a real mess, so providing the data you collected could also be beneficial in reducing the load.


I applaud you for doing the research, and I'll probably end up referencing it in my work at some point. There are surely a number of holes to poke (as with anything); the one that stands out to me at the moment is that you didn't control for the race of the recipient. I.e., if your overall recipient list was 50% Hispanic, then even though you randomized who got what, you'd still expect the Hispanic-sent open rate to be higher.


op here. i should own up. this is an apology.

@dantillberg, @jcr, et al: you're right.

i heard about the original study, was curious, but not enough to think much of it. in a previous startup we had a female intern who was getting substantially better response rates than the male founders.

after the recent press about how women in senior roles correlates with startup success i became a bit more curious and wondered if i could craft the perfect "from" field for outgoing emails.

i admit this was aggressive and that I got carried away. that's no excuse.


I assume the author was just trying to set up the premise before getting to the meat of the article, but the opening paragraph's claim that Americans are more sensitive about discrimination than anybody else is probably false and more than a little ironic.


File this one under "tests that shouldn't have been conducted in the first place"


PC aside, why not?


TLDR: Because you can't act on that data without acting in a racist way that makes the world a worse place. There is no way to use that data without really bad behavior.

Let's break down why it is such an atrocious idea.

Say you have a black man in a high-ranking position in your company. Oh, but you want to seize every advantage possible, and you have data suggesting that black people get lower reply rates on emails. And women get higher reply rates on emails. How do you deal with this situation?

Should he have someone else in the company send high-priority emails for him? Or maybe just set up a dummy email for him with a female hispanic name, and pretend it is his "secretary"? Or maybe he shouldn't be in that kind of position in the first place -- maybe he should be in a more internally focused position? Maybe next time you hire, you should be on the lookout for a hispanic woman for an externally focused position, to make things easier.

There are a huge number of problems with these kinds of actions, starting with the fact that you SHOULD NOT be "dealing with" the situation that you have an employee of a certain race or gender.

1. Flat out discriminatory -- forcing someone to jump through hoops because of skin color / gender.

2.a Fundamentally strikes at human dignity by treating everyone you email as racist, genderist animals.

2.b Fundamentally strikes at human dignity by telling your employees to pretend to be someone "more palatable". Or by having someone do a task, let's say, because the individual is in possession of a female hispanic name. Everyone is dehumanized.

3. Liable to twist thinking. After reading that article, the next time you're hiring someone, your brain has a good chance of (inadvertently) jumping to the notion of how the applicant's race / gender is going to impact email reply rates. We've agreed as a society, for good and just reasons, that hiring decisions and compensation decisions should be completely blind to those (and other) factors.

4. Any actions taken to hide or shield people who seem to be less acceptable or attractive to society simply reinforce their position. The idea that there aren't successful [insert race / gender] people in [profession] is reinforced by hiding those who are.


You seem to be operating under the assumption that results from all studies will be acted upon by everyone who reads them. This assumption undermines the rest of your points because the chances of someone making a high-profile hiring decision based on potential email response rates is laughably small. Also, I'm looking forward to your "shouldn't be done" condemnation of the original study (http://faculty.chicagobooth.edu/marianne.bertrand/research/p...) that prompted this blog post.

What's scary is the tacit admission that you'd rather censor scientific research than face some uncomfortable truths about human nature. What other studies about human psychology would you censor?

>How do you deal with this situation?

You don't, at least not in a liberal state like California. If I may be realistic for a moment: here, having a high-ranking employee who's hispanic-female-gay-disabled-atheist is an advantage.

>We've agreed as a society, for good and just reasons, that hiring decisions and compensation decisions should be completely blind to those (and other) factors.

This is either naive or delusional, because a lot of people seem to have missed the memo. As much as we may want to be impartial, personal preference will always play a part because we're humans instead of robots.

Moreover, there's the grey area called "team fit". I may hire Juan (Hispanic male) because he's very friendly and helpful, but has less experience than Jason (Caucasian male) who's very experienced but cold and aloof. If you look at this strictly from a technical qualifications point of view, then I discriminated against Jason. If you consider the big picture, then you see I made the right hiring decision.


That's a really good question. I should have been much more clear about exactly what I was talking about. It wasn't about the knowledge gained, it was about everything surrounding that.

What I wrote was in response to an article that carried the subtitle "Are Emily and Brendan More Employable than Lakisha and Jamal?"

And included, just above the fold, these gems:

"Question: How can I use societal prejudices to aid in startup success?"

"Race for Internet Marketing and Startups"

There are many fascinating and extremely valuable studies treating race. You are right, they are often uncomfortable. I love those studies. But this post is not that. It not only has shoddy methodology and would get blasted to pieces by peer review, but its express purpose is the application of racism and sexism to the world of startups. THAT is what I was writing about.


So your position would be that a black businessman with an unusual name is better off not knowing that this name is putting him at a disadvantage compared to businessmen (white AND black) with more conventional names?


AirBnB owes its multi-million-dollar growth to an email campaign that used fake women's names as the sender.


Ha! If only it was that easy. What a funny misconception.


You should have also included unconventional names not usually associated with black people, such as Moon Unit, Starshine, Whalesong, and other such "hippy" names, or names associated with poor rural white people.


I would also like to see the effect of East Indian names. My name isn't from India (it is from Thailand), but it is often mistaken for being Indian or Arabic.


Offtopic but I did a double take on this: "Former options trader. Passionate QBASIC developer." Subtle irony or what?


Frankly, I'm not really surprised that Antonio Banderas commanded a higher response rate.



